From d201ecb7e12ca49d220756e3cdce7fe2f091cc87 Mon Sep 17 00:00:00 2001
From: Nagarjuna Kumar
Date: Mon, 14 Aug 2017 13:35:22 +0100
Subject: [PATCH] Fixed typos in tf-idf term weighting section

---
 doc/modules/feature_extraction.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/modules/feature_extraction.rst b/doc/modules/feature_extraction.rst
index 97ec275924c70..1bd1873c4b05e 100644
--- a/doc/modules/feature_extraction.rst
+++ b/doc/modules/feature_extraction.rst
@@ -490,13 +490,13 @@ log \frac{n_d}{\text{df}(d, t)} + 1 = log(1)+1 = 1`
 
 Now, if we repeat this computation for the remaining 2 terms in the document,
 we get
 
-:math:`\text{tf-idf}_{\text{term2}} = 0 \times log(6/1)+1 = 0`
+:math:`\text{tf-idf}_{\text{term2}} = 0 \times (log(6/1)+1) = 0`
 
-:math:`\text{tf-idf}_{\text{term3}} = 1 \times log(6/2)+1 \approx 2.0986`
+:math:`\text{tf-idf}_{\text{term3}} = 1 \times (log(6/2)+1) \approx 2.0986`
 
 and the vector of raw tf-idfs:
 
-:math:`\text{tf-idf}_raw = [3, 0, 2.0986].`
+:math:`\text{tf-idf}_{\text{raw}} = [3, 0, 2.0986].`
 
 Then, applying the Euclidean (L2) norm, we obtain the following tf-idfs
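
As a quick sanity check of the parenthesization this patch corrects, the arithmetic can be reproduced in a few lines of plain Python. This is only a sketch: the helper names `raw_tfidf` and `l2_normalize` are hypothetical, and the term counts (tf = 3, 0, 1) and document frequencies (df = 6, 1, 2, with 6 documents) are inferred from the formulas in the hunk rather than taken from the full document.

```python
import math

def raw_tfidf(tf, df, n_docs):
    # tf * (log(n_d / df) + 1); the parentheses are the point of the fix:
    # without them, 0 * log(6/1) + 1 would evaluate to 1, not 0.
    return tf * (math.log(n_docs / df) + 1)

def l2_normalize(vec):
    # Divide each component by the Euclidean (L2) norm of the vector.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

# Assumed inputs, inferred from the patch context: (tf, df) per term.
n_docs = 6
terms = [(3, 6), (0, 1), (1, 2)]

raw = [raw_tfidf(tf, df, n_docs) for tf, df in terms]
print([round(x, 4) for x in raw])  # [3.0, 0.0, 2.0986]

normalized = l2_normalize(raw)
print(normalized)
```

The unparenthesized form fails for term2 under normal operator precedence, which is why the old text was inconsistent with its own stated result of 0.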