@@ -19,17 +19,21 @@
 --------------
 
 This module provides functions for calculating mathematical statistics of
-numeric (:class:`Real`-valued) data.
-
-.. note::
-
-   Unless explicitly noted otherwise, these functions support :class:`int`,
-   :class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`.
-   Behaviour with other types (whether in the numeric tower or not) is
-   currently unsupported. Collections with a mix of types are also undefined
-   and implementation-dependent. If your input data consists of mixed types,
-   you may be able to use :func:`map` to ensure a consistent result, for
-   example: ``map(float, input_data)``.
+numeric (:class:`~numbers.Real`-valued) data.
+
+The module is not intended to be a competitor to third-party libraries such
+as `NumPy <https://numpy.org>`_, `SciPy <https://www.scipy.org/>`_, or
+proprietary full-featured statistics packages aimed at professional
+statisticians such as Minitab, SAS and Matlab. It is aimed at the level of
+graphing and scientific calculators.
+
+Unless explicitly noted, these functions support :class:`int`,
+:class:`float`, :class:`~decimal.Decimal` and :class:`~fractions.Fraction`.
+Behaviour with other types (whether in the numeric tower or not) is
+currently unsupported. Collections with a mix of types are also undefined
+and implementation-dependent. If your input data consists of mixed types,
+you may be able to use :func:`map` to ensure a consistent result, for
+example: ``map(float, input_data)``.
 
 Averages and measures of central location
 -----------------------------------------
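
A minimal sketch of the ``map(float, input_data)`` suggestion in the introductory text, assuming that converting every value to ``float`` is acceptable for the data at hand. The contents of ``input_data`` are hypothetical and chosen only to mix ``int``, ``Fraction`` and ``Decimal``.

```python
from decimal import Decimal
from fractions import Fraction
from statistics import mean

# Hypothetical input that mixes int, Fraction and Decimal values.
input_data = [1, Fraction(1, 2), Decimal("0.25")]

# Coerce every element to float first so the data has one consistent type.
as_floats = list(map(float, input_data))

result = mean(as_floats)
print(as_floats, result)
```

Converting to ``float`` trades exactness for consistency, so it may not be the right choice when exact ``Fraction`` or ``Decimal`` results matter.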
@@ -107,7 +111,7 @@ However, for reading convenience, most of the examples show sorted sequences.
    :func:`median` and :func:`mode`.
 
    The sample mean gives an unbiased estimate of the true population mean,
-   which means that, taken on average over all the possible samples,
+   so that when taken on average over all the possible samples,
    ``mean(sample)`` converges on the true mean of the entire population. If
    *data* represents the entire population rather than a sample, then
    ``mean(data)`` is equivalent to calculating the true population mean μ.
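
The "on average over all the possible samples" wording can be checked exhaustively on a tiny population. This is only an illustrative sketch: the population values and the sample size of 2 are hypothetical, and ``Fraction`` is used so the comparison is exact.

```python
from fractions import Fraction
from itertools import combinations
from statistics import mean

# Hypothetical population, stored as Fractions to keep the arithmetic exact.
population = [Fraction(x) for x in (2, 3, 7, 12)]
population_mean = mean(population)

# Mean of every possible sample of size 2 drawn without replacement.
sample_means = [mean(sample) for sample in combinations(population, 2)]

# Averaging the sample means over all samples recovers the population mean.
average_of_sample_means = mean(sample_means)
print(population_mean, average_of_sample_means)
```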
@@ -163,8 +167,16 @@ However, for reading convenience, most of the examples show sorted sequences.
    will be equivalent to ``3/(1/a + 1/b + 1/c)``.
 
    The harmonic mean is a type of average, a measure of the central
-   location of the data. It is often appropriate when averaging quantities
-   which are rates or ratios, for example speeds. For example:
+   location of the data. It is often appropriate when averaging
+   rates or ratios, for example speeds.
+
+   Suppose a car travels 10 km at 40 km/hr, then another 10 km at 60 km/hr.
+   What is the average speed?
+
+   .. doctest::
+
+      >>> harmonic_mean([40, 60])
+      48.0
 
    Suppose an investor purchases an equal value of shares in each of
    three companies, with P/E (price/earning) ratios of 2.5, 3 and 10.
@@ -175,9 +187,6 @@ However, for reading convenience, most of the examples show sorted sequences.
       >>> harmonic_mean([2.5, 3, 10])  # For an equal investment portfolio.
       3.6
 
-   Using the arithmetic mean would give an average of about 5.167, which
-   is well over the aggregate P/E ratio.
-
    :exc:`StatisticsError` is raised if *data* is empty, or any element
    is less than zero.
 
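
The car example added in this change can be cross-checked from first principles: average speed is total distance divided by total time. The sketch below assumes two legs of equal length, as in the text; the variable names are illustrative only.

```python
from math import isclose
from statistics import harmonic_mean, mean

speeds = [40, 60]   # km/hr for each leg, from the example in the text
leg_distance = 10   # km per leg (equal legs are what make the harmonic mean apply)

# Time spent on each leg, then the overall average speed.
total_time = sum(leg_distance / speed for speed in speeds)
average_speed = (leg_distance * len(speeds)) / total_time

print(average_speed, harmonic_mean(speeds))
print(mean(speeds))  # the arithmetic mean overstates the average speed here

same_answer = isclose(average_speed, harmonic_mean(speeds))
```

If the legs had different lengths, a weighted calculation would be needed and the plain harmonic mean would no longer match.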
@@ -190,9 +199,9 @@ However, for reading convenience, most of the examples show sorted sequences.
    middle two" method. If *data* is empty, :exc:`StatisticsError` is raised.
    *data* can be a sequence or iterator.
 
-   The median is a robust measure of central location, and is less affected by
-   the presence of outliers in your data. When the number of data points is
-   odd, the middle data point is returned:
+   The median is a robust measure of central location and is less affected by
+   the presence of outliers. When the number of data points is odd, the
+   middle data point is returned:
 
    .. doctest::
 
@@ -210,13 +219,10 @@ However, for reading convenience, most of the examples show sorted sequences.
    This is suited for when your data is discrete, and you don't mind that the
    median may not be an actual data point.
 
-   If your data is ordinal (supports order operations) but not numeric (doesn't
-   support addition), you should use :func:`median_low` or :func:`median_high`
+   If the data is ordinal (supports order operations) but not numeric (doesn't
+   support addition), consider using :func:`median_low` or :func:`median_high`
    instead.
 
-   .. seealso:: :func:`median_low`, :func:`median_high`, :func:`median_grouped`
-
-
 .. function:: median_low(data)
 
    Return the low median of numeric data. If *data* is empty,
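
A small sketch of the advice about ordinal data, assuming a hypothetical list of letter grades that can be ordered but not averaged. With an even number of items, ``median`` has to average the two middle values, which fails for strings, while ``median_low`` and ``median_high`` only select an existing element.

```python
from statistics import median, median_high, median_low

# Hypothetical ordinal data: letter grades sort alphabetically but cannot be averaged.
grades = ["C", "A", "D", "B"]

low = median_low(grades)    # smaller of the two middle values
high = median_high(grades)  # larger of the two middle values

# median() tries to average the two middle strings and fails.
try:
    median(grades)
    median_failed = False
except TypeError:
    median_failed = True

print(low, high, median_failed)
```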
@@ -319,7 +325,7 @@ However, for reading convenience, most of the examples show sorted sequences.
    desired instead, use ``min(multimode(data))`` or ``max(multimode(data))``.
    If the input *data* is empty, :exc:`StatisticsError` is raised.
 
-   ``mode`` assumes discrete data, and returns a single value. This is the
+   ``mode`` assumes discrete data and returns a single value. This is the
    standard treatment of the mode as commonly taught in schools:
 
    .. doctest::
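
The ``min(multimode(data))`` and ``max(multimode(data))`` idiom mentioned in this hunk can be sketched as follows. The data is hypothetical, and the example assumes Python 3.8 or later, where ``mode`` returns the first mode encountered when several values tie.

```python
from statistics import mode, multimode

# Hypothetical data in which 3 and 1 both occur twice.
data = [3, 1, 3, 1, 2]

first_mode = mode(data)        # first of the tied modes in input order
all_modes = multimode(data)    # every tied mode, in order of first occurrence

smallest_mode = min(all_modes)
largest_mode = max(all_modes)
print(first_mode, all_modes, smallest_mode, largest_mode)
```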
@@ -522,7 +528,7 @@ However, for reading convenience, most of the examples show sorted sequences.
    cut-point will evaluate to ``104``.
 
    The *method* for computing quantiles can be varied depending on
-   whether the data in *data* includes or excludes the lowest and
+   whether the *data* includes or excludes the lowest and
    highest possible values from the population.
 
    The default *method* is "exclusive" and is used for data sampled from
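
To make the effect of the *method* argument concrete, here is a short comparison of the two settings on the same hypothetical data set. The values are illustrative only.

```python
from statistics import quantiles

# Hypothetical data set of nine evenly spaced values.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Default: treat the data as a sample that may not contain the population extremes.
exclusive = quantiles(data, n=4)

# Alternative: treat the data as including the lowest and highest possible values.
inclusive = quantiles(data, n=4, method="inclusive")

print(exclusive)
print(inclusive)
```

For this data the exclusive quartiles sit further from the median than the inclusive ones, which is consistent with leaving room for more extreme values outside the sample.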