R.J.
Ian Dela Cruz
ELET-2201
Measures of Dispersion
Measures of central tendency alone are not adequate to describe data. Two data sets can
have the same mean yet be entirely different. Thus, to describe data, one also needs to know
the extent of variability. This is given by the measures of dispersion. Range, interquartile range,
and standard deviation are the three commonly used measures of dispersion.
RANGE
The range is the difference between the largest and the smallest observation in the data.
The prime advantage of this measure of dispersion is that it is easy to calculate. On the other
hand, it has a lot of disadvantages. It is very sensitive to outliers and does not use all the
observations in a data set. It is more informative to provide the minimum and the maximum
values rather than providing the range.
The range is the most common and most easily understood measure of dispersion. It is the
difference between the two extreme observations of the data set. If $X_{\max}$ and $X_{\min}$ are the two
extreme observations, then
\[ \text{Range} = X_{\max} - X_{\min} \]
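As a small illustration of the formula and of the sensitivity to outliers noted earlier, the sketch below computes the range of a short data set and of the same data with one extreme value appended. The function name `value_range` and the appended value 100 are assumptions made only for this example.

```python
def value_range(values):
    """Return the range: largest observation minus smallest observation."""
    return max(values) - min(values)


data = [1, 3, 5, 5, 6, 7, 9, 10]   # illustrative data
with_outlier = data + [100]         # same data plus one assumed extreme value

range_plain = value_range(data)              # 10 - 1
range_outlier = value_range(with_outlier)    # 100 - 1

print("Range without outlier:", range_plain)
print("Range with outlier:   ", range_outlier)
```

A single added observation changes the range from 9 to 99, although the other eight values are untouched.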
Merits of Range
-It is the simplest of the measures of dispersion
-Easy to calculate
-Easy to understand
-Independent of change of origin
Demerits of Range
-It is based only on the two extreme observations and is therefore affected by fluctuations in them
-The range is not a reliable measure of dispersion
-Dependent on change of scale
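The two properties "independent of change of origin" and "dependent on change of scale" can be checked numerically. The sketch below is only an illustration: the constants 50 and 3 and the helper name `value_range` are arbitrary choices, not part of the notes.

```python
def value_range(values):
    """Return the range: largest observation minus smallest observation."""
    return max(values) - min(values)


data = [1, 3, 5, 5, 6, 7, 9, 10]

shifted = [x + 50 for x in data]   # change of origin: add a constant to every value
scaled = [x * 3 for x in data]     # change of scale: multiply every value by a constant

range_original = value_range(data)
range_shifted = value_range(shifted)
range_scaled = value_range(scaled)

print("original:", range_original)
print("shifted: ", range_shifted)   # same as the original range
print("scaled:  ", range_scaled)    # three times the original range
```

Adding a constant moves both extremes by the same amount, so their difference is unchanged. Multiplying by a constant stretches the difference by that constant.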
INTERQUARTILE RANGE
The interquartile range is defined as the difference between the 25th and 75th percentiles (also
called the first and third quartiles). Hence the interquartile range describes the middle 50% of
observations. If the interquartile range is large, the middle 50% of observations are
spaced wide apart. An important advantage of the interquartile range is that it can be used as a
measure of variability even if the extreme values are not recorded exactly (as in the case of
open-ended class intervals in a frequency distribution). Another advantage is that it is not
affected by extreme values. The main disadvantage of the interquartile range as a measure of
dispersion is that it is not amenable to mathematical manipulation.
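The resistance of the interquartile range to extreme values can be seen in a short sketch. It uses `statistics.quantiles` from the Python standard library with `method="inclusive"`. Quartile conventions differ between textbooks and software, so this choice is an assumption and other methods may give slightly different quartiles. The second data set, with its extreme value of 1000, is made up for the comparison.

```python
import statistics


def interquartile_range(values):
    """Return Q3 - Q1 using the 'inclusive' quartile convention (an assumed choice)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


data = [1, 3, 5, 5, 6, 7, 9, 10]
extreme = [1, 3, 5, 5, 6, 7, 9, 1000]   # largest value replaced by an assumed extreme one

iqr_data = interquartile_range(data)
iqr_extreme = interquartile_range(extreme)

print("IQR, original data:     ", iqr_data)
print("IQR, with extreme value:", iqr_extreme)
print("Range, original data:     ", max(data) - min(data))
print("Range, with extreme value:", max(extreme) - min(extreme))
```

The range grows enormously when the largest value is replaced, while the interquartile range stays the same because the quartiles depend only on the middle of the ordered data.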
STANDARD DEVIATION
Standard deviation (SD) is the most commonly used measure of dispersion. It is a
measure of the spread of data about the mean. SD is the square root of the sum of squared deviations
from the mean divided by the number of observations (less one, when the data are a sample):
\[ SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
This formula is a definitional one and for calculations, an easier formula is used. In hand
calculation the computational formula also avoids the rounding errors introduced by a rounded mean:
\[ SD = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]
In both these formulas, n - 1 is used instead of n in the denominator, because dividing by n tends
to underestimate the population SD when the mean is itself estimated from the sample.
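The agreement between a definitional and a computational form can be checked numerically. The sketch below assumes the usual textbook versions of the two formulas, both with n - 1 in the denominator, and compares them with `statistics.stdev` from the Python standard library. The function names are illustrative only.

```python
import math
import statistics


def sd_definitional(values):
    """Sample SD from squared deviations about the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sd_computational(values):
    """Sample SD from the sum of values and the sum of squares (no deviations needed)."""
    n = len(values)
    total = sum(values)
    total_sq = sum(x * x for x in values)
    return math.sqrt((total_sq - total ** 2 / n) / (n - 1))


data = [1, 3, 5, 5, 6, 7, 9, 10]

sd_def = sd_definitional(data)
sd_comp = sd_computational(data)
sd_lib = statistics.stdev(data)   # library sample SD, also uses n - 1

print(sd_def, sd_comp, sd_lib)
```

The computational form is convenient by hand because it needs only two running totals. In floating-point arithmetic it can lose precision when the mean is large relative to the spread, so library routines are usually the safer choice in software.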
The reason why SD is a very useful measure of dispersion is that, if the observations are
from a normal distribution, then [3] about 68% of observations lie between mean ± 1 SD, about 95% of
observations lie between mean ± 2 SD, and about 99.7% of observations lie between mean ± 3 SD.
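These percentages follow from the normal distribution itself: the probability of falling within k SD of the mean equals erf(k/√2). The sketch below evaluates this with `math.erf`; the function name `normal_coverage` is chosen only for illustration.

```python
import math


def normal_coverage(k):
    """Probability that a normally distributed observation lies within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


coverage = {k: normal_coverage(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"mean ± {k} SD: {p:.1%}")
```

The printed values are approximately 68.3%, 95.4%, and 99.7%.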
The other advantage of SD is that along with mean it can be used to detect skewness. The
disadvantage of SD is that it is an inappropriate measure of dispersion for skewed data.
The standard deviation is the positive square root of the arithmetic mean of the squares of
the deviations of the given values from their arithmetic mean. It is denoted by the Greek letter
sigma, σ. It is also referred to as the root mean square deviation. The standard deviation is given as
\[ \sigma = \left[ \frac{\sum_i (y_i - \bar{y})^2}{n} \right]^{1/2} = \left[ \frac{\sum_i y_i^2}{n} - \bar{y}^2 \right]^{1/2} \]
For a grouped frequency distribution, it is
\[ \sigma = \left[ \frac{\sum_i f_i (y_i - \bar{y})^2}{N} \right]^{1/2} = \left[ \frac{\sum_i f_i y_i^2}{N} - \bar{y}^2 \right]^{1/2}, \quad N = \sum_i f_i \]
The square of the standard deviation is the variance. It is also a measure of dispersion.
\[ \sigma^2 = \frac{\sum_i (y_i - \bar{y})^2}{n} = \frac{\sum_i y_i^2}{n} - \bar{y}^2 \]
For a grouped frequency distribution, it is
\[ \sigma^2 = \frac{\sum_i f_i (y_i - \bar{y})^2}{N} = \frac{\sum_i f_i y_i^2}{N} - \bar{y}^2 \]
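For a grouped frequency distribution, the two forms of the variance (deviations about the mean, and mean of squares minus squared mean) should give the same number. The sketch below checks this on a made-up table. The class mid values and frequencies are hypothetical, and the function name is illustrative.

```python
import math
import statistics


def grouped_variance(midpoints, freqs):
    """Return (direct, shortcut) population variance for a grouped frequency table."""
    total = sum(freqs)                                              # N = sum of f_i
    mean = sum(f * y for y, f in zip(midpoints, freqs)) / total
    direct = sum(f * (y - mean) ** 2 for y, f in zip(midpoints, freqs)) / total
    shortcut = sum(f * y * y for y, f in zip(midpoints, freqs)) / total - mean ** 2
    return direct, shortcut


midpoints = [5, 15, 25, 35]   # hypothetical class mid values
freqs = [2, 5, 8, 5]          # hypothetical class frequencies

var_direct, var_shortcut = grouped_variance(midpoints, freqs)
sd_grouped = math.sqrt(var_direct)

# Cross-check: expand the table into individual values and use the library routine.
expanded = [y for y, f in zip(midpoints, freqs) for _ in range(f)]
var_library = statistics.pvariance(expanded)

print(var_direct, var_shortcut, var_library, sd_grouped)
```

Both forms divide by N, so they describe the population variance, matching the σ notation used here rather than the n - 1 sample version.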
If the deviations are taken from an arbitrary number A instead of the mean, the resulting
quantity is the root mean square deviation about A.
Mean Deviation
Mean deviation is the arithmetic mean of the absolute deviations of the observations from
a measure of central tendency. If $x_1, x_2, \dots, x_n$ are the observations, then the mean deviation
of x about the average A (mean, median, or mode) is
\[ \text{Mean deviation about } A = \frac{1}{n} \sum_i |x_i - A| \]
For a grouped frequency distribution, it is calculated as
\[ \text{Mean deviation about } A = \frac{1}{N} \sum_i f_i |x_i - A|, \quad N = \sum_i f_i \]
Here, $x_i$ and $f_i$ are respectively the mid value and the frequency of the ith class interval.
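The ungrouped formula can be sketched in a few lines. The data set below is hypothetical and deliberately skewed so that the mean and median differ. The function name `mean_deviation` is an illustrative choice.

```python
import statistics


def mean_deviation(values, about):
    """Arithmetic mean of the absolute deviations of the values from `about`."""
    return sum(abs(x - about) for x in values) / len(values)


data = [2, 4, 4, 5, 20]   # hypothetical, skewed data

md_about_mean = mean_deviation(data, statistics.mean(data))       # A = mean = 7
md_about_median = mean_deviation(data, statistics.median(data))   # A = median = 4

print("Mean deviation about the mean:  ", md_about_mean)
print("Mean deviation about the median:", md_about_median)
```

For this data the mean deviation is 5.2 about the mean and 3.8 about the median, so taking deviations from the median gives the smaller value.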
Merits of Mean Deviation
-Based on all observations
-It provides a minimum value when the deviations are taken from the median
-Independent of change of origin
Demerits of Mean Deviation
-Not easily understandable
-Its calculation is not easy and is time-consuming
-Dependent on change of scale
-Ignoring the signs of the deviations is artificial and makes it unsuitable for further mathematical
treatment
Find the Variance and Standard Deviation of the Following Numbers: 1, 3, 5, 5, 6, 7, 9, 10.
The mean = 46/8 = 5.75
Step 1: Subtract the mean from each value: (1 – 5.75), (3 – 5.75), (5 – 5.75), (5 – 5.75), (6 – 5.75), (7 – 5.75), (9 – 5.75), (10 – 5.75)
= -4.75, -2.75, -0.75, -0.75, 0.25, 1.25, 3.25, 4.25
Step 2: Squaring the above values we get 22.5625, 7.5625, 0.5625, 0.5625, 0.0625, 1.5625, 10.5625,
18.0625
Step 3: 22.5625 + 7.5625 + 0.5625 + 0.5625 + 0.0625 + 1.5625 + 10.5625 + 18.0625
= 61.5
Step 4: n = 8, therefore variance (σ²) = 61.5/8 = 7.6875 = 7.69 (3 s.f.)
Now, standard deviation (σ) = √7.6875 = 2.77 (3 s.f.)
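The worked example can be checked with the Python standard library. The sketch uses `statistics.pvariance` and `statistics.pstdev`, which divide by n as in Step 4, rather than the n - 1 sample versions.

```python
import statistics

data = [1, 3, 5, 5, 6, 7, 9, 10]

mean = statistics.mean(data)            # 46 / 8
variance = statistics.pvariance(data)   # population variance: divides by n
sd = statistics.pstdev(data)            # population standard deviation

print("mean     =", mean)
print("variance =", round(variance, 2))
print("SD       =", round(sd, 2))
```

Rounded to three significant figures, the variance is 7.69 and the standard deviation is 2.77.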