Central Tendency

The document discusses central tendency in statistics, emphasizing the importance of summarizing data distributions to facilitate comparisons. It outlines measures of central tendency, including arithmetic mean, and presents desirable properties for these measures. Additionally, it provides formulas and results related to the arithmetic mean, including its calculation for ungrouped and grouped data, and the effects of changing the origin and scale.

Ramakrishna Mission Residential College, Narendrapur

Department of Statistics

Notes on Central Tendency

Introduction: The condensation of data into a frequency distribution is a first and
necessary step in rendering a long series of observations comprehensible. But for practical
purposes it is not enough, particularly when we want to compare two or more different series.
As a next step we wish to define quantitatively the characteristics of a frequency distribution
in as few numbers as possible.
Comparing two distributions is very difficult if their shapes are completely different, for
example a symmetrical distribution against a J-shaped or U-shaped one.
[Note:
i. Unless the shapes of the histograms related to different situations are more or less the same,
an attempt to compare the situations is meaningless.
ii. If the data come from similar mechanisms, we may expect similar patterns.]
In practice, however, we rarely have to deal with such a case, since distributions drawn from similar
material are usually of similar form – for example, the annual incomes of middle-class people in
two different cities of a country. The practical use of the various statistical quantities which
we shall discuss is based on this fact.
There are two fundamental characteristics in which similar frequency distributions may
differ:
i. They may differ markedly in position, i.e., in the value of the variate around which
they cluster

ii. They may differ in the extent to which the observations are dispersed about the central
value.

Central Tendency: Most of the frequency distributions have a tendency to cluster around
a central value. This property of a frequency distribution is known as central tendency.
Usually bell shaped distributions exhibit this property.
Any measure which can account for such tendency is called a measure of central tendency.
Fundamentally there are two broad types of measures:
i. Moments: based on the distances of the values from a fixed point
ii. Fractiles: based on the position of the values.
Apart from these, we may use some special kinds of measures depending on the nature of the data
reported.

Desiderata of a satisfactory measure of Central Tendency:


Any measure should possess some desirable properties:
a) It should be rigidly defined and should not be left to the mere estimation of the observer.
b) It should be based on all the observations. If not, it is not really a characteristic of the
whole distribution.
c) It should possess some simple and obvious properties to render its nature readily
comprehensible. It should not be of too abstract a mathematical character.
d) It should be calculable with reasonable ease and rapidity.
e) It should be as little affected as possible by what are called fluctuations of
sampling.
f) It should lend itself readily to algebraic treatment.

Different Measures of Central Tendency:


Arithmetic Mean (AM)
1. Basic Data:
Let $x_1, x_2, \ldots, x_n$ be n observations on some variable x under study. Then the simple
arithmetic mean, denoted by $\bar{x}$, is given by
\[ \bar{x} = \frac{\text{sum of all the observations}}{\text{no. of observations}} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1} \]
[We want a representative value of the entire data set which will be around the central
position. If ‘c’ is the central value aimed at and we want a moment measure (i.e., one
based on the distances of the values from a fixed point), then it is logical to require that
the sum of the distances of the observations below ‘c’ equals that of the observations above ‘c’.
Thus
\[ \sum_{x_i \le c} (c - x_i) = \sum_{x_i > c} (x_i - c) \]
\[ \Rightarrow \sum_{i=1}^{n} x_i = nc \;\Rightarrow\; c = \frac{1}{n}\sum_{i=1}^{n} x_i \]
]
2. Ungrouped Frequency Distribution:
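In case of an ungrouped frequency distribution of the form
x        f
$x_1$    $f_1$
$x_2$    $f_2$
⋮        ⋮
$x_i$    $f_i$
⋮        ⋮
$x_k$    $f_k$
Total    n
the formula for the arithmetic mean will be
\[ \bar{x} = \frac{\text{sum of all the observations}}{\text{no. of observations}} = \frac{1}{n}\sum_{i=1}^{k} f_i x_i, \qquad n = \sum_{i=1}^{k} f_i \tag{2} \]

(2) can also be written as
\[ \bar{x} = \sum_{i=1}^{k} \frac{f_i}{n}\, x_i = \sum_{i=1}^{k} w_i x_i \ \text{(say)} \]
where $w_i = \frac{f_i}{n}$ is the relative frequency of $x_i$.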

[Note:
i. $w_i$ may be interpreted as the relative importance of $x_i$ in the data set. So this
formula is often called the formula for the weighted mean.
ii. If $w_i = \frac{1}{n}\ \forall\, i$, we get the formula for the simple AM (given by (1)), where equal weights
have been given to each value.
iii. In case of a grouped frequency distribution, in the formula given by (2), we
take $x_i$ as the class mark of the ith class to get an approximate AM of the data set.]
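As a small illustration of formulas (1) and (2), the sketch below computes the mean once from a frequency table and once from the expanded raw observations. The data values and the function names are hypothetical, chosen only for illustration.

```python
def simple_mean(values):
    """Formula (1): sum of the observations divided by their number."""
    return sum(values) / len(values)


def weighted_mean(xs, fs):
    """Formula (2): sum of f_i * x_i divided by n = sum of f_i."""
    n = sum(fs)
    return sum(f * x for x, f in zip(xs, fs)) / n


# Hypothetical ungrouped frequency distribution (illustrative values only).
xs = [1, 2, 3, 4]
fs = [2, 3, 4, 1]

# Expanding the table into raw observations should give the same mean.
raw = [x for x, f in zip(xs, fs) for _ in range(f)]
mean_from_table = weighted_mean(xs, fs)
mean_from_raw = simple_mean(raw)
```

Both routes divide the same total (24) by the same n (10), so they agree.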

Result-1: [effect of change of origin/base and unit/scale on arithmetic mean]


Let y be a variable which is obtained from a variable x by changing the origin (from 0 to ‘a’)
and the unit/scale (‘b’ is taken as the unit, with $b \ne 0$), so that
\[ y = \frac{x - a}{b} \;\Rightarrow\; x = a + by \]
Then the arithmetic mean of y will be
\[ \bar{y} = \frac{\bar{x} - a}{b} \tag{3} \]
Proof: Let $x_1, x_2, \ldots, x_n$ be n observations on some variable x under study.
Since a new variable y is related to x as $x = a + by$,
for the ith observation we have
\[ x_i = a + b y_i \]
Summing over all i and dividing by n, we have
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\sum_{i=1}^{n} (a + b y_i) = \frac{1}{n}\Big(na + b\sum_{i=1}^{n} y_i\Big) = \frac{na}{n} + b\cdot\frac{1}{n}\sum_{i=1}^{n} y_i = a + b\bar{y} \]
\[ \Rightarrow \bar{y} = \frac{\bar{x} - a}{b} \]
[This result is very useful to reduce computational labour of arithmetic mean.]
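The labour-saving use of (3) can be sketched as follows: pick a convenient origin a and scale b, average the small coded values y, and transform back. The observations and the choice a = 1030, b = 10 are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical observations that are awkward to average by hand.
x = [1010, 1020, 1030, 1040, 1050]

# Assumed convenient origin and scale for coding the data.
a, b = 1030, 10
y = [(xi - a) / b for xi in x]      # coded values: -2, -1, 0, 1, 2

y_bar = mean(y)
x_bar_shortcut = a + b * y_bar      # from (3): x-bar = a + b * y-bar
x_bar_direct = mean(x)
```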

Result-2: Let there be two sets of observations with respective sizes $n_1$ and $n_2$ and
respective arithmetic means $\bar{x}_1$ and $\bar{x}_2$. Then the arithmetic mean for the combined set of
$n_1 + n_2$ observations is given by
\[ \bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \tag{4} \]
Proof: Let the two sets of observations be
\[ x_{11}, x_{12}, \cdots, x_{1i}, \cdots, x_{1n_1} \quad \text{and} \quad x_{21}, x_{22}, \cdots, x_{2j}, \cdots, x_{2n_2} \]
Then
\[ \bar{x}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i} \;\Rightarrow\; \sum_{i=1}^{n_1} x_{1i} = n_1\bar{x}_1 \]
and
\[ \bar{x}_2 = \frac{1}{n_2}\sum_{j=1}^{n_2} x_{2j} \;\Rightarrow\; \sum_{j=1}^{n_2} x_{2j} = n_2\bar{x}_2 \]
Now, the arithmetic mean for the combined set of $n_1 + n_2$ observations will be
\[ \bar{x}_c = \frac{\sum_{i=1}^{n_1} x_{1i} + \sum_{j=1}^{n_2} x_{2j}}{n_1 + n_2} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \]
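A quick numerical check of formula (4), using two small hypothetical data sets: the combined mean computed from the sizes and the two means should match the mean of the pooled observations.

```python
def mean(values):
    return sum(values) / len(values)


def combined_mean(n1, mean1, n2, mean2):
    """Formula (4): size-weighted average of the two group means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical data sets (illustrative values only).
set1 = [2, 4, 6]     # mean 4
set2 = [10, 20]      # mean 15

pooled = combined_mean(len(set1), mean(set1), len(set2), mean(set2))
direct = mean(set1 + set2)
```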
Result-3: If $\bar{x}_1$ and $\bar{x}_2$ are respectively the arithmetic means of two sets of observations
with respective sizes $n_1$ and $n_2$, and $\bar{x}_c$ is the arithmetic mean for the combined set of
$n_1 + n_2$ observations, then $\bar{x}_c$ lies between $\bar{x}_1$ and $\bar{x}_2$.
Proof: Without any loss of generality, let us assume that $\bar{x}_1 \le \bar{x}_2$.
Then we need to prove that $\bar{x}_1 \le \bar{x}_c \le \bar{x}_2$.
We have
\[ \bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \quad \text{[as proved in Result-2]} \]
Since $\bar{x}_1 \le \bar{x}_2$ (by assumption),
\[ \bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \le \frac{n_1\bar{x}_2 + n_2\bar{x}_2}{n_1 + n_2} = \bar{x}_2 \tag{5} \]
and
\[ \bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} \ge \frac{n_1\bar{x}_1 + n_2\bar{x}_1}{n_1 + n_2} = \bar{x}_1 \tag{6} \]
Combining (5) and (6), we have
\[ \bar{x}_1 \le \bar{x}_c \le \bar{x}_2 \tag{7} \]

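Result-4: If $z = ax + by$, then $\bar{z} = a\bar{x} + b\bar{y}$.
[Do it yourself]
Result-5: Let x be a variable taking the values $1, 2, \cdots, k$ and let $F'_1 = n, F'_2, \cdots, F'_k$ be the
corresponding cumulative frequencies of the ‘greater than’ type. Then
\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{k} F'_i \]

Proof: Let $f_i$ be the frequency of $x_i = i$, for $i = 1, 2, \cdots, k$.
Since $F'_i$, $i = 1, 2, \cdots, k$, are the cumulative frequencies of the ‘greater than’ type, we have
$F'_i = \sum_{r=i}^{k} f_r$, so that
\begin{align*}
\frac{1}{n}\sum_{i=1}^{k} F'_i &= \frac{1}{n}\sum_{i=1}^{k}\sum_{r=i}^{k} f_r \\
&= \frac{1}{n}\big[\, f_1 + f_2 + \cdots + f_{k-1} + f_k \\
&\qquad\quad + f_2 + \cdots + f_{k-1} + f_k \\
&\qquad\quad + \cdots + f_{k-1} + f_k \\
&\qquad\quad + f_k \,\big] \\
&= \frac{1}{n}\big[\, 1\cdot f_1 + 2\cdot f_2 + \cdots + (k-1) f_{k-1} + k\cdot f_k \,\big] \\
&= \frac{1}{n}\sum_{i=1}^{k} i f_i = \frac{1}{n}\sum_{i=1}^{k} x_i f_i = \bar{x}
\end{align*}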
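The cumulative-frequency identity above can be checked numerically. The sketch below assumes the variable takes the values 1, 2, ..., k and uses a hypothetical list of frequencies; the function name is illustrative.

```python
def mean_from_greater_than_cf(fs):
    """fs[i-1] is the frequency of the value i, for i = 1..k.

    Returns (1/n) * sum of the 'greater than' type cumulative frequencies.
    """
    n = sum(fs)
    k = len(fs)
    # F'_i = f_i + f_{i+1} + ... + f_k, so that F'_1 = n.
    cumulative = [sum(fs[i:]) for i in range(k)]
    return sum(cumulative) / n


# Hypothetical frequencies of the values 1, 2, 3, 4.
fs = [3, 5, 2, 4]

direct = sum(i * f for i, f in enumerate(fs, start=1)) / sum(fs)
via_cf = mean_from_greater_than_cf(fs)
```

Here the cumulative frequencies are 14, 11, 6 and 4, whose sum (35) equals the sum of i times f_i.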

Median (Positional Mean):


Consider an example of data on family incomes of 11 families (in ₹ ‘000):
12, 9, 15, 8, 7, 14, 11, 10, 8, 5, 120
In such a situation, if we take the A.M. as the measure of central tendency, it will not be a good
representative of the data set (here the A.M. is 219/11 ≈ 19.9, which exceeds ten of the eleven incomes).
Note: In the presence of extreme values, a moment measure like the A.M. is not appropriate. In such
cases, one of the following may be done:
i. Extreme values may be ignored and the mean may be calculated from the remaining
observations. (Here we need to decide upon the criterion for marking the extreme
values – one tool for deciding is the box plot, which will be discussed later.)
ii. Fractile measures – measures based on relative position – may be used. [The pth fractile is that
value below which 100p% of the total number of observations lie.]

Median is such a fractile measure.


After arranging all the observations in increasing order, we search for a value such that the
number of observations below the value is equal to the number of observations above it.
This value is the median of the entire data set.
1. Basic Data:
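Let $x_1, x_2, \ldots, x_n$ be n observations on some variable x under study. After arranging them in
non-decreasing order, we have
\[ x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)} \]
If n is odd, then $\tilde{x} = x_{\left(\frac{n+1}{2}\right)}$ is the median.
If n is even, we have two middlemost values, $x_{\left(\frac{n}{2}\right)}$ and $x_{\left(\frac{n}{2}+1\right)}$.
In that case, the median is not rigidly defined, as any value between $x_{\left(\frac{n}{2}\right)}$ and $x_{\left(\frac{n}{2}+1\right)}$ may be
taken as the median. For uniqueness we take
\[ \tilde{x} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2} \]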
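The definition for basic data can be sketched in a few lines. The example applies it to the family-income data quoted earlier in these notes and also computes the A.M. for comparison; the function name is illustrative.

```python
def median(values):
    """Median of raw data: middle order statistic, or the average of the two middle ones."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # x_((n+1)/2) in 1-based notation
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of x_(n/2) and x_(n/2 + 1)


# Family incomes (in thousand rupees) from the example in these notes.
incomes = [12, 9, 15, 8, 7, 14, 11, 10, 8, 5, 120]

am = sum(incomes) / len(incomes)   # pulled upward by the single value 120
med = median(incomes)              # the 6th of the 11 ordered values
```

The median stays among the typical incomes, while the A.M. is dragged towards the extreme value.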

2. Ungrouped Frequency Distribution:


In case of an ungrouped frequency distribution, we construct less-than equal to type (or more-
than equal to type ) cumulative frequency column:
x frequency c.f (≤ type)
𝑥1 𝑓1 𝐹1
𝑥2 𝑓2 𝐹2
⋮ ⋮ ⋮
𝑥𝑚−1 𝑓𝑚−1 𝐹𝑚−1
𝑥𝑚 𝑓𝑚 𝐹𝑚
⋮ ⋮ ⋮
𝑥𝑘 𝑓𝑘 𝐹𝑘

We find m such that $F_{m-1} < \frac{n}{2}$ (or $\frac{n+1}{2}$) and $F_m \ge \frac{n}{2}$; then $x_m$ is the median.
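A minimal sketch of this rule, assuming the x values are listed in increasing order and using the n/2 version of the condition. The table values and the function name are hypothetical.

```python
def median_from_frequency_table(xs, fs):
    """Return the first x_m whose cumulative frequency F_m reaches n/2.

    Assumes xs is sorted in increasing order and fs holds the matching frequencies.
    """
    n = sum(fs)
    cumulative = 0
    for x, f in zip(xs, fs):
        cumulative += f
        if cumulative >= n / 2:
            return x
    raise ValueError("empty frequency table")


# Hypothetical ungrouped frequency distribution.
xs = [1, 2, 3, 4, 5]
fs = [2, 3, 6, 3, 1]

med = median_from_frequency_table(xs, fs)
```

When n is even and some cumulative frequency equals n/2 exactly, this rule returns that x value, whereas the convention for basic data would average it with the next value; which convention to apply there is a choice this sketch does not settle.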
3. Grouped Frequency Distribution:
In case of a grouped frequency distribution, using cumulative frequencies we can locate the
class that includes the median. It is referred to as the median class; let us denote it by [$x_{mL}$ – $x_{mU}$].
A crude approximation to the median may be taken as $\tilde{x} = \frac{x_{mL} + x_{mU}}{2}$.
To get an improved formula, let us look at the segment of the less-than-or-equal-to type ogive within
the median class.
If we can assume that the frequencies are uniformly distributed over the median class, then
we can use linear interpolation.
[Please recall that the ≤ type cumulative frequency of a class corresponds to its upper
boundary, so that we have

Class    Upper boundary of class       c.f. (≤ type)
m-1      $x_{(m-1)U} = x_{mL}$         $F_{m-1}$
         $\tilde{x}$ (unknown)         $\frac{n}{2}$
m        $x_{mU}$                      $F_m$
]
Then, using linear interpolation,
\[ \frac{\tilde{x} - x_{mL}}{x_{mU} - x_{mL}} = \frac{\frac{n}{2} - F_{m-1}}{F_m - F_{m-1}} \]
\[ \Rightarrow \tilde{x} = x_{mL} + \frac{\frac{n}{2} - F_{m-1}}{f_m}\cdot w_m \tag{8} \]

where $x_{mL}$ and $x_{mU}$ are the lower and the upper boundaries respectively of the median
class,
n is the total frequency,
$f_m = F_m - F_{m-1}$ is the frequency of the median class,
$w_m = x_{mU} - x_{mL}$ is the width of the median class,
$F_{m-1}$ is the cumulative frequency of the class preceding the median class.
Note that we require neither classes of equal width nor uniformity outside the median class.
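Formula (8) can be sketched as below. The class boundaries and frequencies are hypothetical, and the function assumes the boundaries are given as k+1 increasing numbers for k classes.

```python
def grouped_median(boundaries, fs):
    """Median by linear interpolation inside the median class, as in formula (8).

    boundaries: k+1 class boundaries in increasing order.
    fs: the k class frequencies.
    """
    n = sum(fs)
    cumulative = 0  # F_{m-1} at the moment class m is examined
    for m, f in enumerate(fs):
        if f > 0 and cumulative + f >= n / 2:
            lower, upper = boundaries[m], boundaries[m + 1]
            return lower + (n / 2 - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("no observations")


# Hypothetical grouped frequency distribution.
boundaries = [0, 10, 20, 30, 40]
fs = [5, 15, 20, 10]

med = grouped_median(boundaries, fs)
```

Here n = 50, the median class is 20 to 30 with F_{m-1} = 20 and f_m = 20, so the median is 20 + (25 - 20)/20 × 10 = 22.5.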

Result: Effect of change of origin and scale on Median


If y = ax + b and median of x be 𝑥𝑚 , then median of y will be (a𝑥𝑚 + b).
Proof:
Case 1: a>0
𝑥(1) ≤ 𝑥(2) ≤ ⋯ ≤ 𝑥(𝑛) gives
a𝑥(1) + 𝑏 ≤ 𝑎𝑥(2) + 𝑏 ≤ ⋯ ≤ 𝑎𝑥(𝑛) + 𝑏

i.e, the relative positions of y values will be same as the corresponding x values. Thus, the
median of y will be the value of y corresponding to the median of x,
i.e, 𝑦𝑚 = 𝑎𝑥𝑚 + 𝑏.

Case 2: a<0
Relative position of the y values will be the reverse of that of the corresponding x values,
i.e, a𝑥(1) + 𝑏 ≥ 𝑎𝑥(2) + 𝑏 ≥ ⋯ ≥ 𝑎𝑥(𝑛) + 𝑏.

Hence, in this case too, the median of y will be the value of y corresponding to the median
of x, i.e, 𝑦𝑚 = 𝑎𝑥𝑚 + 𝑏. -------------------- (9)
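A numerical check of (9) for one positive and one negative value of a, using Python's standard `statistics.median` on a small hypothetical data set:

```python
import statistics


def median_after_transform(x, a, b):
    """Median of y = a*x + b computed directly from the transformed values."""
    return statistics.median([a * xi + b for xi in x])


# Hypothetical observations; their median is 3.
x = [3, 1, 4, 1, 5, 9, 2]
median_x = statistics.median(x)

positive_case = median_after_transform(x, 2, 3)    # a > 0
negative_case = median_after_transform(x, -2, 3)   # a < 0
```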
Result: The abscissa of the point of intersection of the two types of ogives gives the
median.
Proof: The less-than-or-equal-to type ogive and the greater-than-or-equal-to type ogive
are, by nature, monotonically increasing and monotonically decreasing respectively over
the same range of variation of the variable. Hence, they are expected to intersect.

[Figure: the ≤ type ogive and the ≥ type ogive, intersecting at the point I(𝑥0 , 𝑦0 )]
Let I(𝑥0 , 𝑦0 ) be the point of intersection of the two ogives.
Then, I is a point on the “≤” Type ogive.
⇒ 𝑦0 observations have values ≤ 𝑥0 .
Again, I is a point on the “≥” Type ogive.
⇒ 𝑦0 observations have values ≥ 𝑥0 .
∴ No.of observations with values ≤ 𝑥0 = No. of observations with values ≥ 𝑥0
Thus, by definition, 𝑥0 is the median. But 𝑥0 is the point of intersection of the two ogives.
Hence the result.
Result: Let there be two sets of observations with respective sizes $n_1$ and $n_2$ and respective
medians $M_1$ and $M_2$. Let M be the median of the combined set. Then M lies between $M_1$ and $M_2$.
Proof: Without any loss of generality, we assume that $M_1 \le M_2$.
[And for simplicity we take both $n_1$ and $n_2$ to be even.]
Since $M_1$ is the median of the first data set with $n_1$ observations,
$\frac{n_1}{2}$ observations of the first set are below $M_1$.

Let $n_0$ be the number of observations of the second set which are below $M_1$. ----- (10)
∴ $\left(\frac{n_1}{2} + n_0\right)$ observations in total (combined set) are below $M_1$. ----- (11)

$M_2$ is the median of the 2nd data set with $n_2$ observations.
Then $\frac{n_2}{2}$ observations of the 2nd set are below $M_2$. ---- (12)

Since $M_1 \le M_2$ by assumption, from (10) and (12),
\[ n_0 \le \frac{n_2}{2} \]
\[ \Rightarrow n_0 + \frac{n_1}{2} \le \frac{n_1 + n_2}{2} = \text{number of observations of the combined set that are} \le M. \]

∴ number of observations below $M_1$ ≤ number of observations below M
⇒ $M_1 \le M$ ----- (13)

Again, $\frac{n_2}{2}$ observations of the 2nd set lie above $M_2$.
Let k observations of the first set also lie above $M_2$.
Then $\left(\frac{n_2}{2} + k\right)$ observations of the combined set lie above $M_2$.

However, by assumption, $M_1 \le M_2$
⇒ $k \le \frac{n_1}{2}$
[since k observations of the first set lie above $M_2$ and $\frac{n_1}{2}$ observations of the first
set are above $M_1$]
⇒ $\frac{n_2}{2} + k \le \frac{n_1 + n_2}{2}$
⇒ number of observations above $M_2$ ≤ number of observations above M
⇒ $M \le M_2$ ---- (14)
Combining (13) & (14) we get $M_1 \le M \le M_2$ ---- (15)
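The result can be illustrated on a small example. The two data sets below are hypothetical, both of even size as in the proof, and the check uses `statistics.median` from the Python standard library.

```python
import statistics


def combined_median_is_between(set1, set2):
    """True when the median of the pooled data lies between the two group medians."""
    m1 = statistics.median(set1)
    m2 = statistics.median(set2)
    m = statistics.median(list(set1) + list(set2))
    return min(m1, m2) <= m <= max(m1, m2)


# Hypothetical data sets.
set1 = [1, 2, 3, 100]   # median 2.5
set2 = [4, 5, 6, 7]     # median 5.5

combined = statistics.median(set1 + set2)   # median of the pooled data
holds = combined_median_is_between(set1, set2)
```

A single example is of course no proof; it only shows the inequality (15) at work.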

Geometric Mean:
The geometric mean of n observations $x_1, x_2, \ldots, x_n$ on some variable x is defined as
\[ x_g = \Big(\prod_{i=1}^{n} x_i\Big)^{1/n} \tag{16} \]
\[ \Rightarrow \log x_g = \frac{1}{n}\sum_{i=1}^{n} \log x_i \tag{17} \]

Note:
i. The logarithm of geometric mean of a variable is the arithmetic mean of its logarithm.
ii. If the variable relates to the rate and ratios (i.e., the observations are in multiplicative
mode, rather than additive mode), then geometric mean is an appropriate measure of
average.
iii. The above statement may be justified by the following result.
Result: If $z = \frac{x}{y}$, and the geometric means of x, y and z respectively are
$x_g$, $y_g$ and $z_g$, then $z_g = \frac{x_g}{y_g}$, i.e., the geometric mean of the ratio of two
variables is the ratio of their geometric means.
Proof: For the ith unit, $z_i = \frac{x_i}{y_i}$ for $i = 1, 2, \cdots, n$.
By definition,
\[ z_g = \Big(\prod_{i=1}^{n} z_i\Big)^{1/n} = \Big(\prod_{i=1}^{n} \frac{x_i}{y_i}\Big)^{1/n} = \frac{\big(\prod_{i=1}^{n} x_i\big)^{1/n}}{\big(\prod_{i=1}^{n} y_i\big)^{1/n}} = \frac{x_g}{y_g} \tag{18} \]
iv. Some statisticians consider the geometric mean to be the natural form of average for
averaging price relatives (relative changes in the price of a commodity). The reason
may be described as follows:
Suppose the price relatives of two commodities, say A and B, are respectively 5 and $\frac{1}{5}$
[i.e., the new price of A is 5 times its old price, while for B it is $\frac{1}{5}$ times]. If equal
importance is assigned to each price relative, then on average there is no
change in the overall price level, meaning that the average price relative is 1, which is
nothing but the geometric mean of 5 and $\frac{1}{5}$, the price relatives of A and B
respectively.
v. Let us consider a data set like
\[ x_1 \approx ac,\ x_2 \approx ac^2,\ \cdots,\ x_i \approx ac^i,\ \cdots,\ x_n \approx ac^n \tag{19} \]
Note that the powers of ‘c’ are the first n natural numbers. Then we can get a central value
of the data set when ‘c’ has a power equal to a central value of 1, 2, ..., n, namely $\frac{n+1}{2}$.
This can be achieved approximately by computing the geometric mean given by (16).
[That is why the geometric mean comes in if one wants to compute the value at the mid-point
of a time interval when the variable changes exponentially over time.]
vi. How can you justify (17) and from that (16)?
[For the data like (19), log of the observations will be approximately in AP and
naturally for the transformed data set arithmetic mean will be an appropriate
measure of central tendency. Also log function is monotonic. Thus the formula.]
vii. Geometric mean makes sense when all the values of the variable are strictly positive.
viii. Because of its abstract nature and computational labour, geometric mean is not of
common use in statistical work.
ix. For a frequency distribution,
\[ x_g = \Big(\prod_{i=1}^{k} x_i^{f_i}\Big)^{1/\sum f_i} \tag{20} \]
\[ \Rightarrow \log x_g = \frac{1}{n}\sum_{i=1}^{k} f_i \log x_i, \qquad n = \sum_{i=1}^{k} f_i \tag{21} \]
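A sketch of (16) and (20), both computed through their logarithmic forms (17) and (21). The function names are illustrative; the first example reuses the price relatives 5 and 1/5 from note iv, and the frequency table is hypothetical.

```python
import math


def geometric_mean(values):
    """Formula (16), evaluated via (17); every value must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_freq(xs, fs):
    """Formula (20), evaluated via (21), for a frequency distribution."""
    n = sum(fs)
    return math.exp(sum(f * math.log(x) for x, f in zip(xs, fs)) / n)


# Price relatives of commodities A and B from note iv.
price_relatives = [5, 1 / 5]
gm = geometric_mean(price_relatives)               # about 1: no overall change
am = sum(price_relatives) / len(price_relatives)   # 2.6: suggests a large rise

# Hypothetical frequency table: value 2 twice, value 8 twice.
gm_table = geometric_mean_freq([2, 8], [2, 2])
```

Working through logarithms avoids forming a very large or very small product when n is large.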

Question: Let there be two sets of observations with respective sizes $n_1$ & $n_2$ and respective
geometric means $G_1$ & $G_2$. What will be the G.M. of the combined set?
Hint: Set 1: $x_{11}, x_{12}, \ldots, x_{1n_1}$

Set 2: $x_{21}, x_{22}, \ldots, x_{2n_2}$

G.M. of the 1st set: $G_1 = \Big(\prod_{i=1}^{n_1} x_{1i}\Big)^{1/n_1}$

G.M. of the 2nd set: $G_2 = \Big(\prod_{j=1}^{n_2} x_{2j}\Big)^{1/n_2}$

Then, by definition, the G.M. of the combined set will be
\[ G_c = \Big(\prod_{i=1}^{n_1} x_{1i} \prod_{j=1}^{n_2} x_{2j}\Big)^{\frac{1}{n_1+n_2}} = \big(G_1^{\,n_1} G_2^{\,n_2}\big)^{\frac{1}{n_1+n_2}} \tag{22} \]

Generalisation: If there are k sets with respective sizes $n_1, n_2, \ldots, n_k$ and respective G.M.s
$G_1, G_2, \ldots, G_k$, then the G.M. for the combined set will be
\[ G_c = \Big(\prod_{i=1}^{k} G_i^{\,n_i}\Big)^{1/\sum_{i=1}^{k} n_i} \tag{23} \]
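Formula (23) can be checked against the G.M. of the pooled observations. The data sets and function names below are hypothetical.

```python
import math


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def combined_gm(sizes, gms):
    """Formula (23): pooled G.M. from the group sizes and the group G.M.s."""
    total = sum(sizes)
    return math.exp(sum(n * math.log(g) for n, g in zip(sizes, gms)) / total)


# Hypothetical data sets.
set1 = [1, 4, 16]   # G.M. 4
set2 = [3, 27]      # G.M. 9

pooled = combined_gm(
    [len(set1), len(set2)],
    [geometric_mean(set1), geometric_mean(set2)],
)
direct = geometric_mean(set1 + set2)
```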
Question: $G_1$ is a representative of the $n_1$ observations of the 1st set.
$G_2$ is a representative of the $n_2$ observations of the 2nd set.
If one takes $\frac{n_1 G_1 + n_2 G_2}{n_1 + n_2}$ to be a representative of the entire data set, what would be your opinion
about that measure?

Result: If y=ax, with a>0, then, 𝑔𝑦 =𝑎𝑔𝑥 .

Some observations:
i. G.M is rigidly defined but requires 𝑥𝑖 > 0 ∀ 𝑖.
ii. With respect to sampling fluctuations G.M is preferred to A.M.
iii. Further algebraic treatment is easy.
iv. Interpretation of G.M is not as simple as that of A.M.
v. Computational difficulty of G.M is more than that of A.M.
Problem:
For a frequency distribution, the upper class boundary bears a constant ratio r to the lower class
boundary. If $x_i$ and $f_i$, for $i = 1, 2, \cdots, k$, are respectively the class mark and the frequency of
the ith class and G is the geometric mean of the distribution, show that
\[ \log G = \log x_1 + \frac{\log r}{n}\sum_{i=1}^{k} (i-1) f_i, \quad \text{where } n = \sum_{i=1}^{k} f_i \]

Proof:
Let $x_{1l}$ be the lower boundary of the first class; then we have a frequency distribution as

Lower boundary     Upper boundary    Class mark                                                  Frequency
$x_{1l}$           $r\,x_{1l}$       $x_1 = \frac{(1+r)}{2}\,x_{1l}$                             $f_1$
$r\,x_{1l}$        $r^2 x_{1l}$      $x_2 = \frac{(1+r)}{2}\,r\,x_{1l} = r\,x_1$                 $f_2$
$r^2 x_{1l}$       $r^3 x_{1l}$      $x_3 = \frac{(1+r)}{2}\,r^2 x_{1l} = r^2 x_1$               $f_3$
⋮                  ⋮                 ⋮                                                           ⋮
$r^{i-1} x_{1l}$   $r^i x_{1l}$      $x_i = \frac{(1+r)}{2}\,r^{i-1} x_{1l} = r^{i-1} x_1$       $f_i$
⋮                  ⋮                 ⋮                                                           ⋮
$r^{k-1} x_{1l}$   $r^k x_{1l}$      $x_k = r^{k-1} x_1$                                         $f_k$

Now,
\begin{align*}
\log G &= \frac{1}{n}\sum_{i=1}^{k} f_i \log x_i \\
&= \frac{1}{n}\sum_{i=1}^{k} f_i \log\big(r^{i-1} x_1\big) = \frac{1}{n}\sum_{i=1}^{k} f_i \big[\log r^{i-1} + \log x_1\big] \\
&= \frac{1}{n}\log x_1 \sum_{i=1}^{k} f_i + \frac{1}{n}\sum_{i=1}^{k} f_i \log r^{i-1} \\
&= \log x_1 + \frac{\log r}{n}\sum_{i=1}^{k} (i-1) f_i
\end{align*}
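The identity can be verified numerically for one hypothetical distribution whose class boundaries grow by a constant ratio r. The lower boundary, the ratio and the frequencies below are illustrative assumptions.

```python
import math


def log_gm_direct(class_marks, fs):
    """(1/n) * sum of f_i * log(x_i), the definition of log G."""
    n = sum(fs)
    return sum(f * math.log(x) for x, f in zip(class_marks, fs)) / n


def log_gm_shortcut(x1, r, fs):
    """log x_1 + (log r / n) * sum of (i - 1) * f_i, with i running from 1 to k."""
    n = sum(fs)
    # enumerate starts at 0, which is exactly (i - 1) for the 1-based class index i
    return math.log(x1) + math.log(r) / n * sum(j * f for j, f in enumerate(fs))


# Hypothetical distribution: first lower boundary 10, ratio r = 2, four classes.
lower, r = 10.0, 2.0
fs = [4, 7, 5, 2]
class_marks = [lower * r**j * (1 + r) / 2 for j in range(len(fs))]

direct = log_gm_direct(class_marks, fs)
shortcut = log_gm_shortcut(class_marks[0], r, fs)
```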
Harmonic Mean:
Let us consider the following problems of finding the average speed:

1. A car travels certain distance in ‘n’ hours. During ith hour it travels at a constant speed
𝑣𝑖 km per hour, i= 1(1) n.

Then the average speed of the car will be
\[ \frac{\text{total distance covered}}{\text{total time taken}} = \frac{\sum_{i=1}^{n} v_i}{n} \tag{24} \]

which is nothing but the arithmetic mean of the values assumed by the variable v which
denotes speed here.

2. Now suppose the car travels the 1st s km at a speed of $v_1$ km per hour, the 2nd s km at a speed of $v_2$ km per
hour, and so on, and the last (i.e. nth) s km at a speed of $v_n$ km per hour.

Then the average speed of the car will be
\[ \frac{\text{total distance covered}}{\text{total time taken}} = \frac{ns}{\sum_{i=1}^{n} \frac{s}{v_i}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{v_i}} \tag{25} \]

It may be noted that in both cases the variable under study is ‘speed’, $v = \frac{d}{t}$, where d denotes
distance and t denotes time, and yet while calculating the average we end up with two different
formulae. The reason is that in both cases the variable under study is of the form ‘a per
unit b’, but in the 1st case the numerator a (here d) varies, whereas in the 2nd case the denominator
b (here t) varies, and accordingly we get two different formulae. The 2nd one is known as the harmonic
mean.

Thus, if the variable under study is of the form ‘a per unit b’ and the denominator b varies,
then an appropriate measure of central tendency is the harmonic mean.

Suppose the variable under study be x.

Basic data:
$x_1, x_2, \ldots, x_n$, i.e., $x_i$, $i = 1(1)n$, with $x_i \ne 0\ \forall\, i$

Then the formula for the harmonic mean is given by
\[ \bar{x}_h = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \tag{26} \]

It may be noted that this formula can be expressed as ‘the reciprocal of the arithmetic mean of
the reciprocals (of the variable x)’.
For an ungrouped frequency distribution like

x 𝑥1 𝑥2 ⋯⋯ 𝑥𝑘 Total
Frequency.(f) 𝑓1 𝑓2 ⋯⋯ 𝑓𝑘 n
The formula for (weighted) harmonic mean is given by

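\[ \bar{x}_h = \frac{\sum_{i=1}^{k} f_i}{\sum_{i=1}^{k} \frac{f_i}{x_i}} \tag{27} \]
\[ = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}} = \frac{1}{\sum_{i=1}^{k} \frac{f_i}{n}\cdot\frac{1}{x_i}} = \frac{1}{\sum_{i=1}^{k} \frac{w_i}{x_i}} \]

where $w_i = \frac{f_i}{n}$, the relative frequency of $x_i$, is the weight attached to the value $x_i$ of the variable
x.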

[In case of grouped frequency distribution, in formula (27), 𝑥𝑖 can be taken as the class mark of
the ith class.]
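The equal-distance speed problem and formulas (26) and (27) can be sketched as follows. The speeds, the stretch length s and the function names are hypothetical.

```python
def harmonic_mean(values):
    """Formula (26): n divided by the sum of the reciprocals."""
    return len(values) / sum(1 / v for v in values)


def weighted_harmonic_mean(xs, fs):
    """Formula (27): total frequency divided by the sum of f_i / x_i."""
    return sum(fs) / sum(f / x for x, f in zip(xs, fs))


# Hypothetical trip: three stretches of s = 60 km each, at 30, 60 and 20 km/h.
speeds = [30, 60, 20]
s = 60
total_time = sum(s / v for v in speeds)         # 2 + 1 + 3 hours
average_speed = len(speeds) * s / total_time    # total distance / total time

hm = harmonic_mean(speeds)
am = sum(speeds) / len(speeds)                  # overstates the average speed here
```

The true average speed (total distance over total time) matches the harmonic mean, not the arithmetic mean, because the time spent on each stretch varies.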

Result: Let there be two sets of observations with respective sizes 𝑛1 and 𝑛2 and respective
harmonic means ℎ1 and ℎ2 . Then the harmonic mean for the combined set of 𝑛1 +
𝑛2 observations will be
\[ h_c = \frac{n_1 + n_2}{\frac{n_1}{h_1} + \frac{n_2}{h_2}} \tag{28} \]

Proof: Let the two sets of observations be

1st set: 𝑥11 , 𝑥12 , . . . . . . . , 𝑥1𝑛1 , 𝑖. 𝑒. 𝑥1𝑖 , 𝑖 = 1(1)𝑛1 𝑤𝑖𝑡ℎ 𝑥1𝑖 ≠ 0 ∀ 𝑖

2nd set: 𝑥21 , 𝑥22 , . . . . . . . , 𝑥2𝑛2 , 𝑖. 𝑒. 𝑥2𝑗 , 𝑗 = 1(1)𝑛2 𝑤𝑖𝑡ℎ 𝑥2𝑗 ≠ 0 ∀ 𝑗

Then by (26)
\[ h_1 = \frac{n_1}{\sum_{i=1}^{n_1} \frac{1}{x_{1i}}} \;\Rightarrow\; \sum_{i=1}^{n_1} \frac{1}{x_{1i}} = \frac{n_1}{h_1} \tag{29} \]

Similarly, for the 2nd set we have
\[ \sum_{j=1}^{n_2} \frac{1}{x_{2j}} = \frac{n_2}{h_2} \tag{30} \]
Again $h_c$, being the harmonic mean of the combined set of $n_1 + n_2$ observations, will be
\[ h_c = \frac{n_1 + n_2}{\sum_{i=1}^{n_1} \frac{1}{x_{1i}} + \sum_{j=1}^{n_2} \frac{1}{x_{2j}}} = \frac{n_1 + n_2}{\frac{n_1}{h_1} + \frac{n_2}{h_2}} \quad \text{[using (29) and (30)]} \]
[The result can easily be generalised for k sets of observations].
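A numerical check of (28) on two small hypothetical data sets:

```python
def harmonic_mean(values):
    return len(values) / sum(1 / v for v in values)


def combined_hm(n1, h1, n2, h2):
    """Formula (28): pooled harmonic mean from group sizes and group harmonic means."""
    return (n1 + n2) / (n1 / h1 + n2 / h2)


# Hypothetical data sets with non-zero values.
set1 = [2, 4, 4]
set2 = [1, 5]

pooled = combined_hm(len(set1), harmonic_mean(set1), len(set2), harmonic_mean(set2))
direct = harmonic_mean(set1 + set2)
```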

Result:
For a set of n positive observations AM ≥ GM ≥ HM
Proof:
Let 𝑥1 , 𝑥2 , . . . . . . . , 𝑥𝑛 , 𝑤𝑖𝑡ℎ 𝑥𝑖 > 0∀𝑖 be a set of n positive observations on some variable x.

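Then
\[ \text{AM} = \frac{1}{n}\sum_{i=1}^{n} x_i = A \ \text{(say)} \]
\[ \text{GM} = \Big(\prod_{i=1}^{n} x_i\Big)^{1/n} = G \ \text{(say)} \]
\[ \text{HM} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = H \ \text{(say)} \]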

First we shall prove AM ≥ GM.

To start with, let us first take n = 2.

Then we need to show
\[ \frac{x_1 + x_2}{2} \ge (x_1 x_2)^{1/2} \]

i.e., to show $x_1 + x_2 - 2\sqrt{x_1 x_2} \ge 0$,

i.e., $(\sqrt{x_1} - \sqrt{x_2})^2 \ge 0$, which is trivially true.

So AM ≥ GM for n = 2.

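Now for n = 4 observations,
\[ \text{AM} = \frac{x_1 + x_2 + x_3 + x_4}{4} = \frac{\frac{x_1 + x_2}{2} + \frac{x_3 + x_4}{2}}{2} \]
\[ \ge \frac{\sqrt{x_1 x_2} + \sqrt{x_3 x_4}}{2} \quad \text{[since AM ≥ GM for n = 2, applied to each pair]} \]
\[ \ge \sqrt{\big(\sqrt{x_1 x_2}\big)\big(\sqrt{x_3 x_4}\big)} \quad \text{[since AM ≥ GM for n = 2]} \]
\[ = \sqrt[4]{x_1 x_2 x_3 x_4} \]
Thus AM ≥ GM for n = 4.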

Proceeding in this way,


we can prove that AM ≥ GM for $n = 2^m$, for any positive integer m. ------------(31)

[We have taken n to be of the form $2^m$ because, while proving the result for n = 4,
we need to divide the observations into two halves so that we can use AM ≥ GM for
n = 2. If each half can again be divided into two halves, the fact that AM ≥ GM for n = 2 can
be used repeatedly. Thus at each step each half has to be divided into two halves, so
we need $n = 2^m$.]

Now suppose that $n \ne 2^m$.

Then we can always find some positive integer m such that
\[ 2^{m-1} < n < 2^m \]

It may be noted that the AM and GM of the given n observations are A and G respectively.
Let us augment $x_1, x_2, \ldots, x_n$ with $(2^m - n)$ additional observations
\[ x_{n+1} = x_{n+2} = \cdots = x_{2^m} = A \]

Now with these augmented observations we have $2^m$ observations having
\[ \text{AM} = \frac{\sum_{i=1}^{n} x_i + \sum_{i=n+1}^{2^m} x_i}{2^m} = \frac{nA + (2^m - n)A}{2^m} = A \]
and
\[ \text{GM} = \Big(\prod_{i=1}^{n} x_i \prod_{i=n+1}^{2^m} x_i\Big)^{\frac{1}{2^m}} = \big(G^n A^{2^m - n}\big)^{\frac{1}{2^m}} \]

Since we have proved that AM ≥ GM for $n = 2^m$,
\[ A \ge \big(G^n A^{2^m - n}\big)^{\frac{1}{2^m}} \;\Rightarrow\; A^{2^m} \ge G^n A^{2^m - n} \;\Rightarrow\; A^n \ge G^n \;\Rightarrow\; A \ge G \tag{32} \]
Thus
\[ \text{AM} \ge \text{GM} \tag{33} \]
for any number of positive observations on some variable x.

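Now, since $x_i > 0\ \forall\, i$, we can always define a new variable y as
\[ y = \frac{1}{x}, \quad \text{with } y_i > 0\ \forall\, i \]

Then from (33)
\[ \frac{1}{n}\sum_{i=1}^{n} y_i \ge \Big(\prod_{i=1}^{n} y_i\Big)^{1/n} \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n} \frac{1}{x_i} \ge \Big(\prod_{i=1}^{n} \frac{1}{x_i}\Big)^{1/n} \;\Rightarrow\; \frac{1}{H} \ge \frac{1}{\big(\prod_{i=1}^{n} x_i\big)^{1/n}} \]
\[ \Rightarrow \frac{1}{H} \ge \frac{1}{G} \;\Rightarrow\; G \ge H \]

i.e. GM ≥ HM ---------------------------------(34)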

Combining (33) and (34)

AM ≥ GM ≥ HM

Note that: Equality holds if and only if all the observations are equal.
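The chain of inequalities can be observed numerically. The sketch below computes the three means for a hypothetical set of positive observations and for a set of equal observations; the function name is illustrative.

```python
import math


def three_means(values):
    """Return (AM, GM, HM) for strictly positive values."""
    n = len(values)
    am = sum(values) / n
    gm = math.exp(sum(math.log(v) for v in values) / n)
    hm = n / sum(1 / v for v in values)
    return am, gm, hm


# Hypothetical positive observations: the three means are strictly ordered.
am, gm, hm = three_means([1, 2, 4, 8])

# Equal observations: the three means coincide (up to rounding).
am_eq, gm_eq, hm_eq = three_means([3, 3, 3])
```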

Problem: Let x be a variable assuming positive values only. Show that


a) The arithmetic mean of the reciprocal of x cannot be smaller than the reciprocal of its
arithmetic mean.
b) The arithmetic mean of the square root of x cannot be greater than the square root of its
arithmetic mean.
Proof: Let 𝑥1 , 𝑥2 , . . . . . . . , 𝑥𝑛 , 𝑤𝑖𝑡ℎ 𝑥𝑖 > 0∀𝑖 be a set of n positive observations on some
variable x.
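a) The arithmetic mean of the reciprocal of x is $\frac{1}{n}\sum_{i=1}^{n} \frac{1}{x_i}$ and the reciprocal of the
arithmetic mean of x is $\frac{n}{\sum_{i=1}^{n} x_i}$.
We know that for a set of n positive observations AM ≥ HM, i.e.,
\[ \frac{1}{n}\sum_{i=1}^{n} x_i \ge \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n} \frac{1}{x_i} \ge \frac{n}{\sum_{i=1}^{n} x_i} \]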

Hence the proof.


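b) The arithmetic mean of the square root of x is $\frac{1}{n}\sum_{i=1}^{n} \sqrt{x_i}$ and the square root of the
arithmetic mean of x is $\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i}$.

We know, from the Cauchy-Schwarz inequality, that for two sets of real numbers
$(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$,
\[ \Big(\sum_{i=1}^{n} a_i^2\Big)\Big(\sum_{i=1}^{n} b_i^2\Big) \ge \Big(\sum_{i=1}^{n} a_i b_i\Big)^2 \tag{35} \]

Let us take $a_i = \sqrt{x_i}$ and $b_i = 1$ in (35).

Then
\[ \Big(\sum_{i=1}^{n} x_i\Big)\Big(\sum_{i=1}^{n} 1\Big) \ge \Big(\sum_{i=1}^{n} \sqrt{x_i}\Big)^2 \]
\[ \Rightarrow \frac{1}{n}\sum_{i=1}^{n} x_i \ge \Big(\frac{1}{n}\sum_{i=1}^{n} \sqrt{x_i}\Big)^2 \quad \text{[dividing both sides by } n^2\text{]} \]
Taking the positive square root, we get the result.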
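Both parts of the problem can be spot-checked on a hypothetical set of positive values. This is only a numerical illustration, not a substitute for the proof.

```python
import math


def check_parts(values):
    """Return (part_a_holds, part_b_holds) for strictly positive values."""
    n = len(values)
    mean_x = sum(values) / n
    # a) mean of reciprocals is not smaller than the reciprocal of the mean
    part_a = sum(1 / v for v in values) / n >= 1 / mean_x
    # b) mean of square roots is not greater than the square root of the mean
    part_b = sum(math.sqrt(v) for v in values) / n <= math.sqrt(mean_x)
    return part_a, part_b


# Hypothetical positive observations.
result = check_parts([1, 4, 9, 16])
```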

Mode:
Mode is the value of a variable corresponding to the maximum of the ideal curve which gives
the closest possible fit to the actual distribution. It represents the value which is most frequent
or typical. Mode refers to the value which is, in fact the fashion (La Mode). Empirically, we
can find a crude approximation to the mode as follows:
1. For ungrouped frequency distribution, mode = 𝑥𝑚 , where 𝑥𝑚 is the observation
corresponding to the maximum frequency 𝑓𝑚 .
2. For a grouped frequency distribution, it is a bit difficult to obtain the mode. Under the
assumption of uniformity of distribution of frequencies over all classes, we can find
what is called the modal class, which corresponds to the maximum frequency. Let the mth
class be the modal class.
Then a very crude approximation to the mode is $\frac{x_{mL} + x_{mU}}{2}$,
where $x_{mL}$ and $x_{mU}$ are respectively the lower and upper class boundaries of
the modal class.
A modified formula involving the frequencies of the modal class and its two adjacent
classes is given by
\[ M_0 = x_{mL} + \frac{f_0 - f_{-1}}{2f_0 - f_{-1} - f_1}\, w_m \tag{36} \]
where $f_0$ is the frequency of the modal class,
$f_{-1}$ is the frequency of the class preceding the modal class,
$f_1$ is the frequency of the class next to the modal class,
$w_m$ is the width of the modal class.

There is an empirical formula for bell-shaped distributions which do not deviate much
from symmetry, given by

Mode = 3 median – 2 mean

Some distributions may have more than one mode. These are called multimodal
distributions.
Result: Effect of change of origin and scale on Mode:
If y = ax + b and mode of x be 𝑥0 , then mode of y will be (a𝑥0 + b).
