CH 3

This document discusses measures of central tendency, which are single values that describe the central characteristics of a data set. It introduces the arithmetic mean as a measure of central tendency that is calculated by summing all values and dividing by the total number of observations. The document outlines important properties of a good measure of central tendency, such as being easy to calculate and representative of the data. Finally, it discusses other common measures of central tendency including the median, mode, and quartiles.

Uploaded by

Byhiswill
Measures of central tendency

CHAPTER 3
3.1 Motivating Example

In the previous chapter, we dealt with collecting and summarizing data using tables and graphs. Now consider the following example: students from two or more classes sat an examination, and we wish to compare the performance of the classes, or to compare the performance of the same class after a period of coaching. When making such comparisons, it is not practicable to compare the full frequency distributions of marks, however compactly these may be presented. Therefore, for such statistical analysis, we need a single representative value that describes the entire mass of data given in the frequency distribution. This single representative value is called the central value, measure of location, or average, around which the individual values of a series cluster. It gives us the gist of the entire mass of data, and its value lies somewhere between the two extremes of the given observations. For this reason, such a central value or average is frequently called a measure of central tendency. That is, a single value that describes the characteristics of the entire mass of data is called a measure of central tendency or an average.
Note: The concept of a measure of central tendency is concerned only with quantitative variables and is undefined for qualitative variables, as these cannot be measured on a scale.

3.2. Objectives of Measures of Central Tendency

The following are the main objectives of having a measure of central tendency:

• To comprehend the data easily
• To get a single value that represents (describes) the characteristics of the entire data
• To summarize (reduce) the volume of the data
• To facilitate comparison within one group or between groups of data
• To enable further statistical analysis

3.3. Summation Notation

The symbol $\Sigma$ denotes "the sum of". Suppose you have n observations $x_1, x_2, \ldots, x_n$; then the sum $x_1 + x_2 + \cdots + x_n$ can be rewritten as

$$\sum_{i=1}^{n} x_i$$

and read as "the sum of all $x_i$" (all measurements), $i = 1, 2, \ldots, n$.

Some Properties of the Summation Notation


1. $\sum_{i=1}^{n} c = n \cdot c$, where c is a constant number.
2. $\sum_{i=1}^{n} b\,x_i = b \sum_{i=1}^{n} x_i$, where b is a constant number.
3. $\sum_{i=1}^{n} (a + b\,x_i) = n \cdot a + b \sum_{i=1}^{n} x_i$, where a and b are constant numbers.
4. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$
5. $\sum_{i=1}^{n} x_i y_i \neq \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$ (in general)
Example 3.1:
Let

$$\sum_{i=1}^{12} x_i = 26, \quad \sum_{i=1}^{12} y_i = 17, \quad \sum_{i=1}^{12} x_i^2 = 484, \quad \sum_{i=1}^{12} y_i^2 = 362$$

then find

i) $\sum_{i=1}^{12} (4x_i + 3y_i)$,  ii) $\sum_{i=1}^{12} 2x_i(x_i - 7)$
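
The notes leave Example 3.1 unsolved. One way to work it, using only the linearity properties 2 and 4 listed above, is sketched here:

$$\sum_{i=1}^{12} (4x_i + 3y_i) = 4\sum_{i=1}^{12} x_i + 3\sum_{i=1}^{12} y_i = 4(26) + 3(17) = 155$$

$$\sum_{i=1}^{12} 2x_i(x_i - 7) = 2\sum_{i=1}^{12} x_i^2 - 14\sum_{i=1}^{12} x_i = 2(484) - 14(26) = 604$$

Note that part ii) needs $\sum x_i^2$ as a separate given quantity: by property 5, it cannot be obtained from $\sum x_i$ alone.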

3.4. Important Characteristics of a Good Measure of Central Tendency


A good measure of central tendency possesses most of the following properties:
• It should be easy to calculate and understand (interpret).
• It should be based on all the observations.
• It should be rigidly defined. The definition should be clear and unambiguous so that it leads to one and only one interpretation by different persons, and the personal biases of the investigator do not affect its value or usefulness.
• It should be representative of the data. If it is computed from a sample, the sample should be random enough to represent the population accurately.
• It should have sampling stability, i.e. it shouldn't be affected much by sampling fluctuations. This means that if we take two independent random samples of the same size from a given population and compute the average for each of them, the values obtained should not vary much from one another (we should expect approximately the same result from the two samples).
• It shouldn't be affected by extreme values. If a few very small or very large items are present in the data, they will shift the value of the average to one side or the other; hence the average chosen should be one that is not unduly influenced by extreme values.
• It should be capable of further statistical analysis and/or algebraic manipulation.
3.5. Measures of Central Tendencies (MCT)
In statistics, there are various types of measures of central tendency, each with its own advantages and disadvantages. The most commonly used measures of central tendency (MCT) include:

- Mean
- Median
- Mode
- Quantiles (quartiles, deciles and percentiles)
3.5.1. Mean
There are four types of means: the arithmetic mean, the weighted arithmetic mean, the harmonic mean and the geometric mean. Which of these is used depends on the nature of the data and the interest of the investigator.
3.5.1.1. Arithmetic mean ($\bar{X}$)

It is defined as the sum of the measurements of the items divided by the total number of items.

Arithmetic mean for ungrouped data (Frequency Distribution)

When the data are given in the form of an ungrouped frequency distribution, the formula for the mean is

$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$

where $f_i$ is the frequency of $x_i$ (the i-th distinct value) and $\sum_{i=1}^{k} f_i = n$. For raw data in which every observation is listed individually, each $f_i = 1$ and the formula reduces to $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

Example 3.2: The body lengths (in inches) of 10 full-term infants at birth were recorded as follows:

17.5, 19.5, 17.5, 19, 20, 21, 18, 19.5, 18, 10.75

Compute the mean length of the infants for these data.
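
The notes give no solution for this example. A minimal Python sketch of the computation follows; the function name `arithmetic_mean` is an illustrative choice, not something from the notes. With the data as listed the sum is 180.75, so the mean is 18.075 inches.

```python
# Body lengths (in inches) of the 10 infants from Example 3.2.
lengths = [17.5, 19.5, 17.5, 19, 20, 21, 18, 19.5, 18, 10.75]


def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


mean_length = arithmetic_mean(lengths)
print(f"n = {len(lengths)}, sum = {sum(lengths)}, mean = {mean_length:.3f}")
```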

Arithmetic Mean for Grouped data (Frequency Distribution)

If the data are given in the form of a continuous (grouped) frequency distribution, the sample mean can be computed as

$$\bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i} = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_k m_k}{f_1 + f_2 + \cdots + f_k}$$

where $m_i$ is the class mark of the i-th class, $i = 1, 2, \ldots, k$; $f_i$ is the frequency of the i-th class; k is the number of classes; and $\sum_{i=1}^{k} f_i = n$ (the total number of observations).

Example 3.3: The following table gives the daily wages of laborers. Calculate the average daily wage paid to a laborer.

Wages in birr         11-13  13-15  15-17  17-19  19-21  21-23  23-25
Number of laborers      3      4      5      6      6      4      3
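
The notes do not work this example through. The sketch below applies the grouped-data formula, taking each class mark as the midpoint of its class limits; `grouped_mean` is a hypothetical helper name. Here $\sum f_i m_i = 560$ and $\sum f_i = 31$, giving a mean of about 18.06 birr.

```python
# Class limits and frequencies from Example 3.3 (daily wages in birr).
classes = [(11, 13), (13, 15), (15, 17), (17, 19), (19, 21), (21, 23), (23, 25)]
frequencies = [3, 4, 5, 6, 6, 4, 3]


def grouped_mean(class_limits, freqs):
    """Mean of a grouped distribution: sum(f_i * m_i) / sum(f_i)."""
    # The class mark m_i is the midpoint of each class.
    marks = [(lower + upper) / 2 for lower, upper in class_limits]
    total_fm = sum(f * m for f, m in zip(freqs, marks))
    return total_fm / sum(freqs)


mean_wage = grouped_mean(classes, frequencies)
print(f"mean daily wage = {mean_wage:.2f} birr")
```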

Properties of the Arithmetic Mean

• The sum of the deviations of the items from their arithmetic mean is zero. That is, the algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is zero:

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$

• When a set of observations is divided into k groups and $\bar{x}_1$ is the mean of the $n_1$ observations of group 1, $\bar{x}_2$ is the mean of the $n_2$ observations of group 2, …, $\bar{x}_k$ is the mean of the $n_k$ observations of group k, then the combined mean, denoted by $\bar{x}_c$, of all observations taken together is given by

$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$

• If a wrong figure has been used in calculating a mean, we can correct the mistake if we know the correct figure that should have been used. Let $X_{wr}$ denote the wrong figure used in calculating the mean, $X_c$ the correct figure that should have been used, and $\bar{X}_{wr}$ the wrong mean calculated using $X_{wr}$; then the correct mean, $\bar{X}_{correct}$, is given by

$$\bar{X}_{correct} = \frac{n \bar{X}_{wr} + X_c - X_{wr}}{n}$$

• If the mean of $x_1, x_2, \ldots, x_n$ is $\bar{x}$, then
i) the mean of $x_1 \pm k, x_2 \pm k, \ldots, x_n \pm k$ will be $\bar{x} \pm k$;
ii) the mean of $kx_1, kx_2, \ldots, kx_n$ will be $k\bar{x}$.
Example 3.4:
Last year there were three sections taking the Stat 275 course in AMU. At the end of the semester, the three sections got average marks of 80, 83 and 76. There were 28, 32 and 35 students in the sections respectively. Find the mean mark for all the students.
Solution:

$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} = \frac{28(80) + 32(83) + 35(76)}{28 + 32 + 35} = \frac{7556}{95} = 79.54$$
Example 3.5:
The average weight of 10 students was calculated to be 65 kg, but later it was discovered that one measurement was misread as 40 kg instead of 80 kg. Calculate the corrected average weight.
Solution:

$$\bar{X}_{correct} = \frac{n \bar{X}_{wr} + X_c - X_{wr}}{n} = \frac{10(65) + 80 - 40}{10} = 69$$
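
The combined-mean and corrected-mean properties can be checked numerically. The following is a small sketch, with `combined_mean` and `corrected_mean` as hypothetical helper names, that reuses the figures of Examples 3.4 and 3.5.

```python
def combined_mean(sizes, means):
    """Combined mean of k groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Undo one misrecorded observation in a mean of n values."""
    return (n * wrong_mean + correct_value - wrong_value) / n


# Example 3.4: three sections of 28, 32 and 35 students.
section_mean = combined_mean([28, 32, 35], [80, 83, 76])

# Example 3.5: 40 kg was recorded where 80 kg should have been.
weight_mean = corrected_mean(10, 65, wrong_value=40, correct_value=80)

print(f"combined mean = {section_mean:.2f}, corrected mean = {weight_mean:.1f}")
```

The combined mean is a weighted mean of the group means with the group sizes as weights, which is why the larger section (35 students, mean 76) pulls the result below the simple average of 80, 83 and 76.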
3.5.1.2. Weighted Arithmetic Mean
In finding the arithmetic mean, all items were assumed to be of equal importance (each value in the data set has equal weight). When the observations have different weights, we use the weighted average or weighted arithmetic mean. Weights are assigned to each item in proportion to its relative importance.

If $x_1, x_2, \ldots, x_k$ represent the values of the items and $w_1, w_2, \ldots, w_k$ are the corresponding weights, then the weighted mean ($\bar{x}_w$) is given by

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_k x_k}{w_1 + w_2 + \cdots + w_k} = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$$

Example 3.6:
A student's final marks in Mathematics, Physics, Chemistry and Biology are 82, 80, 90 and 70 respectively. If the respective credits received for these courses are 3, 5, 3 and 1, determine the approximate average mark the student has got for one course.
Solution:
We use a weighted arithmetic mean, with the weight of each course taken as the number of credits received for that course.

x_i   82  80  90  70
w_i    3   5   3   1

$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{(3 \times 82) + (5 \times 80) + (3 \times 90) + (1 \times 70)}{3 + 5 + 3 + 1} = \frac{986}{12} = 82.17$$

Therefore, the average mark of the student for one course is approximately 82.
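
As a quick check of this calculation, here is a short sketch of the weighted mean; the name `weighted_mean` is illustrative only.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


marks = [82, 80, 90, 70]    # Mathematics, Physics, Chemistry, Biology
credits = [3, 5, 3, 1]      # credit hours used as weights

average_mark = weighted_mean(marks, credits)
unweighted = sum(marks) / len(marks)
print(f"weighted = {average_mark:.2f}, unweighted = {unweighted:.2f}")
```

Printing the unweighted mean alongside shows how much the credit weighting matters: the one-credit course with the lowest mark contributes far less to the weighted figure.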
Exercise 3.1:
If a student gets an A in a 4 cr. hrs course, a B in a 3 cr. hrs course and a D in a 2 cr. hrs course, what is the student's semester GPA?
Merits of Arithmetic Mean
• Arithmetic mean has a rigidly defined mathematical formula, so its value is always definite.
• It is calculated based on all observations.
• It is simple to calculate and easy to understand.
• It doesn't need arrangement of the data in increasing or decreasing order.
• It is capable of further algebraic treatment.
• It affords a good standard of comparison.
Drawbacks of Arithmetic Mean
• It is highly affected by extreme (abnormal) values in the series.
• It can be a number which does not exist in the series.
• It sometimes gives results which appear almost absurd. For example, we may get an average of '3.6 children' per family.
• It can't be calculated for open-ended classes.
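3.5.1.3. Geometric Mean (GM):
The geometric mean is used when the observed values are measured as ratios, percentages, proportions, indices or growth rates. For n observations $x_1, x_2, \ldots, x_n$,

$$GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$

If the observed data have frequencies,

$$GM = \sqrt[n]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_k^{f_k}}, \quad \text{where } n = \sum_{i=1}^{k} f_i$$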

Example 3.7: Compute the geometric mean for the following values: 2, 8, 6, 4, 10, 6, 8, 4
Solution:

Values        2   4   6   8   10   Total
Frequencies   1   2   2   2    1     8

$$GM = \sqrt[8]{2 \cdot 4^2 \cdot 6^2 \cdot 8^2 \cdot 10} = \sqrt[8]{737280} \approx 5.41$$
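
For larger data sets the product under the root quickly becomes very large, so in practice the geometric mean is often computed through logarithms. The sketch below takes that route; it assumes strictly positive values (a zero would make the logarithm undefined, even though the product formula would simply give 0), and `geometric_mean` is a hypothetical helper name.

```python
import math


def geometric_mean(values, freqs=None):
    """Geometric mean via logarithms; freqs defaults to 1 for every value."""
    if freqs is None:
        freqs = [1] * len(values)
    if any(x <= 0 for x in values):
        # Assumption of this sketch: the log route needs positive values.
        raise ValueError("values must be strictly positive")
    n = sum(freqs)
    log_sum = sum(f * math.log(x) for x, f in zip(values, freqs))
    return math.exp(log_sum / n)


# Example 3.7, once from the raw values and once from the frequency table.
gm_raw = geometric_mean([2, 8, 6, 4, 10, 6, 8, 4])
gm_freq = geometric_mean([2, 4, 6, 8, 10], [1, 2, 2, 2, 1])
print(f"GM (raw) = {gm_raw:.2f}, GM (frequency table) = {gm_freq:.2f}")
```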

Merits of Geometric Mean
• It is based upon all observations.
• It is capable of further algebraic treatment.
• It is rigidly defined.
Limitations of Geometric Mean
• Its calculation is not easy.
• It may not be defined even if a single observation is negative.
• If the value of one observation is zero, its value becomes zero.
3.5.1.4. Harmonic Mean:
It is an appropriate measure of central tendency in situations where the data pertain to speed, rate and time. The harmonic mean is defined as the number of values divided by the sum of the reciprocals of the individual observations.

Let $X_1, X_2, \ldots, X_n$ be n observations from the population. The sample harmonic mean ($\bar{X}_{HM}$) is given by:

$$\bar{X}_{HM} = \begin{cases} \dfrac{n}{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}} = \dfrac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}} & \text{if the data are not in frequency distribution form} \\[3ex] \dfrac{n}{\sum_{i=1}^{k} \dfrac{f_i}{x_i}} & \text{if the data are in frequency distribution form} \end{cases}$$

Merits of Harmonic Mean
• It is rigidly defined.
• It is capable of further algebraic treatment.
• It is not affected much by fluctuations of sampling.
• It is based on all observations in a distribution.
• It is used in situations where a small weight is given to larger observations and a larger weight to smaller observations.
Demerits of Harmonic Mean
• It is difficult to calculate and understand.
• It may not be defined even if a single observation is zero.
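Example 3.8:

A motorist travels 1440 km in 3 days. She travels for 10 hours at a rate of 48 km/hr on the 1st day, for 12 hours at a rate of 40 km/hr on the 2nd day and for 15 hours at a rate of 32 km/hr on the 3rd day. What is her average speed?

Solution: She covers the same distance, 480 km, on each day (10 × 48 = 12 × 40 = 15 × 32 = 480), so the harmonic mean of the three speeds is the appropriate average:

$$\bar{X}_{HM} = \frac{3}{\dfrac{1}{48} + \dfrac{1}{40} + \dfrac{1}{32}} = 38.92 \text{ km/hr}$$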
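
The harmonic mean of the speeds should agree with the direct definition of average speed, total distance divided by total time. The short sketch below (with `harmonic_mean` as an illustrative name) verifies this for the motorist's data.

```python
def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1 / x for x in values)


speeds = [48, 40, 32]   # km/hr on days 1, 2, 3
hours = [10, 12, 15]    # hours travelled on each day

hm_speed = harmonic_mean(speeds)

# Cross-check: total distance / total time.
total_distance = sum(s * h for s, h in zip(speeds, hours))
direct_speed = total_distance / sum(hours)

print(f"harmonic mean = {hm_speed:.2f} km/hr, direct = {direct_speed:.2f} km/hr")
```

The two figures coincide only because the distance covered is the same on each day; with unequal distances, a harmonic mean weighted by distance would be needed instead.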

 Relations among different means

i. $\bar{x} \geq GM \geq HM$ (for positive observations)

ii. For two observations, $\sqrt{\bar{x} \cdot HM} = GM$

iii. $\bar{x} = GM = HM$ if all observations have equal magnitude
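
Relation ii can be verified directly. Writing the two observations as a and b (symbols introduced here for illustration, both assumed positive):

$$\bar{x} = \frac{a + b}{2}, \qquad HM = \frac{2}{\dfrac{1}{a} + \dfrac{1}{b}} = \frac{2ab}{a + b}, \qquad GM = \sqrt{ab}$$

$$\bar{x} \cdot HM = \frac{a + b}{2} \cdot \frac{2ab}{a + b} = ab = GM^2 \quad\Longrightarrow\quad \sqrt{\bar{x} \cdot HM} = GM$$

For instance, with a = 2 and b = 8: $\bar{x} = 5$, $HM = 3.2$, $GM = 4$, and $\sqrt{5 \times 3.2} = 4$, which also illustrates the ordering in relation i.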

3.5.2. Median

The median is the value of a variable which divides a distribution into two equal parts. In a set of ordered observations, it is the value that has half of the observations below it and the remaining half above it.

Median for ungrouped data set


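Let $X_1, X_2, \ldots, X_n$ be n sample observations arranged in order; the median ($\tilde{x}$) is given by:

$$\tilde{x} = \begin{cases} X_{\frac{n+1}{2}} & \text{if the number of items, } n, \text{ is odd} \\[2ex] \dfrac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right) & \text{if the number of items, } n, \text{ is even} \end{cases}$$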
Example 3.9:
The birth weights in pounds of five babies born in a hospital on a certain day are 9.2, 6.4, 10.5, 8.1
and 7.8. Find the median weight of these five babies.
Solution: Arranging the data in ascending order gives 6.4, 7.8, 8.1, 9.2, 10.5. Since n = 5 is odd, the median is the (5 + 1)/2 = 3rd value, which is 8.1 pounds.
Example 3.10:
At-rest pulse rates for 16 athletes at a meet are: 53, 54, 56, 57, 58, 56, 54, 64, 67, 57, 54, 55, 57, 68, 60 and 58. Find the median value.
Solution: The first step in computing the median is to arrange the data in ascending order: 53, 54, 54, 54, 55, 56, 56, 57, 57, 57, 58, 58, 60, 64, 67, 68. Here n = 16 is even, so the appropriate formula is

$$\tilde{x} = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right) = \frac{1}{2}\left(X_{8} + X_{9}\right) = \frac{57 + 57}{2} = 57$$
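
Both cases of the ungrouped median (odd and even n) can be captured in one small function. The following is a sketch with `median` as an illustrative name; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Middle value of the ordered data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th ordered value (index mid, zero-based).
        return ordered[mid]
    # Even n: average of the (n/2)-th and (n/2 + 1)-th ordered values.
    return (ordered[mid - 1] + ordered[mid]) / 2


birth_weights = [9.2, 6.4, 10.5, 8.1, 7.8]                      # Example 3.9
pulse_rates = [53, 54, 56, 57, 58, 56, 54, 64,
               67, 57, 54, 55, 57, 68, 60, 58]                  # Example 3.10

print(median(birth_weights), median(pulse_rates))
```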

Median for grouped data

When the data are given with class intervals, the median is found as follows:
i. Arrange the distribution in ascending order.
ii. Construct the less-than cumulative frequency.
iii. Identify the median class interval. Let n be the number of observations in the distribution. First compute n/2. Then find the smallest less-than cumulative frequency that is greater than or equal to n/2. The class interval corresponding to that cumulative frequency is the median class interval.
iv. Find the unique median value using the formula

$$\tilde{X} = L_{med} + \frac{W}{f_{med}}\left(\frac{n}{2} - F\right)$$

Where: $L_{med}$ is the lower class boundary of the median class interval; n is the total number of observations in the distribution; F is the less-than cumulative frequency of the class immediately preceding the median class interval; W is the class width of the median class interval; $f_{med}$ is the frequency of the median class interval.
Merits of median
• It can be located even when the data are incomplete.
• It always exists.
• It is not affected by extreme values, i.e. it is insensitive to outliers.
• It is used to characterize qualitative data.
• In the case of grouped data, it can be calculated even if a class interval is open.
• It can be determined graphically.
Demerits of median
• Arranging the items in order of magnitude is sometimes a very tedious process if the number of items is very large.
• It is not capable of further algebraic treatment.
• It is more likely to be affected by sampling fluctuations.
• It is not a good representative of the data if the number of items (data) is small.
• It is not based on all observations.
Example 3.11:
The production of butter fat during 7 consecutive days was recorded for 75 cows, and the frequency distribution is given below. Find the median butter fat production.

Butter fat (kg)        40-44  45-49  50-54  55-59  60-64  65-69  70-74
Number of cows           7     10     22     15     12      6      3

Solution:
i. First find the less-than cumulative frequency.
ii. Identify the median class.
iii. Find the median using the formula.

Butter fat (kg)        40-44  45-49  50-54  55-59  60-64  65-69  70-74
Number of cows           7     10     22     15     12      6      3
Cumulative frequency     7     17     39     54     66     72     75

$$\frac{n}{2} = \frac{75}{2} = 37.5$$

39 is the first cumulative frequency greater than or equal to 37.5, so 50-54 is the median class. $L_{med} = 49.5$, $F = 17$, $W = 5$, $f_{med} = 22$.

$$\tilde{X} = L_{med} + \frac{W}{f_{med}}\left(\frac{n}{2} - F\right) = 49.5 + \frac{5}{22}(37.5 - 17) = 54.16$$

The median butter fat production during the 7 consecutive days was 54.16 kg. That is, 50% of the cows produced at most 54.16 kg of butter fat.
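
The four steps for the grouped median translate into a short loop over the classes. The sketch below assumes equal class widths and takes the lower class boundaries (39.5, 44.5, ...) as input; `grouped_median` is a hypothetical helper name.

```python
def grouped_median(lower_boundaries, width, freqs):
    """Median of a grouped distribution: L_med + (W / f_med) * (n/2 - F)."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0  # F: cumulative frequency of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        # The median class is the first one whose cumulative frequency reaches n/2.
        if cumulative + f >= n / 2:
            return lower + (width / f) * (n / 2 - cumulative)
        cumulative += f


# Butter fat data from Example 3.11 (classes 40-44, 45-49, ..., 70-74).
lower_boundaries = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5]
cows = [7, 10, 22, 15, 12, 6, 3]

butter_median = grouped_median(lower_boundaries, 5, cows)
print(f"median = {butter_median:.2f} kg")
```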
3.5.3. Mode ($\hat{x}$)
The mode is another measure of central tendency. The mode, or modal value, is the most frequently occurring score (observation) in a series. Note that the mode may not exist in the series or, even if it exists, it may not be unique.

Mode for ungrouped data
For ungrouped data, the mode can be obtained by inspection. We classify a distribution as unimodal, bimodal or multimodal according to the number of modal values that exist in the data set.

Mode for grouped data

To find the mode for continuous distributions, first check that the classes are arranged in ascending order and that all class intervals have equal width. When the data are given with class intervals, the unique modal value is found as follows:
• Find the modal class interval by inspection (the most frequent class interval).
• Let the maximum frequency be $f_p$; then the modal class interval is $x_p - x_{p+1}$.
• Compute the unique modal value by interpolation:

$$\hat{x} = L_o + \frac{\Delta_1}{\Delta_1 + \Delta_2} \cdot w$$

where $\Delta_1 = f_p - f_{p-1}$, $\Delta_2 = f_p - f_{p+1}$, $w = x_{p+1} - x_p$, and
$L_o$ - The lower class boundary of the modal class interval
$f_p$ - Frequency of the modal class interval
$f_{p-1}$ - Frequency of the class interval preceding the modal class interval
$f_{p+1}$ - Frequency of the class interval following the modal class interval
w - The class width of the modal class interval

Example 3.12:
In a random sample, 10 insects had the following weights in milligrams: 70, 120, 110, 101, 88, 83, 95, 98, 107, and 100. Find the modal value for the weights of the insects.
Solution: For ungrouped data the mode is found by inspection. In the data as listed, every weight occurs exactly once, so no value is more frequent than the others and the series has no mode.
Example 3.13: (From the butter fat example 3.11)

Compute the modal butter fat production during 7 consecutive days.

Solution: The data set given was the following:

Butter fat (kg)        40-44  45-49  50-54  55-59  60-64  65-69  70-74
Number of cows           7     10     22     15     12      6      3

50-54 is the modal class since it is the class with the highest frequency.

$L_o = 49.5$, $f_p = 22$, $f_{p-1} = 10$, $f_{p+1} = 15$, $w = 5$

$$\Delta_1 = f_p - f_{p-1} = 22 - 10 = 12$$
$$\Delta_2 = f_p - f_{p+1} = 22 - 15 = 7$$

$$\hat{x} = L_o + \frac{\Delta_1}{\Delta_1 + \Delta_2} \cdot w = 49.5 + \frac{12}{12 + 7} \times 5 = 49.5 + 3.16 = 52.66$$
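
The interpolation formula for the grouped mode can be evaluated mechanically on the butter fat data. This is a minimal sketch with `grouped_mode` as a hypothetical helper; it assumes equal class widths, picks the first class in case of a tie, and treats a missing neighbouring class as having frequency 0 (an assumption not stated in the notes).

```python
def grouped_mode(lower_boundaries, width, freqs):
    """Mode of a grouped distribution: L_o + d1 / (d1 + d2) * w."""
    # Index of the modal class (the first one if several share the maximum).
    p = max(range(len(freqs)), key=lambda i: freqs[i])
    f_prev = freqs[p - 1] if p > 0 else 0               # assumed 0 if absent
    f_next = freqs[p + 1] if p < len(freqs) - 1 else 0  # assumed 0 if absent
    d1 = freqs[p] - f_prev
    d2 = freqs[p] - f_next
    if d1 + d2 == 0:
        raise ValueError("modal class is not distinguishable from its neighbours")
    return lower_boundaries[p] + d1 / (d1 + d2) * width


# Butter fat data (classes 40-44, 45-49, ..., 70-74).
lower_boundaries = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5]
cows = [7, 10, 22, 15, 12, 6, 3]

butter_mode = grouped_mode(lower_boundaries, 5, cows)
print(f"mode = {butter_mode:.2f} kg")
```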

Merits of mode
• It is not affected by extreme values, i.e. it is insensitive to outliers.
• It can be located even when the data are incomplete.
• In the case of grouped data, it can be calculated even if a class interval is open.
• It can be determined graphically.
Demerits of mode
• The value of the mode cannot always be determined. In some cases the distribution may be bimodal or multimodal.
• It is more likely to be affected by sampling fluctuations.
• Arranging the items in order of magnitude is sometimes a very tedious process if the number of items is very large.
• It is not capable of further algebraic treatment.
• It is not based on all observations.
3.5.4. Quantiles
Quantiles are values which divide a data set, arranged in order of magnitude, into a certain number of equal parts. They are averages of position (measures of non-central location). Some of these are quartiles, deciles and percentiles.
3.5.4.1. Quartiles:
Quartiles are values which divide the data set into four equal parts, denoted by $Q_1$, $Q_2$ and $Q_3$. The first quartile is also called the lower quartile and the third quartile is the upper quartile. The second quartile is the median.

For ungrouped data
Let $Q_j$ be the j-th quartile value for j = 1, 2, 3. Then

$$Q_j = \left(\frac{j}{4}(n + 1)\right)^{th} \text{ item}; \quad j = 1, 2, 3.$$

For grouped data
Apply the following formula to find the j-th quartile:

$$Q_j = L_{Q_j} + \frac{W}{f_{Q_j}}\left(\frac{j \cdot n}{4} - F_{Q_j}\right); \quad j = 1, 2, 3.$$

Where
$Q_j$ = the j-th quartile we are going to calculate
$L_{Q_j}$ = Lower class boundary of the j-th quartile class
$F_{Q_j}$ = Sum of the frequencies of all classes lower than the j-th quartile class
$f_{Q_j}$ = Frequency of the j-th quartile class
W = Class width
The j-th quartile class is the class with the smallest cumulative frequency greater than or equal to $\frac{j \cdot n}{4}$.

3.5.4.2. Deciles:
Deciles are values dividing the data into ten equal parts, denoted by $D_1, D_2, \ldots, D_9$. The fifth decile is the median.

For ungrouped data
Let $D_j$ be the j-th decile value for j = 1, 2, ..., 9. Then

$$D_j = \left(\frac{j}{10}(n + 1)\right)^{th} \text{ item}; \quad j = 1, 2, \ldots, 9$$

For grouped data
Apply the following formula:

$$D_j = L_{D_j} + \frac{W}{f_{D_j}}\left(\frac{j \cdot n}{10} - F_{D_j}\right); \quad j = 1, 2, \ldots, 9$$

The symbols are defined in the same way as for quartiles. The j-th decile class is the class with the smallest cumulative frequency greater than or equal to $\frac{j \cdot n}{10}$.
3.5.4.3. Percentiles:
Percentiles are values which divide the data into one hundred equal parts, denoted by $P_1, P_2, \ldots, P_{99}$. The fiftieth percentile is the median.

For ungrouped data
Let $P_j$ be the j-th percentile value for j = 1, 2, ..., 99. Then

$$P_j = \left(\frac{j}{100}(n + 1)\right)^{th} \text{ item}; \quad j = 1, 2, \ldots, 99$$

For grouped data
Apply the following formula:

$$P_j = L_{P_j} + \frac{W}{f_{P_j}}\left(\frac{j \cdot n}{100} - F_{P_j}\right); \quad j = 1, 2, \ldots, 99$$

The symbols are defined in the same way as for quartiles. The j-th percentile class is the class with the smallest cumulative frequency greater than or equal to $\frac{j \cdot n}{100}$.
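
The grouped-data formulas for quartiles, deciles and percentiles differ only in the divisor (4, 10 or 100), so one function can cover all three. The sketch below is an illustration under the assumption of equal class widths; `grouped_quantile` and its `parts` parameter are hypothetical names. It is applied to the butter fat data of Example 3.11 rather than to any data set with a stated answer.

```python
def grouped_quantile(lower_boundaries, width, freqs, j, parts):
    """j-th quantile of a grouped distribution split into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = j * n / parts          # j*n/4, j*n/10 or j*n/100
    cumulative = 0                  # frequencies of all lower classes
    for lower, f in zip(lower_boundaries, freqs):
        # Quantile class: first class whose cumulative frequency reaches target.
        if f > 0 and cumulative + f >= target:
            return lower + (width / f) * (target - cumulative)
        cumulative += f
    raise ValueError("quantile class not found")


# Butter fat data (classes 40-44, 45-49, ..., 70-74).
lower_boundaries = [39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5]
cows = [7, 10, 22, 15, 12, 6, 3]

q1, q2, q3 = (grouped_quantile(lower_boundaries, 5, cows, j, 4) for j in (1, 2, 3))
d5 = grouped_quantile(lower_boundaries, 5, cows, 5, 10)
p50 = grouped_quantile(lower_boundaries, 5, cows, 50, 100)
print(f"Q1 = {q1:.2f}, Q2 = {q2:.2f}, Q3 = {q3:.2f}, D5 = {d5:.2f}, P50 = {p50:.2f}")
```

As the definitions require, the second quartile, the fifth decile and the fiftieth percentile all coincide with the grouped median.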

Interpretations
1. $Q_j$ is the value below which (j × 25) percent of the observations in the series are found (where j = 1, 2, 3). For instance, $Q_3$ is the value below which 75 percent of the observations in the given series are found.
2. $D_j$ is the value below which (j × 10) percent of the observations in the series are found (where j = 1, 2, ..., 9). For instance, $D_4$ is the value below which 40 percent of the values in the series are found.
3. $P_j$ is the value below which j percent of the total observations are found (where j = 1, 2, 3, ..., 99). For example, 73 percent of the observations in a given series are below $P_{73}$.
Exercise 3.2:
The following table presents the male population of a certain region in Ethiopia.
Find a) all quartiles
b) the 9th and 5th deciles, and
c) the 65th and 75th percentiles

Age groups (in years)   0-5    5-10   10-15  15-20  20-25  25-30  30-35  35-40
Male population         2580   3737   4620   5200   7250   620    297    355
