Math 106 Lecture 7
Measures of Central Tendency
(summarizing data with a single number)
Mean, Median, Mode,
Grouped Data
Intro to Dispersion: Quartiles
© m j winter, ss2003
Mean, Median, Mode
Data points: $x_1, x_2, \ldots, x_n$
Mean: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n} = \mu$
Median: list the data in order, $x_1 \le x_2 \le \cdots \le x_n$.
If n is odd, the median is the middle point.
If n is even, the median is the average of the two middle points.
Mode: the value of x that occurs most often. There may be more than one mode, or none.
Example: Starting Salaries
Data Set - Starting Salaries of Basket-weaving Majors:
$27,000, $27,000, $49,500, $37,300, $487,000,
$15,000, $32,000, $37,500, $41,300
What was the mean (average) starting salary?
What was the median starting salary?
What was the mode?
Example - 2
Data Set - Starting Salaries of Basket-weaving Majors:
$27,000, $27,000, $49,500, $37,300, $487,000,
$15,000, $32,000, $37,500, $41,300
Mean salary? (Add the salaries and divide by 9.) $83,733.33
Median salary? (List in order, take the middle value.) In thousands of dollars:
15.0 27.0 27.0 32.0 37.3 37.5 41.3 49.5 487.0
The middle (fifth) value is 37.3, so the median is $37,300.
Mode? $27,000
How useful are these numbers?
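The three answers can be checked with a short sketch using Python's standard `statistics` module. The last two lines are an addition for illustration, not part of the slide: they drop the single largest salary to show how strongly that one value pulls the mean compared with the median.

```python
from statistics import mean, median, mode

# Starting salaries from the data set above, in dollars.
salaries = [27_000, 27_000, 49_500, 37_300, 487_000,
            15_000, 32_000, 37_500, 41_300]

print(f"mean   = {mean(salaries):,.2f}")
print(f"median = {median(salaries):,}")
print(f"mode   = {mode(salaries):,}")

# Illustration only: remove the largest salary and recompute.
without_top = sorted(salaries)[:-1]
print(f"mean without the largest salary   = {mean(without_top):,.2f}")
print(f"median without the largest salary = {median(without_top):,.1f}")
```

Removing one salary out of nine moves the mean by tens of thousands of dollars, while the median shifts only slightly.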
Measure of Central Tendency
Advantages, Disadvantages
Mean: can be influenced by outliers; useful mathematically; most useful when data is ‘continuous’.
Median: also a central number, and often more meaningful.
However, it is possible that no data point (or very few) lies anywhere near the mean or median.
Mode: useful when data is discrete, such as the number of cars in a family.
Questions
Given $x_1 < x_2 < \cdots < x_{10}$, calculate the mean
$\mu = \frac{x_1 + x_2 + \cdots + x_{10}}{10}$
and the median
$m = \frac{x_5 + x_6}{2}$
Now increase the largest number by 20. What is the new
mean? The new median?
New mean = $\frac{x_1 + x_2 + \cdots + (x_{10} + 20)}{10} = \mu + \frac{20}{10} = \mu + 2$
The median does not change: it is still $\frac{x_5 + x_6}{2}$.
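A quick numerical check of this result, as a sketch: the ten numbers below are hypothetical (any strictly increasing list of ten values would do), and only the largest one is raised by 20.

```python
from statistics import mean, median

# Hypothetical data: ten strictly increasing numbers.
data = [3, 7, 8, 12, 15, 18, 21, 26, 30, 41]

# Increase only the largest value by 20.
shifted = data[:-1] + [data[-1] + 20]

print("mean:  ", mean(data), "->", mean(shifted))
print("median:", median(data), "->", median(shifted))
```

The mean moves up by 20/10 = 2, while the median still depends only on the fifth and sixth values.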
Detour – weighted averages
Calculate the average of: 3.2, 3.2, 3.2, 4.0, 2.5, 2.5
$\frac{3.2 + 3.2 + 3.2 + 4.0 + 2.5 + 2.5}{6} = \frac{3(3.2) + 1(4.0) + 2(2.5)}{6}$
$= \frac{3}{6}(3.2) + \frac{1}{6}(4.0) + \frac{2}{6}(2.5) = 3.10$
This is a weighted average.
The numbers $\frac{3}{6}$, $\frac{1}{6}$, $\frac{2}{6}$ are called the weights.
Note that the sum of the weights is 1.
Another example of Weighted Averages
A student’s test average is 3.1 and the grade on the final
exam is 2.8. If the exam is to count as 1/4 of the final
average, how is this average computed?
The weights are 3/4 and 1/4.
$\frac{3}{4}(3.1) + \frac{1}{4}(2.8) = \frac{3(3.1) + 1(2.8)}{4} = 3.025$
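Both weighted-average examples follow the same pattern: multiply each value by its weight and add. A minimal sketch, where the helper name `weighted_average` is made up for illustration and the weights are required to sum to 1:

```python
def weighted_average(values, weights):
    """Return the sum of weight * value; the weights must sum to 1."""
    if abs(sum(weights) - 1) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * v for v, w in zip(values, weights))

# Values 3.2 (three times), 4.0 (once), 2.5 (twice): weights are counts / 6.
grades = weighted_average([3.2, 4.0, 2.5], [3/6, 1/6, 2/6])

# Test average 3.1 with weight 3/4, final exam 2.8 with weight 1/4.
final = weighted_average([3.1, 2.8], [3/4, 1/4])

print(round(grades, 3), round(final, 3))
```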
Mean or Average of Grouped Data
Set of 17 integers between 2 and 9 (inclusive)
Interval   Freq
[2, 3]     7
[4, 5]     3
[6, 7]     3
[8, 9]     4
[Figure: frequency histogram of the grouped data, bar heights 7, 3, 3, 4, horizontal axis from 2 to 9]
With data in a group, use the midpoint value.
Mean or Average of Grouped Data - 2
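Set of 17 numbers between 2 and 9 (inclusive). Use the mid-interval value.
[Figure: frequency histogram of the grouped data, bar heights 7, 3, 3, 4, horizontal axis from 2 to 9]
$\bar{x} = \frac{7 \cdot 2.5 + 3 \cdot 4.5 + 3 \cdot 6.5 + 4 \cdot 8.5}{17} = 4.9705\ldots$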
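The grouped-data mean is a weighted average of the interval midpoints, with the frequencies as weights. A small sketch using the four intervals and frequencies from the slide (the variable names are illustrative):

```python
# (lower, upper, frequency) for each interval.
bins = [(2, 3, 7), (4, 5, 3), (6, 7, 3), (8, 9, 4)]

total = sum(freq for _, _, freq in bins)

# Each interval contributes frequency * midpoint.
grouped_mean = sum(freq * (lo + hi) / 2 for lo, hi, freq in bins) / total

print(total, round(grouped_mean, 4))
```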
Calculating the mean from a relative
frequency (density) histogram
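Interval   Midpoint   Freq   Relative freq
[2, 3]     2.5        7      .412
[4, 5]     4.5        3      .176
[6, 7]     6.5        3      .176
[8, 9]     8.5        4      .235
[Figure: relative frequency histogram, bars labeled .412, .176, .176, .235]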
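.412 * 2.5 + .176 * 4.5 + .176 * 6.5 + .235 * 8.5 ≈ 4.964
(The small difference from 4.9705 comes from rounding the relative frequencies to three decimal places.)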
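As a sketch of why the two calculations agree, the code below computes the mean once with exact relative frequencies (count / 17, kept as fractions) and once with the three-decimal values printed on the histogram. The variable names are illustrative.

```python
from fractions import Fraction

midpoints = [Fraction(5, 2), Fraction(9, 2), Fraction(13, 2), Fraction(17, 2)]
counts = [7, 3, 3, 4]
n = sum(counts)

# Exact relative frequencies: count / n.
rel_freq = [Fraction(c, n) for c in counts]
exact_mean = sum(f * m for f, m in zip(rel_freq, midpoints))

# Relative frequencies rounded to three decimals, as on the histogram.
rounded = [round(float(f), 3) for f in rel_freq]
approx_mean = sum(f * float(m) for f, m in zip(rounded, midpoints))

print(exact_mean, float(exact_mean))
print(rounded, round(approx_mean, 4))
```

With exact relative frequencies the result is identical to the grouped-data mean; only the rounding of the weights changes the last digits.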
Here’s the original data
frequencies for noname.fma (column 1)
Value   Count   Percent
2       4       23.53%
3       3       17.65%
4       2       11.76%
5       1       5.88%
6       3       17.65%
7       0
8       3       17.65%
9       1       5.88%
[Figure: frequency histogram of the original data, one bar per integer value from 2 to 9]
mean value: 4.76
The wider the bins, the more information you lose.
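To see the effect of bin width, the sketch below rebuilds the 17 data points from the frequency table and recomputes the mean after replacing each value by the midpoint of its bin. The helper `binned_mean` and the extra widths 4 and 8 are assumptions added for illustration; only widths 1 and 2 appear on the slides.

```python
# Counts for each integer value 2..9, from the frequency table.
counts = {2: 4, 3: 3, 4: 2, 5: 1, 6: 3, 7: 0, 8: 3, 9: 1}
data = [value for value, c in counts.items() for _ in range(c)]

def binned_mean(values, start, width):
    """Mean after replacing each value by the midpoint of its integer bin.

    Bins are [start, start + width - 1], [start + width, ...], and so on.
    """
    total = 0.0
    for v in values:
        lo = start + ((v - start) // width) * width
        total += lo + (width - 1) / 2
    return total / len(values)

for width in (1, 2, 4, 8):
    print(width, round(binned_mean(data, 2, width), 4))
```

Width 1 reproduces the mean of the original data and width 2 reproduces the grouped-data mean; with a single bin covering 2 through 9, the estimate is just the midpoint 5.5.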
The wider the bins, the more information you lose.
The next slide shows four histograms formed from the same data. The means are listed in the center.
Grouping will change the mean!
The histograms come from http://www.shodor.org/interactivate/activities/histogram/index.html
[Figure: four histograms formed from the same data; the means listed in the center are 139.84, 147.2, 112.24, and 206.00]
Elevator-Simulation Examples
Number of times passengers got off at different floors (3 passengers, 6 floors)
• List 1: (10 trials)
6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5
• List 2: (100 trials)
56, 52, 49, 56, 50, 57, 61, 63, 56, 52, 55, 58, 49, 64,
51, 51
• List 3: (400 trials)
213, 231, 221, 215
Sorted Lists
List 1
3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
Mean = Median = Mode =
List 2
49 49 50 51 51 52 52 55 56 56 56 57 58 61 63 64
Mean = Median = Mode =
List 3
213, 215, 221, 231
Mean = Median = Mode =
Sorted Lists
List 1
3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
Mean = 6.33 Median = 6 Mode = 6
List 2
49 49 50 51 51 52 52 55 56 56 56 57 58 61 63 64
Mean = 55 Median = 55.5 Mode = 56
List 3
213, 215, 221, 231
Mean = 220 Median = 218 No Mode
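These answers can be reproduced with a short sketch. The helper `modes` is written for illustration and assumes the convention used on the slide: if no value repeats, the list has no mode, and if several values tie for most frequent, all of them are returned.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return all most-frequent values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

lists = {
    "List 1": [6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5],
    "List 2": [56, 52, 49, 56, 50, 57, 61, 63, 56, 52, 55, 58, 49, 64, 51, 51],
    "List 3": [213, 231, 221, 215],
}

for name, values in lists.items():
    print(name, round(mean(values), 2), median(values), modes(values))
```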
Quartiles - use with median
3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
Median is midpoint - the number of elements below the
median equals the number above it.
First quartile: Take the median of the lower half.
Third quartile: Take the median of the upper half.
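3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
First quartile = 5 and third quartile = 8 (the medians of the seven values below and the seven values above the median).
Interquartile range: 8 – 5 = 3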
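The median-of-halves rule can be sketched as follows. One assumption is needed that the slide does not state outright: when the number of data points is odd, the middle value is left out of both halves. That choice reproduces the quartiles 5 and 8 for this list; statistical software often uses other quartile conventions and may give slightly different values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule.

    Assumption: for an odd count, the middle value is excluded
    from both the lower and the upper half.
    """
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half + 1:] if len(s) % 2 else s[half:]
    return median(lower), median(s), median(upper)

data = [3, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 8, 8, 9, 9]
q1, med, q3 = quartiles(data)
print(q1, med, q3, "IQR =", q3 - q1)
```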
Interquartile Range, Box Plot, 5-number summary
Roadhog: someone who takes his half of
the road out of the middle
The interquartile range is the width
(range) of the middle half of your data.
5 number summary: {3,5,6,8,9}
min   first quartile   median   third quartile   max
3     5                6        8                9
[Figure: box plot of the data with the box from 5 to 8, labeled 25.0% and 25.0%]
Estimating the mean from the 5-number
summary
5 number summary: {3,5,6,8,9}
interval weight midpoint
[3,5] 1/4 4
[5,6] 1/4 5.5
[6,8] 1/4 7
[8,9] 1/4 8.5
$\frac{1}{4}(4) + \frac{1}{4}(5.5) + \frac{1}{4}(7) + \frac{1}{4}(8.5) = \frac{4 + 5.5 + 7 + 8.5}{4} = \frac{25}{4} = 6.25$
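The same estimate as a sketch in code: each of the four intervals between consecutive summary values is treated as holding one quarter of the data, and its midpoint stands in for the values inside it. The function name is hypothetical.

```python
def mean_from_five_numbers(summary):
    """Estimate the mean as the average of the four interval midpoints."""
    if len(summary) != 5:
        raise ValueError("expected min, Q1, median, Q3, max")
    # Pair each summary value with the next one to form the four intervals.
    midpoints = [(a + b) / 2 for a, b in zip(summary, summary[1:])]
    return sum(midpoints) / 4

estimate = mean_from_five_numbers([3, 5, 6, 8, 9])
print(estimate)
```

The estimate is close to, but not equal to, the mean of 6.33 computed from the full list, since the summary discards where the values fall inside each interval.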
Commonly reported statistical results
List 1: (10 trials)
6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5
[Figure: frequency histogram of List 1; vertical axis Freq from 0 to 6, horizontal axis from 3 to 10]