Numerical Measures To Describe Data
Chapter 2
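Measures of central tendency:
◼ Mean: arithmetic average, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
◼ Median: midpoint of ranked values
◼ Mode: most frequently observed value (if one exists)
Measures of variation:
◼ Variance
◼ Standard deviation
◼ Coefficient of variation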
Copyright © 2013 Pearson Education Ch. 2-6
Arithmetic Mean
◼ The arithmetic mean (mean) is the most
common measure of central tendency
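◼ For a population of N values:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
where the $x_i$ are the population values and N is the population size
◼ For a sample of size n:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
where the $x_i$ are the observed values and n is the sample size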
Arithmetic Mean
(continued)
Data 1, 2, 3, 4, 5: Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3
Data 1, 2, 3, 4, 10: Mean = (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4
[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]
◼ Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
[Figure: two dot plots. On a 0 to 14 axis, Mode = 9; on a 0 to 6 axis, No Mode]
Review Example
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
◼ Mean: $3,000,000/5 = $600,000
◼ Median: middle value of ranked data = $300,000
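The review example can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics` module; the variable names are illustrative, and the mode is included only because the overview of central tendency lists it.

```python
from statistics import mean, median, multimode

# House prices from the review example, in dollars.
prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

avg = mean(prices)          # sum of the values divided by their count
mid = median(prices)        # middle value of the ranked data
modes = multimode(prices)   # most frequent value(s); a list, since ties are possible

print(f"Mean:   ${avg:,.0f}")
print(f"Median: ${mid:,.0f}")
print(f"Mode:   {modes}")
```

The single $2,000,000 house pulls the mean well above the median, which is why the median is often preferred for data with extreme values.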
$$\text{Skewness} = \frac{1}{n} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{s^3}$$
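As a rough illustration, the skewness formula can be applied to the house price data from the review example. The sketch below assumes that s is the sample standard deviation (divisor n - 1); the function name is hypothetical.

```python
from math import sqrt

def sample_skewness(values):
    """Skewness as (1/n) * sum((x - mean)^3) / s^3, with s the sample standard deviation."""
    n = len(values)
    x_bar = sum(values) / n
    # Assumption: s uses the n - 1 divisor, as for the sample standard deviation.
    s = sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))
    return sum((x - x_bar) ** 3 for x in values) / (n * s ** 3)

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]
skew = sample_skewness(prices)
print(f"Skewness: {skew:.4f}")  # positive, so the data are skewed to the right
```

A positive result is consistent with the mean ($600,000) lying above the median ($300,000).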
◼ Geometric mean
◼ Used to measure the rate of change of a variable
over time
$$r_g = (x_1 \times x_2 \times \cdots \times x_n)^{1/n} - 1$$
Geometric mean rate of return:
$$r_g = [(1.50)(1.20)]^{1/2} - 1 = (1.8)^{1/2} - 1 = 1.3416 - 1 = 0.3416 = 34.16\%$$
(accurate result)
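The calculation above can be reproduced in Python. This is a minimal sketch: the function name is hypothetical, and the comparison with the arithmetic mean of the two yearly growth rates (50% and 20%) is added here for contrast rather than taken from the slide.

```python
from math import prod

def geometric_mean_rate(growth_factors):
    """Geometric mean rate of return: (x1 * x2 * ... * xn)^(1/n) - 1."""
    n = len(growth_factors)
    return prod(growth_factors) ** (1 / n) - 1

factors = [1.50, 1.20]  # growth factors for year 1 and year 2
r_g = geometric_mean_rate(factors)

# Arithmetic mean of the yearly rates, shown only for comparison.
arithmetic = sum(f - 1 for f in factors) / len(factors)

print(f"Geometric mean rate:  {r_g:.2%}")
print(f"Arithmetic mean rate: {arithmetic:.2%}")
```

Compounding the geometric rate for two years returns the total growth factor of 1.8, which the arithmetic rate overstates slightly.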
Example 2.4 Annual Growth Rate (Geometric Mean)
Find the annual growth rate if sales have grown 25% over 5
years.
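One way to work this example, assuming that "grown 25% over 5 years" means sales at the end of year 5 are 1.25 times their starting level, is to take the fifth root of the total growth factor:

```latex
r_g = (1.25)^{1/5} - 1 \approx 1.0456 - 1 = 0.0456 \approx 4.56\%
```

Under that assumption, sales grew by roughly 4.56% per year.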
[Figure: ranked data divided into four parts by Q1, Q2, and Q3]
◼ The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
◼ Q2 is the same as the median (50% are smaller, 50% are
larger)
◼ Only 25% of the observations are greater than the third
quartile
(n = 9)
Q1 is in the 0.25(9 + 1) = 2.5 position of the ranked data,
so use the value halfway between the 2nd and 3rd values:
Q1 = 12.5
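The position rule can be sketched in code. The ranked sample below is hypothetical, chosen only so that n = 9 and the 2nd and 3rd values are 12 and 13; the function name is also an assumption. It interpolates linearly between neighbouring values, which reduces to "halfway between" when the position ends in .5.

```python
def quartile_by_position(ranked, p):
    """Value at position p * (n + 1) in ranked data (positions start at 1)."""
    n = len(ranked)
    pos = p * (n + 1)
    lower = int(pos)       # whole part of the position
    frac = pos - lower     # fractional part, used for interpolation
    if lower < 1:
        return ranked[0]
    if lower >= n:
        return ranked[-1]
    return ranked[lower - 1] + frac * (ranked[lower] - ranked[lower - 1])

# Hypothetical ranked sample with n = 9.
ranked = [11, 12, 13, 16, 16, 17, 18, 21, 22]

q1 = quartile_by_position(ranked, 0.25)  # position 2.5
q2 = quartile_by_position(ranked, 0.50)  # position 5, the median
q3 = quartile_by_position(ranked, 0.75)  # position 7.5
print(q1, q2, q3)
```

Statistical packages use several slightly different quartile conventions, so results from other tools may not match this rule exactly.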
20 73 75 80 82
13 15 8 16 8
6 8 10 12 14 9 11 7 13 11
Variation
Same center,
different variation
Sample A: 1 2 1 36
Sample B: 8 9 10 13
Example:
[Figure: dot plot on a 0 to 14 axis] Range = 14 - 1 = 13
[Figure: two dot plots on a 7 to 12 axis] Range = 12 - 7 = 5 in both
◼ Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
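The two data sets above differ in a single value, which makes the effect easy to reproduce. A minimal sketch, with a hypothetical helper name:

```python
def value_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

# The 25 observations listed above: eleven 1s, eight 2s, four 3s, a 4 and a 5.
base = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]

# Same data with the largest value replaced by the outlier 120.
with_outlier = base[:-1] + [120]

print(value_range(base))          # 4
print(value_range(with_outlier))  # 119
```

Changing one observation out of 25 moves the range from 4 to 119, even though the other 24 values are untouched.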
IQR = Q3 - Q1
Example:
Minimum   Q1   Median (Q2)   Q3   Maximum
12        30   45            57   70
Each of the four intervals between these values contains 25% of the observations.
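Applying IQR = Q3 - Q1 to the five values in this example is a one-line calculation. The sketch below stores them in a dictionary whose keys are illustrative, and computes the full range alongside for comparison.

```python
# Five-number summary from the example above.
five_number = {"minimum": 12, "Q1": 30, "median": 45, "Q3": 57, "maximum": 70}

iqr = five_number["Q3"] - five_number["Q1"]                  # spread of the middle 50%
full_range = five_number["maximum"] - five_number["minimum"]

print(f"IQR   = {iqr}")
print(f"Range = {full_range}")
```

Because the IQR ignores the lowest and highest quarters of the data, it is not affected by extreme values the way the range is.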
Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
Where μ = population mean
N = population size
$x_i$ = ith value of the variable x
Sample Variance
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where $\bar{x}$ = arithmetic mean
n = sample size
$x_i$ = ith value of the variable x
Population Standard Deviation
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
Sample Standard Deviation
◼ Sample standard deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
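The variance and standard deviation formulas differ only in the divisor (N for a population, n - 1 for a sample). The sketch below applies them to Sample A and Sample B listed earlier, which share the same mean; the function name and its `sample` flag are illustrative.

```python
from math import sqrt

def variance(values, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 (sample) or n (population)."""
    n = len(values)
    center = sum(values) / n
    squared_deviations = sum((x - center) ** 2 for x in values)
    return squared_deviations / (n - 1) if sample else squared_deviations / n

sample_a = [1, 2, 1, 36]
sample_b = [8, 9, 10, 13]

for name, data in (("Sample A", sample_a), ("Sample B", sample_b)):
    s2 = variance(data)
    print(f"{name}: mean = {sum(data) / len(data)}, "
          f"s^2 = {s2:.3f}, s = {sqrt(s2):.3f}")
```

Both samples have a mean of 10, but Sample A's standard deviation is far larger because of the single value 36.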
[Figure: three dot plots on an 11 to 21 axis]
Data A: s = 3.338 (compare to the two cases below)
Data B: s = 0.926 (values are concentrated near the mean)
Data C: s = 4.570 (values are dispersed far from the mean)
◼ Enter input
range details
◼ Click OK
Excel output
Microsoft Excel
descriptive statistics output,
using the house price data:
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
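◼ Stock A:
◼ Average price last year = $50
◼ Standard deviation = $5
$$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
◼ Stock B:
◼ Average price last year = $100
◼ Standard deviation = $5
$$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its price.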
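The two-stock comparison can be written as a small function. This is a sketch with a hypothetical function name; it expresses the standard deviation as a percentage of the mean.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (s / x-bar) * 100%, returned as a percentage."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A: s = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: s = $5, mean = $100

print(f"CV_A = {cv_a:.0f}%")
print(f"CV_B = {cv_b:.0f}%")
```

Because the CV is unit free, it can also be used to compare variables measured in different units.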
Basic Exercises
6 8 7 10 3 5 9 8
10 8 11 7 9
23 35 14 37 38 15 45
12 40 27 13 18 19 23
37 20 29 49 40 65 53
18 17 23 27 29 31 42
35 38 22 20 15 17 21
At least                        within
(1 - 1/1.5^2) = 55.6%           k = 1.5 (μ ± 1.5σ)
(1 - 1/2^2) = 75%               k = 2 (μ ± 2σ)
(1 - 1/3^2) = 89%               k = 3 (μ ± 3σ)
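The percentages in this table all come from the bound 1 - 1/k^2. A minimal sketch that regenerates them, with a hypothetical function name and a guard for k ≤ 1, where the bound gives no useful information:

```python
def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    share = chebyshev_lower_bound(k)
    print(f"k = {k}: at least {share:.1%} of values lie within mu ± {k} sigma")
```

These are lower bounds that hold for any distribution, so real data sets usually have a larger share of values inside each interval.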
[Figure: distribution curve with μ ± 1σ containing about 68% of the values]
The Empirical Rule
(continued)
◼ μ ± 2σ contains about 95% of the values in
the population or the sample
◼ μ ± 3σ contains almost all (about 99.7%) of
the values in the population or the sample
[Figure: distribution curves with 95% of the values within μ ± 2σ and 99.7% within μ ± 3σ]
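The empirical rule percentages correspond to the areas under a normal curve. The sketch below assumes an exactly normal distribution, which real mounded data only approximate, and uses the standard identity P(|Z| < k) = erf(k/√2); the function name is hypothetical.

```python
from math import erf, sqrt

def normal_share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"mu ± {k} sigma: {normal_share_within(k):.1%}")
```

For data that are not mounded and roughly symmetric, these percentages should not be relied on; Chebyshev's bound is the safer statement.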
$$z = \frac{x_i - \mu}{\sigma}$$
$$z = \frac{x_i - \mu}{\sigma} = \frac{121 - 100}{15} = 1.4$$
A score of 121 is 1.4 standard
deviations above the mean.
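The same calculation as a short function, using the values from the example (score 121, mean 100, standard deviation 15). The function name is illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z = z_score(121, 100, 15)
print(f"z = {z:.1f}")
```

A negative result would indicate a value below the mean; for instance, a score of 79 with the same mean and standard deviation gives z = -1.4.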
b. If the data are mounded, use the empirical rule to find the
approximate percent of observations between 65 and 85.
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{n} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{n}$$
◼ Where $w_i$ is the weight of the ith observation
and $n = \sum w_i$
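A small worked case may help. The data below are hypothetical (grade points weighted by credit hours), and the function name is an assumption; the divisor is the sum of the weights, as in the definition above.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical data: grade points earned in three courses and their credit hours.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 1]

result = weighted_mean(grade_points, credit_hours)
print(result)  # (3*4.0 + 4*3.0 + 1*2.0) / 8
```

With equal weights the weighted mean reduces to the ordinary arithmetic mean.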
$$\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n} \quad \text{where} \quad n = \sum_{i=1}^{K} f_i$$
$$s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{x})^2}{n - 1}$$
Class    m_i   f_i   f_i m_i   (m_i − x̄)   (m_i − x̄)^2   f_i (m_i − x̄)^2
0-4       2     5      10      -10.625     112.8906      564.4531
5-9       7     8      56       -5.625      31.64063     253.125
10-14    12    11     132       -0.625       0.390625      4.296875
15-19    17     9     153        4.375      19.14063     172.2656
20-24    22     7     154        9.375      87.89063     615.2344
Total          40     505                               1609.375
$$s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{x})^2}{n - 1} = \frac{1609.375}{39} = 41.266$$
$$s = \sqrt{s^2} = \sqrt{41.266} = 6.424$$
$\bar{x} = 12.625$
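The grouped-data table can be reproduced from the class midpoints and frequencies alone. This is a minimal sketch; the variable names are illustrative and the data are the five classes shown in the table.

```python
from math import sqrt

# Class midpoints m_i and frequencies f_i from the table (classes 0-4 through 20-24).
midpoints = [2, 7, 12, 17, 22]
freqs = [5, 8, 11, 9, 7]

n = sum(freqs)                                           # total number of observations
x_bar = sum(f * m for f, m in zip(freqs, midpoints)) / n
s2 = sum(f * (m - x_bar) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
s = sqrt(s2)

print(f"n = {n}, mean = {x_bar}, s^2 = {s2:.3f}, s = {s:.3f}")
```

These are approximations to the true mean and variance, since every observation in a class is treated as if it sat at the class midpoint.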
2.4
Measures of Relationships
Between Variables
◼ Covariance
◼ a measure of the direction of a linear relationship
between two variables
◼ Correlation Coefficient
◼ a measure of both the direction and the strength of a
linear relationship between two variables
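◼ The population covariance:
$$\text{Cov}(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
◼ The sample covariance:
$$\text{Cov}(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$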
◼ Unit free
◼ Ranges between –1 and 1
◼ The closer to –1, the stronger the negative linear
relationship
◼ The closer to 1, the stronger the positive linear
relationship
◼ The closer to 0, the weaker any linear
relationship
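To make these properties concrete, the sketch below computes the sample covariance and then the sample correlation coefficient using the standard definition r = s_xy / (s_x s_y), which is not spelled out in the text above. The data and function names are hypothetical.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """Sum of (x_i - x_bar)(y_i - y_bar), divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """r = s_xy / (s_x * s_y); the covariance of a variable with itself is its variance."""
    s_x = sqrt(sample_covariance(xs, xs))
    s_y = sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_xy = sample_covariance(x, y)
r = sample_correlation(x, y)
print(f"s_xy = {s_xy}, r = {r:.4f}")
```

Rescaling x or y changes the covariance but leaves r unchanged, which is what "unit free" means in practice.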
[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +1, r = +.3, and r = 0]
Figure 2.5 Retail Sales by Quarter
(11, 52) (13, 72) (14, 62) (15, 82) (17, 92) (13, 62) (15, 72)
[Figure: scatter plot of Test #2 Score against Test #1 Score]
◼ There is a relatively ... relationship between test score #1 and test score #2