Lecture (4) Chapter (3) Part 2

Chapter 3 discusses statistical techniques for describing data, focusing on population variance, standard deviation, measures of dispersion, and the coefficient of skewness. It provides examples for calculating these statistics using frequency tables and explains the significance of the coefficient of variation and box plots in identifying data distribution shapes and outliers. The chapter concludes with a class work example that illustrates the calculation of various statistical measures from sample data.

Uploaded by

kerbrosea
Copyright
© © All Rights Reserved

Chapter 3

Describing Data through Statistics
"Numerical descriptive techniques"
Part (2)
3) Population Variance and Standard Deviation:
Find the variance and standard deviation for the following population data:
35 45 30 35 40 25

Sample Variance and Standard Deviation:
35 45 30 35 40 25
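The slides do not show the worked answer for these six values. As a rough check, the sketch below computes both versions with Python's standard `statistics` module. The variable names are illustrative, and treating the same six values once as a population and once as a sample is an assumption made only for comparison.

```python
import statistics

data = [35, 45, 30, 35, 40, 25]

# Population formulas divide the sum of squared deviations by N.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample formulas divide the same sum by n - 1.
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

print(f"population: variance = {pop_var:.2f}, sd = {pop_sd:.2f}")
print(f"sample:     variance = {samp_var:.2f}, sd = {samp_sd:.2f}")
```

The mean is 35 and the squared deviations sum to 250, so the population variance is 250 / 6 ≈ 41.67 and the sample variance is 250 / 5 = 50.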
b) Measures of dispersion (spread, variability) for quantitative grouped data:
1) Range:

Use the class midpoints.

Range of grouped data = largest midpoint – smallest midpoint


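2) Population Variance and Standard Deviation:

variance: $$\sigma^2 = \frac{\sum f M^2 - \frac{(\sum f M)^2}{N}}{N}$$

standard deviation: $$\sigma = \sqrt{\sigma^2}$$

Here f is the class frequency, M is the class midpoint and N = Σf is the population size.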

Sample Variance and Standard Deviation:

variance: $$S^2 = \frac{\sum f M^2 - \frac{(\sum f M)^2}{n}}{n - 1}$$

standard deviation: $$S = \sqrt{S^2}$$
Example
Using the following frequency table for population data, calculate the range, variance and standard deviation.

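Solution

| Class interval | f | Midpoint (M) | fM | fM² |
|---|---|---|---|---|
| 1 up to 3 | 4 | 2 | 8 | 16 |
| 3 up to 5 | 12 | 4 | 48 | 192 |
| 5 up to 7 | 13 | 6 | 78 | 468 |
| 7 up to 9 | 19 | 8 | 152 | 1216 |
| 9 up to 11 | 7 | 10 | 70 | 700 |
| 11 up to 13 | 5 | 12 | 60 | 720 |
| Total | 60 | | 416 | 3312 |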

Range = 12 – 2 = 10

variance: $$\sigma^2 = \frac{\sum f M^2 - \frac{(\sum f M)^2}{N}}{N} = \frac{3312 - \frac{(416)^2}{60}}{60} = 7.13$$

standard deviation: $$\sigma = \sqrt{\sigma^2} = \sqrt{7.13} = 2.67$$
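The grouped-data calculation above can be reproduced with a short Python sketch. The function name `grouped_population_stats` and the tuple layout (lower limit, upper limit, frequency) are illustrative choices, not something the lecture defines.

```python
import math

# (lower limit, upper limit, frequency) for each class of the example table
classes = [(1, 3, 4), (3, 5, 12), (5, 7, 13), (7, 9, 19), (9, 11, 7), (11, 13, 5)]


def grouped_population_stats(classes):
    """Return (range, variance, standard deviation) based on class midpoints."""
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    n = sum(freqs)                                              # N = sum of f
    sum_fm = sum(f * m for f, m in zip(freqs, midpoints))       # sum of f*M
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, midpoints))  # sum of f*M^2
    variance = (sum_fm2 - sum_fm ** 2 / n) / n                  # population form
    return max(midpoints) - min(midpoints), variance, math.sqrt(variance)


data_range, variance, sd = grouped_population_stats(classes)
print(data_range, round(variance, 2), round(sd, 2))
```

For sample data, the only change would be dividing by `n - 1` instead of `n` in the variance line.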


Measure of shape:
Coefficient of skewness

It is used to describe the shape of a distribution of data.

for population data: $$SK = \frac{3(\mu - \text{median})}{\sigma}$$

for sample data: $$SK = \frac{3(\bar{x} - \text{median})}{s}$$

- If SK is negative, the data are negatively skewed (skewed left, long left tail).

- If SK is positive, the data are positively skewed (skewed right, long right tail).

- If SK is zero, the distribution is symmetric (for example, the bell-shaped normal distribution).
Example

A sample has a mean of 7, a median of 8 and a standard deviation of 3.32. Calculate the coefficient of skewness and comment on the shape of the distribution.

Solution

$$SK = \frac{3(\bar{x} - \text{median})}{s} = \frac{3(7 - 8)}{3.32} = -0.90$$

SK is negative, so the distribution is negatively skewed.
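The same calculation, together with the sign rule for SK, can be written as a small Python sketch. The function names `pearson_skewness` and `describe_shape` are made up for illustration.

```python
def pearson_skewness(mean, median, sd):
    """Coefficient of skewness SK = 3 * (mean - median) / sd."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / sd


def describe_shape(sk):
    """Translate the sign of SK into the shape described in the lecture."""
    if sk < 0:
        return "negatively skewed (long left tail)"
    if sk > 0:
        return "positively skewed (long right tail)"
    return "symmetric"


sk = pearson_skewness(mean=7, median=8, sd=3.32)
print(round(sk, 2), describe_shape(sk))
```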


Coefficient of variation:

- The coefficient of variation (CV) is the ratio of the standard deviation to the mean and shows the extent of variability in relation to the mean of the population or sample.
- The higher the CV, the greater the dispersion relative to the mean.

CV = Standard Deviation / Mean, expressed as a percentage:

for a population: $$CV = \frac{\sigma}{\mu} \times 100$$

for a sample: $$CV = \frac{s}{\bar{x}} \times 100$$
Example
The mean of a math test is 89 with a standard deviation of 12. The mean of a language test is 68 with a standard deviation of 10. Which test has more variability?
Solution

for the math test: $$CV = \frac{12}{89} \times 100 = 13.48\%$$

for the language test: $$CV = \frac{10}{68} \times 100 = 14.71\%$$

The language test has the higher CV, so its scores are more variable relative to the mean.
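A short Python sketch of the comparison above. The helper name `coefficient_of_variation` is illustrative, and the guard against a zero mean is an added assumption, since the ratio is undefined in that case.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage: (standard deviation / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


cv_math = coefficient_of_variation(sd=12, mean=89)
cv_language = coefficient_of_variation(sd=10, mean=68)

# The test with the larger CV has more variability relative to its mean.
more_variable = "language" if cv_language > cv_math else "math"
print(f"math: {cv_math:.2f}%  language: {cv_language:.2f}%  higher: {more_variable}")
```

Note that the math test has the larger standard deviation in absolute terms (12 versus 10). The CV reverses the ranking because it scales each standard deviation by its own mean.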


Box and Whisker plots and five number summary:
A box and whisker plot (also called a box plot) displays the five-number summary of a set of data. The five-number summary is the minimum, first quartile, median, third quartile, and maximum. In a box plot, we draw a box from the first quartile to the third quartile. A vertical line goes through the box at the median.

The fences are computed from the quartiles and the interquartile range (IQR = Q3 – Q1):

Lower outer fence = Q1 – 3 IQR
Lower inner fence = Q1 – 1.5 IQR
Upper inner fence = Q3 + 1.5 IQR
Upper outer fence = Q3 + 3 IQR

Values beyond the outer fences are extremes.
We draw the box plot to identify:

1. outliers and extreme values.

2. the shape of the data.

Solution steps

1. Calculate Q1, Q2, Q3.

2. IQR = Q3 – Q1

3. Upper inner fence = Q3 + 1.5 IQR (values above it are outliers), Upper outer fence = Q3 + 3 IQR (values above it are extremes)

4. Lower inner fence = Q1 – 1.5 IQR (values below it are outliers), Lower outer fence = Q1 – 3 IQR (values below it are extremes)
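Steps 2 to 4 above can be sketched in Python as follows. This is a minimal illustration, not part of the lecture: the function names `fences` and `classify` and the quartile values Q1 = 20, Q3 = 40 are made up for the example, and the quartiles are taken as given because the lecture does not say which quartile method to use.

```python
def fences(q1, q3):
    """Return the four fences computed from Q1, Q3 and IQR = Q3 - Q1."""
    iqr = q3 - q1
    return {
        "lower_outer": q1 - 3 * iqr,
        "lower_inner": q1 - 1.5 * iqr,
        "upper_inner": q3 + 1.5 * iqr,
        "upper_outer": q3 + 3 * iqr,
    }


def classify(value, q1, q3):
    """Label a value as 'extreme', 'outlier' or 'inside' relative to the fences."""
    f = fences(q1, q3)
    if value < f["lower_outer"] or value > f["upper_outer"]:
        return "extreme"
    if value < f["lower_inner"] or value > f["upper_inner"]:
        return "outlier"
    return "inside"


# Hypothetical quartiles: IQR = 20, inner fences at -10 and 70, outer at -40 and 100.
for value in (35, 75, 120):
    print(value, classify(value, q1=20, q3=40))
```

With these hypothetical quartiles, 35 lies inside the inner fences, 75 is an outlier, and 120 is an extreme.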

Example

Draw the box plot for data with the following summary:

minimum = 10, maximum = 90, Q1 = 23, Q3 = 64.25, median = 55

Comment on the shape of the data and on outliers.


Solution

1. IQR = Q3 – Q1 = 64.25 – 23 = 41.25

2. Upper inner fence = Q3 + 1.5 IQR = 64.25 + 1.5 * 41.25 = 126.125

3. Lower inner fence = Q1 – 1.5 IQR = 23 – 1.5 * 41.25 = -38.875

[Box plot: lower inner fence = -38.875, minimum = 10, Q1 = 23, median = 55, Q3 = 64.25, maximum = 90, upper inner fence = 126.125]
Shape of distribution

• (median – Q1) > (Q3 – median): the shape is negatively skewed.

• (median – Q1) < (Q3 – median): the shape is positively skewed.

• (median – Q1) = (Q3 – median): the shape is symmetric.

For our example, Q1 = 23, Q3 = 64.25 and median = 55:

(55 – 23) = 32 > (64.25 – 55) = 9.25, so the shape is negatively skewed.
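The three rules above compare the two halves of the box. A minimal Python sketch, with the hypothetical function name `shape_from_quartiles`, applied to the example values:

```python
def shape_from_quartiles(q1, median, q3):
    """Compare the two halves of the box to judge the shape of the data."""
    left = median - q1    # width of the box below the median
    right = q3 - median   # width of the box above the median
    if left > right:
        return "negatively skewed"
    if left < right:
        return "positively skewed"
    return "symmetric"


print(shape_from_quartiles(q1=23, median=55, q3=64.25))
```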

Outliers

• Any number greater than the upper inner fence is an outlier.

• Any number less than the lower inner fence is an outlier.

For our example, minimum = 10 > lower inner fence = -38.875 and maximum = 90 < upper inner fence = 126.125, so there are no outliers.


Chapter 3
Class Work (3)
Example
Consider the following sample data. Find the mean, median, mode, range, variance, standard deviation, coefficient of skewness (with a comment on the shape of the data) and coefficient of variation.

| Class interval | f |
|---|---|
| 10 up to 15 | 6 |
| 15 up to 20 | 22 |
| 20 up to 25 | 35 |
| 25 up to 30 | 29 |
| 30 up to 35 | 16 |
| 35 up to 40 | 8 |
| 40 up to 45 | 4 |
| 45 up to 50 | 2 |
| Total | 122 |
Solution

| Class interval | f | Midpoint (M) | c.f | fM | fM² |
|---|---|---|---|---|---|
| 10 up to 15 | 6 | 12.5 | 6 | 75 | 937.5 |
| 15 up to 20 | 22 | 17.5 | 28 | 385 | 6737.5 |
| 20 up to 25 | 35 | 22.5 | 63 | 787.5 | 17718.75 |
| 25 up to 30 | 29 | 27.5 | 92 | 797.5 | 21931.25 |
| 30 up to 35 | 16 | 32.5 | 108 | 520 | 16900 |
| 35 up to 40 | 8 | 37.5 | 116 | 300 | 11250 |
| 40 up to 45 | 4 | 42.5 | 120 | 170 | 7225 |
| 45 up to 50 | 2 | 47.5 | 122 | 95 | 4512.5 |
| Total | 122 | | | 3130 | 87212.5 |


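$$\text{Mean} = \frac{\sum fM}{\sum f} = \frac{\sum fM}{n} = \frac{3130}{122} = 25.66$$

$$\text{Median} = L + \left[ \frac{\frac{\sum f}{2} - c.f}{f_{\text{median}}} \right] \times w = 20 + \left[ \frac{61 - 28}{35} \right] \times 5 = 24.71$$

Here L is the lower limit of the median class (20 up to 25), c.f is the cumulative frequency before that class, f_median is its frequency and w is the class width.

Mode = 22.5 (the midpoint of the class with the highest frequency, 20 up to 25)

Mode < Median < Mean, so the data are positively skewed (skewed right).

Range = 47.5 – 12.5 = 35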

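variance: $$S^2 = \frac{\sum f M^2 - \frac{(\sum f M)^2}{n}}{n - 1} = \frac{87212.5 - \frac{(3130)^2}{122}}{122 - 1} = 57.11$$

standard deviation: $$S = \sqrt{S^2} = \sqrt{57.11} = 7.56$$

$$SK = \frac{3(\bar{x} - \text{median})}{s} = \frac{3(25.66 - 24.71)}{7.56} = 0.38$$

SK is positive, so the distribution is positively skewed.

$$C.V = \frac{S}{\bar{x}} \times 100 = \frac{7.56}{25.66} \times 100 = 29.46\%$$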
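The whole class-work calculation can be checked with one Python sketch. The variable names and the (lower limit, upper limit, frequency) layout are illustrative. Taking the mode as the midpoint of the most frequent class follows the value 22.5 used in the solution.

```python
import math

# (lower limit, upper limit, frequency) from the class-work table
classes = [(10, 15, 6), (15, 20, 22), (20, 25, 35), (25, 30, 29),
           (30, 35, 16), (35, 40, 8), (40, 45, 4), (45, 50, 2)]

mids = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]
n = sum(freqs)
sum_fm = sum(f * m for f, m in zip(freqs, mids))
sum_fm2 = sum(f * m * m for f, m in zip(freqs, mids))

mean = sum_fm / n

# Median: locate the first class whose cumulative frequency reaches n / 2,
# then interpolate inside it: L + ((n/2 - c.f) / f_median) * w.
cum = 0
for lo, hi, f in classes:
    if cum + f >= n / 2:
        median = lo + (n / 2 - cum) / f * (hi - lo)
        break
    cum += f

# Mode: midpoint of the class with the largest frequency.
mode = mids[freqs.index(max(freqs))]

data_range = max(mids) - min(mids)
variance = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)   # sample form, n - 1
sd = math.sqrt(variance)
sk = 3 * (mean - median) / sd
cv = sd / mean * 100

print(f"mean={mean:.2f} median={median:.2f} mode={mode} range={data_range}")
print(f"variance={variance:.2f} sd={sd:.2f} SK={sk:.2f} CV={cv:.2f}%")
```

Computed without rounding the intermediate values, SK comes out at about 0.37 rather than 0.38. The small difference arises because the solution above plugs in the mean, median and standard deviation already rounded to two decimals. The conclusion (positively skewed) is the same.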
Chapter 3

The end
