Centre For Foundation Studies
Department of Sciences and Engineering
FHMM1214 Mathematics for
Social Science
Topic 1
Descriptive Statistics
1
Content
1.1 What is Statistics?
1.2 Population Versus Sample
1.3 Basic Terms
1.4 Types of Variables
1.5 Raw Data
1.6 Organizing and Graphing Qualitative Data
1.7 Organizing and Graphing Quantitative Data
1.8 Shapes of Histograms
1.9 Cumulative Frequency Distributions
1.10 Stem-and-Leaf Displays
1.1
What is Statistics?
1st Meaning of Statistics
The word ‘statistics’ has 2 meanings.
1. Statistics refers to numerical facts.
The age of a student.
The number of students enrolled in UTAR.
The income of a family.
The percentage of passes in a statistics class.
2nd Meaning of Statistics
2. Statistics refers to the field or
discipline of study.
Statistics is a group of methods used to
collect, analyze, present, and interpret
data and to make decisions.
1.2
Population Versus
Sample
Population and Sample
Population or Target Population
Consists of all elements (individuals, items,
or objects) whose characteristics are being
studied.
Sample
A portion of the population selected for
study.
Illustration
1.3
Basic Terms
Definition
Element or Member
An element or member of a sample or
population is a specific subject or
object (e.g. a person, firm, item, state, or
country) about which the information is
collected.
Variable
A variable is a characteristic under study that
assumes different values for different elements.
Definition
Observation or Measurement
The value of a variable for an element.
Data Set
A data set is a collection of observations on
one or more variables.
SUMMARY
Population or Target Population
Consists of all elements (individuals, items, or objects) whose characteristics are
being studied.
Sample
A portion of the population selected for study.
Element or Member
An element or member of a sample or population is a specific subject or object
(e.g. a person, firm, item, state, or country) about which the information is
collected.
Variable
A variable is a characteristic under study that assumes different values for
different elements.
Observation or Measurement
The value of a variable for an element.
Data Set
A data set is a collection of observations on one or more variables.
Example
Problem
The following table gives the scores of five
students on a statistics test.
Student   Score
Kevin     83
Susan     91
David     78
Jeff      69
Johan     87
i) What is the variable for this data set?
ii) How many observations does this data set contain?
iii) How many elements does this data set contain?
Solution
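The three questions can be checked with a short Python sketch. This is an illustration only and not part of the original slides; the names `scores`, `variable`, `elements` and `observations` are assumptions chosen to mirror the definitions in Section 1.3.

```python
# Data set from the problem: one variable (the test score) recorded
# for five elements (the students).
scores = {"Kevin": 83, "Susan": 91, "David": 78, "Jeff": 69, "Johan": 87}

variable = "score"                     # the characteristic under study
elements = list(scores.keys())         # each student is one element
observations = list(scores.values())   # one observation per element

print("Variable:", variable)
print("Number of observations:", len(observations))
print("Number of elements:", len(elements))
```

With a single variable, the number of observations equals the number of elements.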
1.4
Types of Variables
Quantitative Variables
Definition
• A variable that can be measured
numerically is called a quantitative
variable.
• The data collected on a quantitative
variable are called quantitative data.
Quantitative Variables
a) Discrete Variable
A variable whose values are countable is
called a discrete variable. In other words,
a discrete variable can assume only
certain values with no intermediate values.
Example: The number of students in a class.
Quantitative Variables
b) Continuous Variable
A variable that can assume any numerical
value over a certain interval is called a
continuous variable.
Example:
The height of a person.
The time taken to complete an examination.
The yield of potatoes (in pounds) per acre.
Qualitative / Categorical Variables
Definition
• A variable that cannot assume a numerical value
but can be classified into two or more non-numeric
categories is called a qualitative or categorical variable.
• The data collected on such a variable are called
qualitative data.
Example: Gender of a person, hair color
Exercise
Determine whether each of the following describes a population
or a sample, and classify the variable as qualitative,
quantitative discrete, or quantitative continuous.
a) Annual income of all employees of a restaurant.
b) Number of subjects taken by students selected in a class.
c) Name of all students in a school.
d) Weights of 50 kids selected in the kindergarten.
e) Time taken to complete a test by all students in a class.
Solution
Illustration
1.5
Raw Data
Definition
RAW DATA
Data recorded in the sequence in which
they are collected and before they are
processed or ranked are called raw data.
Raw Data (quantitative data)
Raw Data (qualitative data)
1.6
Organizing &
Graphing Qualitative
Data
Example 1
A sample of 30 employees was asked how stressful their
jobs were. Their responses are recorded below.
somewhat none somewhat very very none
very somewhat somewhat very somewhat somewhat
very somewhat none very none somewhat
somewhat very somewhat somewhat very none
somewhat very very somewhat none somewhat
Construct a frequency distribution table for these data.
Example 1 (Solution)
Relative Frequency &
Percentage Distributions
Tabular arrangement that lists the
relative frequencies and percentages
for all categories.
\[ \text{Relative frequency of a category} = \frac{\text{frequency of that category}}{\text{sum of all frequencies}} = \frac{f}{\sum f} \]
\[ \text{Percentage} = (\text{relative frequency}) \times 100 \]
Example 1 (Solution)
Response    f
Very        10
Somewhat    14
None        6
            Sum = 30
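As a rough check on Example 1, the frequency, relative frequency and percentage of each category can be computed from the raw responses. The sketch below is illustrative only; the variable names are assumptions, and `collections.Counter` from the Python standard library does the tallying.

```python
from collections import Counter

# Responses of the 30 employees in Example 1 (transcribed from the slide).
responses = (
    "somewhat none somewhat very very none "
    "very somewhat somewhat very somewhat somewhat "
    "very somewhat none very none somewhat "
    "somewhat very somewhat somewhat very none "
    "somewhat very very somewhat none somewhat"
).split()

freq = Counter(responses)        # frequency f of each category
total = sum(freq.values())       # sum of all frequencies

table = {}
for category in ("very", "somewhat", "none"):
    f = freq[category]
    rel = f / total              # relative frequency = f / (sum of f)
    table[category] = (f, round(rel, 3), round(rel * 100, 1))
    print(category, *table[category])
print("Sum", total)
```

The relative frequencies should add up to 1 and the percentages to 100, apart from small rounding differences.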
Exercise
The following data give the results (in grade) of 20
students in Mathematics Test.
A C A B F
B A B C B
A B C F A
B B C C B
a) Construct a frequency distribution table.
b) Calculate the relative frequencies and percentages
for the results.
Solution
Grade   Frequency   Relative Frequency   Percentage (%)
A
Exercise
Frequency, Relative Frequency, Percentage
Distributions Table of Students’ Status
Status   Frequency   Relative Frequency   Percentage
F
SO
J
SE
sum
Revision exercise
In a survey, 120 Malaysian adults were asked to rate their health.
The table below summarizes their responses.
State of Health Percentage of Response
Excellent 17.5
Very good 37.5
Good 32.5
Fair 10.0
Poor 2.5
Find the number of adults who were in excellent health.
1.7
Organizing &
Graphing Quantitative
Data
Ungrouped Frequency Distribution
Frequency Distributions for Quantitative Data
Single-Valued Classes
Single-valued classes are used if the observations in a
data set assume only a few distinct (integer) values,
i.e. the classes are made of single values and not of
intervals.
Example 2
The Number of Vehicles Owned by 40 Households
from a City
5 1 1 2 0 1 1 2 1 1
1 3 3 0 2 5 1 2 3 4
2 1 2 2 1 2 2 1 1 1
4 2 1 1 2 1 1 4 1 3
Construct a frequency distribution table for these data.
Example 2 (Solution)
Vehicles Owned   Number of Households (f)
0                2
1                18
2                11
3                4
4                3
5                2
Sum              40
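The table in Example 2 can be reproduced by counting how often each distinct value occurs. The following is a minimal sketch, assuming the 40 values are typed in exactly as listed on the slide; the names `vehicles` and `distribution` are illustrative.

```python
from collections import Counter

# Number of vehicles owned by the 40 households in Example 2.
vehicles = [
    5, 1, 1, 2, 0, 1, 1, 2, 1, 1,
    1, 3, 3, 0, 2, 5, 1, 2, 3, 4,
    2, 1, 2, 2, 1, 2, 2, 1, 1, 1,
    4, 2, 1, 1, 2, 1, 1, 4, 1, 3,
]

counts = Counter(vehicles)
# One single-valued class per distinct value, in increasing order.
distribution = [(value, counts[value]) for value in sorted(counts)]

for value, f in distribution:
    print(value, f)
print("Sum", sum(counts.values()))
```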
Bar Graph
Grouped Frequency Distribution
• Lists all the classes and the number of
values that belong to each class.
• Data presented in the form of a frequency
distribution are called grouped data.
Example 3
Weekly Earnings of 100 Employees of a Company
401 410 448 450 490 505 521 555 600 601
605 610 620 625 630 650 678 680 685 690
700 725 750 760 770 780 785 790 795 798
800 801 805 809 810 810 814 815 820 825
828 830 835 840 845 850 855 860 865 870
880 888 890 895 900 910 920 930 935 940
950 956 959 960 965 967 970 980 995 1000
1010 1020 1030 1055 1068 1070 1079 1090 1100 1110
1120 1130 1155 1167 1180 1230 1250 1259 1270 1290
1300 1320 1350 1400 1410 1460 1500 1541 1560 1600
Construct a frequency distribution table for these data.
Example 3 (Solution)
Weekly Earnings   Number of Employees (f)
401 – 600         9
601 – 800         22
801 – 1000        39
1001 – 1200       15
1201 – 1400       9
1401 – 1600       6
Sum               100
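For grouped data, the same tally is made per class instead of per value. The sketch below assumes the six classes used in the solution (401 – 600 up to 1401 – 1600, each spanning 200 dollars); the names `earnings`, `classes` and `grouped` are illustrative.

```python
# Weekly earnings of the 100 employees in Example 3.
earnings = [
    401, 410, 448, 450, 490, 505, 521, 555, 600, 601,
    605, 610, 620, 625, 630, 650, 678, 680, 685, 690,
    700, 725, 750, 760, 770, 780, 785, 790, 795, 798,
    800, 801, 805, 809, 810, 810, 814, 815, 820, 825,
    828, 830, 835, 840, 845, 850, 855, 860, 865, 870,
    880, 888, 890, 895, 900, 910, 920, 930, 935, 940,
    950, 956, 959, 960, 965, 967, 970, 980, 995, 1000,
    1010, 1020, 1030, 1055, 1068, 1070, 1079, 1090, 1100, 1110,
    1120, 1130, 1155, 1167, 1180, 1230, 1250, 1259, 1270, 1290,
    1300, 1320, 1350, 1400, 1410, 1460, 1500, 1541, 1560, 1600,
]

# Class limits (lower, upper): 401-600, 601-800, ..., 1401-1600.
classes = [(lower, lower + 199) for lower in range(401, 1601, 200)]

# Count the values that fall between the limits of each class (inclusive).
grouped = {
    (lower, upper): sum(lower <= x <= upper for x in earnings)
    for lower, upper in classes
}

for (lower, upper), f in grouped.items():
    print(f"{lower} - {upper}", f)
print("Sum", sum(grouped.values()))
```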
Relative Frequency &
Percentage Distributions
The relative frequencies and percentages for
a quantitative data set are obtained as follows:
\[ \text{Relative frequency of a class} = \frac{\text{frequency of that class}}{\text{sum of all frequencies}} = \frac{f}{\sum f} \]
\[ \text{Percentage} = (\text{relative frequency}) \times 100 \]
Illustration 1
Illustration 2
Class Boundaries f
134.5 – 156.5 10
156.5 – 178.5 3
178.5 – 200.5 7
200.5 – 222.5 6
222.5 – 244.5 4
Sum = 30
Example 4
Example 4 (Solution)
Find the frequency, relative frequency and
percentage for all classes.
Age   Frequency   Relative Frequency   Percentage
18 – 21
22 – 25
26 – 29
30 – 33
34 – 37
sum
Definition
Class
An interval that includes all the values that fall within
two numbers, the lower and upper limits
Class limits
Endpoints of each interval
Class Boundary
The dividing line between two classes, given by the
midpoint of the upper limit of one class and the lower
limit of the next higher class.
Definition
Class width / class size
The difference between the upper and lower
class boundaries.
Class mark / class midpoint
The midpoint of the class interval.
\[ \text{Class midpoint or class mark} = \frac{\text{Lower limit} + \text{Upper limit}}{2} \]
Example 5
Example 5 (Solution)
Class Boundaries
400.5 – 600.5
600.5 – 800.5
800.5 – 1000.5
1000.5 – 1200.5
1200.5 – 1400.5
1400.5 – 1600.5
Example 6
Class interval   Lower boundary       Upper boundary       Midpoint           Class width
11 – 15          (10 + 11)/2 = 10.5   (15 + 16)/2 = 15.5   (11 + 15)/2 = 13   15.5 – 10.5 = 5
16 – 20          (15 + 16)/2 = 15.5   (20 + 21)/2 = 20.5   (16 + 20)/2 = 18   20.5 – 15.5 = 5
21 – 25          (20 + 21)/2 = 20.5   (25 + 26)/2 = 25.5   (21 + 25)/2 = 23   25.5 – 20.5 = 5
26 – 30          (25 + 26)/2 = 25.5   (30 + 31)/2 = 30.5   (26 + 30)/2 = 28   30.5 – 25.5 = 5
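The calculations in Example 6 follow directly from the definitions of class boundary, midpoint and width. The sketch below is one possible way to automate them. It assumes consecutive classes with the same gap between the upper limit of one class and the lower limit of the next (here 1), so each boundary lies half a gap beyond the class limit. All names are illustrative.

```python
# Class limits from Example 6 (lower limit, upper limit).
limits = [(11, 15), (16, 20), (21, 25), (26, 30)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = limits[1][0] - limits[0][1]

rows = []
for lower, upper in limits:
    lower_boundary = lower - gap / 2     # midpoint between adjacent limits
    upper_boundary = upper + gap / 2
    midpoint = (lower + upper) / 2       # class mark
    width = upper_boundary - lower_boundary
    rows.append((lower_boundary, upper_boundary, midpoint, width))
    print(f"{lower} - {upper}", lower_boundary, upper_boundary, midpoint, width)
```

The same loop applies to classes such as 0 – 9, 10 – 19, and so on, where the first lower boundary works out to -0.5.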
Exercise
Class interval   Lower boundary   Upper boundary   Midpoint   Class size
0–9
10 – 19
20 – 29
30 – 39
Solution
Class interval   Lower boundary   Upper boundary   Midpoint   Class size
0 – 9            -0.5             9.5              4.5        10
10 – 19          9.5              19.5             14.5       10
20 – 29          19.5             29.5             24.5       10
30 – 39          29.5             39.5             34.5       10
Exercise
Find the class boundaries and class limits.
a) Number of books 2–3 4–5 6–7 8 – 9 10 – 11
Frequency 10 12 8 4 2
b) Weight (kg) 40 – <50 50 – <60 60 – <70 70 – <80 80 – <90
Frequency 10 12 8 4 2
Solution
(a)
Number of books   Frequency   Class boundaries   Class limit
2–3 10
4–5 12
6–7 8
8–9 4
10 – 11 2
Solution
(b)
Weight (kg) frequency Class boundaries Class limit
40 – <50 10
50 – <60 12
60 – <70 8
70 – <80 4
80 – <90 2
Revision exercise
The following table gives the frequency
distribution of ages for all 50 employees of a
company.
Age No. of Employees
18 to 30 12
31 to 43 19
44 to 56 14
57 to 69 5
Revision exercise
a) Find the class boundaries and class midpoints.
b) Construct a relative frequency and percentage table.
c) Do all classes have the same width? If yes, what is
that width?
d) What percentage of the employees of this
company are aged 43 or younger?
Solution
Age       Frequency   Class boundaries   Midpoint   Class width   Relative frequency   Percentage (%)
18 – 30   12
31 – 43   19
44 – 56   14
57 – 69   5
          Sum = 50
Graphing Grouped Data
Grouped (quantitative) data can be
displayed in a histogram or a polygon.
Histogram
Three types of histogram
1. Frequency histogram
2. Relative frequency histogram
3. Percentage histogram
Histogram
• A histogram is a graph in which class boundaries
are marked on the horizontal (x) axis & the
frequencies, relative frequencies, or
percentages are marked on the vertical (y) axis.
• The frequencies, relative frequencies, percentages
are represented by the heights of the bars.
• The bars are drawn adjacent to each other.
Illustration 3
Total Home Runs   Class boundaries   Frequency   Relative Frequency   Percentage (%)
135 – 156         134.5 – 156.5      10          0.3333               33.33
157 – 178         156.5 – 178.5      3           0.1000               10.00
179 – 200         178.5 – 200.5      7           0.2333               23.33
201 – 222         200.5 – 222.5      6           0.2000               20.00
223 – 244         222.5 – 244.5      4           0.1333               13.33
                                     Sum = 30    Sum = 0.9999         Sum = 99.99%
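Frequency Histogram
Frequency may be used as the height of each rectangle.
[Figure: frequency histogram; horizontal axis marked at the class boundaries 134.5, 156.5, 178.5, 200.5, 222.5, 244.5]
Relative Frequency Histogram
[Figure: relative frequency histogram; horizontal axis marked at the same class boundaries]
Percentage Histogram
[Figure: percentage histogram; horizontal axis marked at the same class boundaries; vertical axis: Percentage]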
Polygon
• A graph formed by joining the midpoints of
the tops of successive bars in a histogram.
• Next, we mark two more classes (with zero
frequencies), one at each end, and mark the
midpoints.
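The points joined by a frequency polygon can be listed before any drawing is done. The sketch below uses the class boundaries and frequencies of the home-run data from Illustration 2. It only computes the coordinates and leaves the plotting to whichever tool is preferred. The names `xs`, `ys` and `polygon_points` are illustrative.

```python
# Class boundaries and frequencies of the home-run data (Illustration 2).
boundaries = [134.5, 156.5, 178.5, 200.5, 222.5, 244.5]
frequencies = [10, 3, 7, 6, 4]
width = boundaries[1] - boundaries[0]    # all classes share the same width

# The midpoint of each class is where the top of its bar is centred.
midpoints = [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]

# Add one zero-frequency class at each end so the polygon returns to the axis.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + frequencies + [0]

polygon_points = list(zip(xs, ys))
for point in polygon_points:
    print(point)
```

Joining these points in order with straight lines gives the polygon.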
Polygon
[Figure: frequency polygon; horizontal axis: Total home runs, marked at the class boundaries 134.5 to 244.5]
Example 7
The marks obtained by 134 students in an examination
are recorded in the following table.
Marks       20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   70 – 79   80 – 89
Frequency   22        18        22        24        14        14        20
Construct a histogram for the frequency distribution.
Example 7 (Solution)
Marks   Class boundaries   Frequency
20 – 29 19.5 – 29.5 22
30 – 39 29.5 – 39.5 18
40 – 49 39.5 – 49.5 22
50 – 59 49.5 – 59.5 24
60 – 69 59.5 – 69.5 14
70 – 79 69.5 – 79.5 14
80 – 89 79.5 – 89.5 20
Example 7 (Solution)
Exercise
The table below shows the age distribution of 30
participants in a game. Draw a histogram for the frequency
distribution.
Age (Years) Frequency
6 – 10 2
11 – 15 7
16 – 20 8
21 – 25 6
26 – 30 3
31 – 35 4
Solution
Age   Class boundaries   Frequency
6 – 10 5.5 – 10.5 2
11 – 15 10.5 – 15.5 7
16 – 20 15.5 – 20.5 8
21 – 25 20.5 – 25.5 6
26 – 30 25.5 – 30.5 3
31 – 35 30.5 – 35.5 4
Histogram for the frequency distribution for the age (years) of 30
participants in a game
[Figure: frequency histogram; horizontal axis: Age, marked at the class boundaries 5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5; vertical axis: Frequency]
Revision exercise
Weekly Earnings of 100 Employees of a Company
Weekly Earnings Number of Employees
(dollars) (f)
401 – 600 9
601 – 800 22
801 – 1000 39
1001 – 1200 15
1201 – 1400 9
1401 – 1600 6
Sum 100
Construct a histogram and a polygon for the frequency distribution.
Revision exercise
The marks obtained by 120 students in an
examination are recorded in the following
table.
Marks 20-29 30-39 40-49 50-59 60-69 70-79 80-89
Frequency 12 18 24 20 18 16 12
Construct a percentage histogram to
represent the above information.
Solution
1.8
Shapes of Histograms
Symmetric Histogram
It is identical on both sides of its central point.
Skewed Histogram
It is asymmetric and the tail on one side is
longer than the tail on the other side.
1.9
Cumulative Frequency
Distributions
Definition
A cumulative frequency distribution
gives the total number of values that fall
below the upper boundary of each class.
Example 8
Prepare a cumulative frequency distribution for the
following frequency distribution.
Example 8 (Solution)
f
10
3
7
6
4
Sum = 30
Cumulative Relative Frequency &
Cumulative Percentage
\[ \text{Cumulative relative frequency} = \frac{\text{Cumulative frequency of a class}}{\text{Total observations in the data set}} \]
\[ \text{Cumulative percentage} = (\text{Cumulative relative frequency}) \times 100 \]
Example 8 (Solution)
c.f.
10
13
20
26
30
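The cumulative frequencies in Example 8 are running totals of the class frequencies, and the cumulative relative frequencies and percentages follow from the two formulas above. The sketch below is illustrative; `itertools.accumulate` from the Python standard library produces the running totals, and the variable names are assumptions.

```python
from itertools import accumulate

# Class frequencies from Example 8.
frequencies = [10, 3, 7, 6, 4]
total = sum(frequencies)

cumulative = list(accumulate(frequencies))                 # running totals (c.f.)
cumulative_relative = [cf / total for cf in cumulative]    # c.f. / total observations
cumulative_percentage = [round(r * 100, 2) for r in cumulative_relative]

print(cumulative)
print(cumulative_percentage)
```

The last cumulative frequency always equals the total number of observations, so the last cumulative percentage is 100.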
Ogive
An ogive is a curve drawn for the cumulative
frequency distribution by joining with straight
lines the dots marked above the upper
boundaries of classes at heights equal to the
cumulative frequencies of respective classes.
Cumulative Frequency Curve
(Ogive)
There are two types of cumulative frequency
curves:
1) ‘less than’ cumulative frequency curve
2) ‘more than’ cumulative frequency curve
Example 9
Construct a ‘less than’ ogive for the data below.
Example 9 (Solution)
Upper boundary   ‘Less than’ cumulative frequency
<134.5 0
<156.5 10
<178.5 13
<200.5 20
<222.5 26
<244.5 30
‘Less than’ Ogive
Example 10
Using the data given below, construct a ‘less than’ cumulative
frequency distribution and draw the ogive.
Marks 1 – 10 11 – 20 21 – 30 31 – 40 41 – 50 51 – 60 61 – 70 71 – 80
Number of
3 8 12 14 10 6 5 2
Students ( f )
Estimate from the ogive,
(i) the number of students who score less than 60 marks.
(ii) the number of students who score more than or equal to 60 marks.
(iii) the value of x, if 20% of the students score less than x marks.
(iv) the value of x, if 80% of the students score more than or equal to x marks.
‘Less than’ Cumulative Frequency Distribution
Marks     Frequency   Upper boundary   ‘Less than’ cumulative frequency
                      < 0.5            0
1 – 10    3           < 10.5           3
11 – 20   8           < 20.5           11
21 – 30   12          < 30.5           23
31 – 40   14          < 40.5           37
41 – 50   10          < 50.5           47
51 – 60   6           < 60.5           53
61 – 70   5           < 70.5           58
71 – 80   2           < 80.5           60
Sum       60
“Less than” ogive for the cumulative frequency distribution for the
marks scored by 60 students
Example 10 (Solution)
(i) Approximately 52 students score less than 60 marks.
(ii) Approximately 8 students score at least 60 marks.
(iii) 20% of the students (12 students) obtain less than x marks.
From the graph, x = 21.
(iv) If 80% of the students (48 students) obtain at least x marks,
then 20% of the students (12 students) obtain less than x marks.
From the graph, x = 21, so 48 students score at least 21 marks.
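Readings taken from an ogive can be cross-checked by linear interpolation between the plotted points, since the ogive joins them with straight lines. The sketch below is an illustration under that assumption; the helper names `students_below` and `mark_for` are hypothetical. It gives about 52.7 students below 60 marks and x of about 21.3, which is consistent with the approximate graph readings of 52 and 21.

```python
# Points of the 'less than' ogive in Example 10:
# (upper boundary, cumulative frequency).
ogive = [(0.5, 0), (10.5, 3), (20.5, 11), (30.5, 23), (40.5, 37),
         (50.5, 47), (60.5, 53), (70.5, 58), (80.5, 60)]


def students_below(mark):
    """Cumulative frequency at `mark`, interpolated linearly along the ogive."""
    for (x0, y0), (x1, y1) in zip(ogive, ogive[1:]):
        if x0 <= mark <= x1:
            return y0 + (y1 - y0) * (mark - x0) / (x1 - x0)
    raise ValueError("mark lies outside the range of the ogive")


def mark_for(count):
    """Mark below which `count` students fall (the inverse reading)."""
    for (x0, y0), (x1, y1) in zip(ogive, ogive[1:]):
        if y0 <= count <= y1:
            return x0 + (x1 - x0) * (count - y0) / (y1 - y0)
    raise ValueError("count lies outside the range of the ogive")


total = ogive[-1][1]                     # 60 students in all
below_60 = students_below(60)            # part (i)
at_least_60 = total - below_60           # part (ii)
x_20_percent = mark_for(0.20 * total)    # parts (iii) and (iv)

print(round(below_60, 1), round(at_least_60, 1), round(x_20_percent, 1))
```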
Revision exercise
Draw a histogram and a ‘less than’ cumulative
frequency curve based on the following frequency
distribution.
Number of Students
Marks
(f)
0 ≤ x < 20 30
20 ≤ x < 40 40
40 ≤ x < 60 50
60 ≤ x < 80 60
80 ≤ x <100 20
Sum = 200
Solution
1.10
Stem-and-Leaf Displays
Definition
In a stem-and-leaf display of quantitative
data, each value is divided into two portions
– a stem and a leaf.
The leaves for each stem are shown
separately in a display.
Example 11
The following are the scores of 30 college students
on a statistics test.
75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98
Construct a stem-and-leaf display.
Stem-and-leaf display for two-digit
numbers
5 2 0 7
6 9 1 4 5 8
7 5 2 7 6 1 9 1 9 2
8 3 4 0 1 6 7 7
9 6 2 3 5 2 8
Example 11 (Solution)
Key: 5|0 means 50
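A stem-and-leaf display with the leaves ranked in increasing order can be built by splitting each score into its tens digit (stem) and units digit (leaf). The sketch below is illustrative and assumes two-digit scores as in Example 11; the names `scores` and `display` are assumptions.

```python
from collections import defaultdict

# Scores of the 30 students in Example 11.
scores = [
    75, 52, 80, 96, 65, 79, 71, 87, 93, 95,
    69, 72, 81, 61, 76, 86, 79, 68, 50, 92,
    83, 84, 77, 64, 71, 87, 72, 92, 57, 98,
]

display = defaultdict(list)
for score in sorted(scores):             # sorting first ranks the leaves
    stem, leaf = divmod(score, 10)       # tens digit = stem, units digit = leaf
    display[stem].append(leaf)

for stem in sorted(display):
    print(f"{stem} | {' '.join(str(leaf) for leaf in display[stem])}")
```

Each printed row reads like the key: a stem of 5 with a leaf of 0 stands for the score 50.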
Example 12
The following are the ages of 27 patients who had
their first heart attack.
65 40 63 67 75 79 85 45 90
60 55 67 86 55 49 78 76 54
67 98 56 45 50 85 67 72 83
Construct a stem-and-leaf plot.
Example 12 (Solution)
Example 13
The following data give the monthly rents paid by a
sample of 27 households selected from a small city.
880 1081 721 1075 1023 775 1235 750 965
960 1210 985 1231 932 850 825 1000 915
1191 1035 1151 1180 1175 952 1100 1140 860
Construct a stem-and-leaf display for these data.
Example 13 (Solution)
Example 14
The following stem-and-leaf display is prepared for the number of hours
that 25 students spent working on computers during the past month.
Stem   Leaf
0      6
1      1 7 9
2      2 6
3      2 4 7 8
4      1 5 6 9 9
5      3 6 8
6      2 4 4 5 7
7
8      5 6
Key: 0|6 means 6
Raw data:
26 38 49 67 85
6 34 37 19 22
41 56 58 32 49
64 65 45 46 86
53 11 17 62 64
Prepare a new stem-and-leaf display by grouping the stems with class
interval 0 – 2, 3 – 5, 6 – 8.
Example 14 (Solution)
The End
of
Topic 1