Representation of Data
A-Math 1
You are expected to be able to:
• select a suitable way of presenting raw statistical data, and discuss
advantages and/or disadvantages that particular representations may
have
• draw and interpret stem-and-leaf diagrams, box-and-whisker plots,
histograms and cumulative frequency graphs
Types of Data
• Qualitative Data
• Also called categorical data
• Described by words
• Non-numerical, such as color, hobbies, etc.
• Quantitative Data (numerical data)
• Discrete data
• Continuous data
Types of Data
• Quantitative Data (numerical data)
• Discrete data
Data that can take only certain separate values, usually whole numbers obtained by counting.
There are no decimal or fractional values in between.
Example:
Number of people, etc.
• Continuous data
Data that can take any value within an interval, usually obtained by measuring. For this
reason, such data is usually recorded in ranges (classes).
Example:
Height, etc.
Stem-and-leaf diagram
Stem: all digits, except the last digit of a number
Leaf: the last digit of a number
This diagram is often used to compare two sets of data, and
the raw data values can still be read from it.
How do we do it?
• Analyze how many digits there are
• Decide which digits will be the stem and which will be the leaf
• Arrange the leaves in ascending order
Example:
Weight of students in a class (kg)
58, 55, 58, 61, 72, 79, 97, 67, 61, 77, 92, 64, 69, 62 and 53
Stem | Leaf
5 | 3, 5, 8, 8
6 | 1, 1, 2, 4, 7, 9
7 | 2, 7, 9
8 |
9 | 2, 7
Key: 5 | 3 is 53 kg
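The steps for the weights example can be sketched in Python. This is only an illustrative sketch: the function name `stem_and_leaf` is made up here, and it assumes whole-number data where the leaf is the last digit.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its remaining digits (stem)."""
    rows = defaultdict(list)
    for v in sorted(values):          # sorting first keeps leaves in ascending order
        stem, leaf = divmod(v, 10)    # e.g. 53 -> stem 5, leaf 3
        rows[stem].append(leaf)
    # Include empty stems so that gaps in the data stay visible.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}


weights = [58, 55, 58, 61, 72, 79, 97, 67, 61, 77, 92, 64, 69, 62, 53]
diagram = stem_and_leaf(weights)

for stem, leaves in diagram.items():
    print(stem, "|", ", ".join(str(leaf) for leaf in leaves))
print("Key: 5 | 3 is 53 kg")
```

The empty stem 8 is printed on purpose: leaving it out would hide the gap between the 70s and the 90s.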
Stem-and-leaf diagram
2016 | Stem | 2017
9, 8, 5, 1, 0 | 0 | 1, 2, 2, 3, 4, 6, 7, 8, 8, 9
7, 6, 3, 2, 1, 0 | 1 | 1, 3
0 | 2 |
Key: 0 | 1 | 1 means 10 rainfalls in a particular month in 2016 and
11 rainfalls in a particular month in 2017
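A back-to-back diagram like the one for 2016 and 2017 can be built with a short Python sketch. The monthly values below are read off the diagram itself, assuming the last row means 20 in 2016; the function name `back_to_back` is only illustrative.

```python
def back_to_back(left, right):
    """Return rows of (left leaves, stem, right leaves) sharing one stem column."""
    everything = left + right
    rows = []
    for stem in range(min(everything) // 10, max(everything) // 10 + 1):
        # Left leaves are written outwards from the stem, so they appear in descending order.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        rows.append((left_leaves, stem, right_leaves))
    return rows


# Values read off the diagram (assumption: the last row is 20 in 2016).
rain_2016 = [0, 1, 5, 8, 9, 10, 11, 12, 13, 16, 17, 20]
rain_2017 = [1, 2, 2, 3, 4, 6, 7, 8, 8, 9, 11, 13]

rows = back_to_back(rain_2016, rain_2017)
for left_leaves, stem, right_leaves in rows:
    print(", ".join(map(str, left_leaves)), "|", stem, "|", ", ".join(map(str, right_leaves)))
```

Putting both years against the same stems makes the comparison visible at a glance: most 2017 values sit on stem 0, while 2016 is spread over stems 0 to 2.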
Classwork
1.2: 5, 6, 7 – Classwork – due on 23rd of July
Histograms
• Used for continuous data
• No gaps between one class and the
next one
• Class intervals are usually written using ‘<, >, ≤, ≥’
Histograms
145.5 ⩽ ℎ < 150.5
150.5 ⩽ ℎ < 155.5
155.5 ⩽ ℎ < 160.5
Boundaries = 145.5, 150.5, 155.5, 160.5
Class width = upper boundary − lower boundary, e.g. 150.5 − 145.5 = 5, etc.
Histograms
145.5 ⩽ ℎ < 150.5
150.5 ⩽ ℎ < 155.5
155.5 ⩽ ℎ < 160.5
Mid-values
(145.5 + 150.5) / 2 = 148
(150.5 + 155.5) / 2 = 153
(155.5 + 160.5) / 2 = 158
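The class widths and mid-values for these height classes can be checked with a few lines of Python. This is a small sketch; the variable names are chosen here for illustration only.

```python
# Class boundaries for the three height classes on the slide.
boundaries = [145.5, 150.5, 155.5, 160.5]

# Pair each boundary with the next one to get (lower, upper) for every class.
classes = list(zip(boundaries, boundaries[1:]))

widths = [upper - lower for lower, upper in classes]
mid_values = [(lower + upper) / 2 for lower, upper in classes]

for (lower, upper), width, mid in zip(classes, widths, mid_values):
    print(f"{lower} <= h < {upper}: width = {width}, mid-value = {mid}")
```

Because each upper boundary is the next class's lower boundary, the classes cover the whole range with no gaps.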
Histograms
100 ≤ ℎ < 150
150 ≤ ℎ < 200
200 ≤ ℎ < 250
250 ≤ ℎ < 300
300 ≤ ℎ < 350
OR
100 < ℎ ≤ 150
150 < ℎ ≤ 200
200 < ℎ ≤ 250
250 < ℎ ≤ 300
300 < ℎ ≤ 350
Histogram
Let’s construct a histogram!
[Figure: histogram with "No. of packages" on the vertical axis and "Mass (kg)" on the horizontal axis]
Note:
"Frequency" can be used on the vertical axis only when the class width is
the same for all intervals.
Histogram
Let’s construct a histogram!
Better to use frequency density instead of frequency!
Mass (kg) | Frequency (no. of packages) | Frequency density = class frequency / class width
16 ≤ 𝐴 < 18 | 34 | 34 / 2 = 17
18 ≤ 𝐴 < 20 | 46 | 46 / 2 = 23
20 ≤ 𝐴 < 22 | 20 | 20 / 2 = 10
Frequency density = class frequency / class width
Class frequency (area) = frequency density × class width
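As a quick check of the two formulas, the package table can be recomputed in Python. This is a minimal sketch; the helper name `frequency_density` is introduced here only for illustration.

```python
def frequency_density(lower, upper, frequency):
    """Frequency density = class frequency / class width."""
    return frequency / (upper - lower)


# (lower boundary, upper boundary, frequency) for the package masses in kg.
table = [(16, 18, 34), (18, 20, 46), (20, 22, 20)]

densities = [frequency_density(lower, upper, f) for lower, upper, f in table]

# Going back the other way: area of each bar = frequency density x class width.
areas = [d * (upper - lower) for d, (lower, upper, _) in zip(densities, table)]

for (lower, upper, f), d in zip(table, densities):
    print(f"{lower} <= A < {upper}: frequency = {f}, frequency density = {d}")
```

The recovered areas equal the original frequencies, which is why the area of a bar, not its height, represents the frequency.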
Histogram
To answer question (b), we can use the area of the bars, since in a histogram
area represents frequency.
The area of part of a bar (width × frequency density) gives an
estimate of the frequency within that part of the class.
Area I
(50 − 45) × 4 = 5 × 4 = 20
Area II
(63 − 50) × 3 = 13 × 3 = 39
Estimation: 20 + 39 = 59 children
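The same area idea can be sketched as a small Python function. The class boundaries 40 to 50 and 50 to 70 below are hypothetical, chosen only so that the frequency densities 4 and 3 and the range 45 to 63 match the working above; the real histogram's classes are not given here.

```python
def estimate_frequency(classes, low, high):
    """Estimate the frequency between low and high from (lower, upper, density) classes."""
    total = 0.0
    for lower, upper, density in classes:
        # Width of the part of this class that lies inside [low, high].
        overlap = min(upper, high) - max(lower, low)
        if overlap > 0:
            total += overlap * density   # area = width x frequency density
    return total


# Hypothetical classes (assumption): only the densities 4 and 3 come from the working.
classes = [(40, 50, 4), (50, 70, 3)]

estimate = estimate_frequency(classes, 45, 63)
print(estimate)  # (50 - 45) * 4 + (63 - 50) * 3
```

The result is an estimate because it assumes the data are spread evenly within each class.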
Classwork/Homework
1.3: 2, 3, 5, 7, 9, 10