Data Handling Learner Notes-1
MATHEMATICAL
LITERACY
LEARNER NOTES
DATA HANDLING
PLEASE NOTE:
It is of utmost importance that you study and know the definitions e.g. mean, mode and
range. The definition already explains the calculation that must be done.
Data is raw information that has been collected, without any organisation or analysis. It is
unprocessed.
Data Handling refers to the process of collecting, organising, summarising, representing and
analysing information. It means gathering and recording information and then presenting it in
a way that is meaningful to others.
The Data Handling process:
1. Developing questions
2. Collecting data
3. Classifying and organising data
4. Summarising data
5. Representing data
6. Interpreting and analysing data
DEVELOPING QUESTIONS
The first step in the statistical process is to develop or pose questions.
When developing/posing the question, you must first identify the main question, followed by
sub-questions.
QUESTION 1 - EXAMPLE
Main question - what is the average monthly income of people in your community?
Sub-questions
In which age category do you fall?
In which sector/industry do you work?
What is your job title?
How long have you been working in this job?
QUESTION 2
Formulate 3 sub-questions for the main question below that will enable meaningful data
collection:
Are the expenses incurred for a Matric dance justified?
QUESTION 3
Formulate 3 sub-questions for the main question below that will enable meaningful data
collection:
How can your school's matric pass rate be improved?
COLLECTING DATA
Methods of collecting data:
1. Observation – e.g. counting the number of people entering a store. This is the method
of collecting data by watching and recording the results. The advantage of this method
is that you don’t interact with people to get the response.
2. Interview – e.g. asking your fellow learners their opinion of the design for your matric
jacket. The interviewer asks the interviewee questions and records the response. The
advantage of this method is that the interviewer may ask further questions if the
response is vague.
3. Survey – e.g. learners complete a questionnaire on cool drink preferences for the tuck
shop. A questionnaire is a list of questions used to collect data from the respondents. It is
the tool used to conduct a survey and can be completed online, in person, by telephone, etc.
Questionnaires should be short, simple and unbiased. Questions must be clear and not too
long, and answers must also be concise. Questionnaires must be anonymous and
confidential: participants do not have to identify themselves.
The advantage of using this method is that you get the information directly from the
participants.
Population – the entire group of interest, e.g. all the learners at a school.
Sample – a representative part of the population, e.g. a number of learners randomly selected
per grade. A sample must be representative, randomly chosen, large enough and free from bias.
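Selecting a random sample per grade can be sketched in Python as follows. This is only an illustration: the function name `sample_per_grade`, the learner labels and the class sizes are hypothetical and do not come from these notes.

```python
import random

def sample_per_grade(learners_by_grade, per_grade, seed=None):
    """Randomly pick `per_grade` learners from every grade (no learner twice)."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return {grade: rng.sample(learners, per_grade)
            for grade, learners in learners_by_grade.items()}

# Hypothetical population: 30 learners in each of Grades 8 to 12.
population = {grade: [f"G{grade}-learner{i}" for i in range(1, 31)]
              for grade in range(8, 13)}

sample = sample_per_grade(population, per_grade=5, seed=1)
print({grade: len(chosen) for grade, chosen in sample.items()})
```

Choosing the same number of learners from every grade keeps each grade represented, and letting the computer choose at random helps to avoid bias.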
QUESTION 1
Susan will be managing the new tuck shop at your school, so she decided to hand out
questionnaires to the learners in order to do market research.
Draw up a questionnaire Susan can use in order to gather the information she requires.
QUESTION 2
A researcher is interested in the effect of a high-sugar snack on the energy levels of primary
school learners. A group of 250 primary school learners was selected. Half are tested while
consuming the high-sugar snack and the other half are tested without consuming the snack.
2.1 Identify the population
2.2 Identify the sample
CLASSIFYING DATA
Organising data is taking information and arranging it into some kind of order (such as
ascending or descending order).
Classifying data means organising it in groups or classes, based on some common feature.
NUMERICAL DATA:
CATEGORICAL DATA:
• Is generally descriptive in nature, as data is classified and organised into categories.
• Data is usually observed, but not measured.
• Examples: textures, smells, tastes, gender, eye colour and country of birth.
• Categorical data can consist of “yes” and “no” answers.
SUMMARISING DATA
1 3 5 6 8
Median = 5
1 3 5 7 8 9
Median = (5 + 7) / 2 = 6
Mode = the value in the data set that appears the most
= there may be more than one mode or no mode at all
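The median and mode calculations above can be checked with a short Python sketch that uses the standard `statistics` module. The two median data sets are the ones shown above; the data sets used for the mode are hypothetical and only serve as an illustration.

```python
import statistics

odd_set = [1, 3, 5, 6, 8]        # five values: the middle value is the median
even_set = [1, 3, 5, 7, 8, 9]    # six values: average the two middle values

median_odd = statistics.median(odd_set)      # 5
median_even = statistics.median(even_set)    # (5 + 7) / 2 = 6.0

# Hypothetical data sets to illustrate the mode (not taken from the notes).
one_mode = statistics.multimode([2, 4, 4, 7])       # [4]
two_modes = statistics.multimode([2, 2, 4, 4, 7])   # [2, 4]

print(median_odd, median_even, one_mode, two_modes)
```

Note that `multimode` returns every value when each value appears only once; in these notes such a data set is described as having no mode.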
MEASURES OF SPREAD
Example A:
1 2 3 4 5 6 7 8 9 10 11
Q1 = 3 Q2 = 6 Q3 = 9
Example B:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Q1 = 4 Q2 = 7,5 Q3 = 11
Interquartile range = Q3 – Q1
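The quartile values in Example B can be reproduced with the sketch below. It assumes the "median of each half" method (the median splits the ordered data into a lower and an upper half, and Q1 and Q3 are the medians of those halves), which gives the values shown for Example B. Calculators and software may use slightly different quartile rules. The function name `quartiles` is only illustrative.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values below the median position
    upper = ordered[half + n % 2:]    # values above it (the median itself is skipped if n is odd)
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

example_b = list(range(1, 15))        # 1, 2, ..., 14
q1, q2, q3 = quartiles(example_b)
iqr = q3 - q1                         # interquartile range = Q3 - Q1
print(q1, q2, q3, iqr)                # 4 7.5 11 7
```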
Five-point summary: It consists of the following values in the data set:
1. Minimum value
2. Q1
3. Q2 (Median)
4. Q3
5. Maximum value
E.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
The position of the 30th percentile: 30/100 × (n + 1)
(n = number of data values in the data set)
30/100 × (20 + 1) = 6,3
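The position formula for a percentile can be written as a small Python function. This is a minimal sketch; the name `percentile_position` is only illustrative.

```python
def percentile_position(p, n):
    """Position of the p-th percentile in an ordered data set of n values."""
    return p * (n + 1) / 100

data = list(range(1, 21))                     # 1, 2, ..., 20
position = percentile_position(30, len(data))
print(position)                               # 6.3
```

A position of 6,3 means that the 30th percentile lies between the 6th and the 7th value of the ordered data set.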
GROWTH CHARTS
• Growth charts provide an indication of the typical weight and height growth patterns,
by age, of babies and children.
• The concept of percentiles is used in growth charts.
• The curves on the growth chart represent the percentile values of the data
collected from different age groups.
• The growth chart is used to compare the BMI (body mass index) of a child to that of
other children in the same age group.
• It is also used to assess the health status of the child.
EXAMPLES
1. What is the BMI of a 4-year-old girl at the 95th percentile?
2. The couple’s 10-year-old child has a BMI of 16 kg/m². Between which percentile
curves does her BMI lie?
Solutions:
1. Draw a vertical line upward from 4 years to the 95th percentile.
Draw a horizontal line across to find the relevant BMI.
The BMI is 18 kg/m².
BOX AND WHISKER PLOTS
A box and whisker plot is a graphical representation of the five-number summary of a
set of data.
The five-number summary:
1. Minimum value
2. Lower quartile (Q1)
3. Median (Q2)
4. Third quartile (Q3)
5. Maximum value
EXAMPLE
Read from the box and whisker plot the values of the five number summary.
Solution:
Minimum value 70
Lower quartile (Q1) 100
Median (Q2) 110
Third quartile (Q3) 115
Maximum Value 120
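Once the five values have been read from a box and whisker plot, the range and the interquartile range follow directly. The sketch below uses the values from the example above; the class name `FiveNumberSummary` and its field names are only illustrative.

```python
from typing import NamedTuple

class FiveNumberSummary(NamedTuple):
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

# Values read from the box and whisker plot in the example above.
summary = FiveNumberSummary(minimum=70, q1=100, median=110, q3=115, maximum=120)

data_range = summary.maximum - summary.minimum   # 120 - 70 = 50
iqr = summary.q3 - summary.q1                    # 115 - 100 = 15
print(data_range, iqr)
```

The range describes the spread of all the data, while the interquartile range describes the spread of the middle half of the data (the box).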
EXERCISES
QUESTION 1
There is a global increase in the use of communication technology, such as the Internet,
social networks and cellphones. TABLE 1 in ANNEXURE A shows data regarding the
percentage of the world population living in the 12 regions as well as the percentage of
people using different means of communication.
1.2 Write down the modal percentage usage for cellphone communication. (3)
1.3 Calculate the median percentage usage for Internet communication. (3)
1.4 Write down the total percentage of Internet usage in America. (2)
1.5 Determine the total percentage of the world population living in all of Asia. (3)
1.6 Write down the global region that shows the greatest difference between the
percentage usage of Internet communication and the percentage usage of
cellphone communication. (2)
[15]
ANNEXURE A
QUESTION 1
Global regions                    World population (%)   Internet usage   Social network usage   Cellphone usage
A  CENTRAL ASIA                   2                      1                1                      2
B  OCEANIA                        ---                    1                1                      1
C  CENTRAL AMERICA                3                      3                3                      3
D  MIDDLE EAST                    4                      4                3                      5
E  SOUTH-EAST ASIA                9                      6                8                      10
F  CENTRAL AND EASTERN EUROPE     4,5                    7                6                      7
G  SOUTH AMERICA                  6                      8                10                     8
H  AFRICA                         …                      8                4                      11
I  SOUTH ASIA                     23                     8                6                      18
K  WESTERN EUROPE                 5,5                    13               10                     8
L  EAST ASIA                      22                     30               37                     22
QUESTION 2
2.1 Determine the total number of people living in rural areas. (3)
QUESTION 3
The population of South Africa, per province, gender and population group for 2016 is
shown in TABLE 3 in ANNEXURE B.
3.1 Which province has the most black, male persons and how many are they? (3)
3.2 Which ONE of the following represents the total number of coloured people in
South Africa in 2016?
3.3 Identify the population group and provinces that have the exact same number
of male and female persons. (2)
3.6 Express the number of Asian female persons in Gauteng to the total number of
persons in Gauteng as a ratio in the form 1 : ... (3)
[15]
QUESTION 4
A box and whiskers plot is given below, as well as terms that describe the different letters
on the diagram.
[Box and whiskers plot with five positions labelled A, B, C, D and E]
TERMS:
Median ; Maximum ; Quartile 3 ; Minimum ; Quartile 1
4.1 Provide labels for the box and whiskers plot by matching the terms with the
letters shown on the diagram. Write ONLY the letter and correct term. (5)
ANNEXURE B
QUESTION 3
POPULATION OF SOUTH AFRICA, PER PROVINCE, GENDER AND POPULATION GROUP FOR 2016
(Values in thousands)
Province       | Black                   | Coloured                | Asian                   | White                   | Total
               | Male    Female  Total   | Male    Female  Total   | Male    Female  Total   | Male    Female  Total   | Male    Female  Total
Western Cape   | 1 062   1 057   2 118   | 1 523   1 636   3 159   | 18      19      36      | 525     524     1 049   | 3 127   3 236   6 362
Eastern Cape   | 2 852   3 117   5 969   | 253     283     536     | 7       4       11      | 101     114     215     | 3 213   3 518   6 731
Northern Cape  | 312     333     645     | 235     239     474     | 2       -       2       | 34      37      71      | 583     609     1 192
Free State     | 1 146   1 275   2 420   | 53      45      98      | 8       4       12      | 103     136     239     | 1 310   1 459   2 769
KwaZulu-Natal  | 4 647   5 013   9 660   | 56      56      112     | 362     410     772     | 127     135     262     | 5 192   5 614   10 807
North West     | 1 744   1 723   3 467   | 20      24      45      | 9       10      19      | 104     124     228     | 1 877   1 881   3 758
Gauteng        | 5 335   5 175   10 511  | 210     225     436     | 254     212     466     | 1 034   1 096   2 130   | 6 834   6 709   13 543
Mpumalanga     | 1 966   2 053   4 019   | 9       6       15      | 9       9       18      | 116     122     238     | 2 100   2 190   4 290
Limpopo        | 2 643   2 902   5 537   | 13      18      32      | 32      16      48      | 59      50      109     | 2 739   2 986   5 724
South Africa   | 21 698  22 648  44 346  | 2 373   2 533   4 906   | 700     684     1 384   | 2 203   A       4 540   | 26 974  28 202  55 176
[Adapted from www.statssa.gov.za]
QUESTION 5
The number of learners, teachers and schools in the school sector of South Africa is
indicated per province for 2016 in TABLE 4.
Use TABLE 4 and the information above to answer the questions that follow.
5.1 Which province had the most learners in private schools in 2016? (2)
5.2 Which provinces have less than the mean number of teachers per province for
public schools? (4)
5.3 Determine the median value of teachers per province for private schools. (2)
5.4 Calculate the range for the number of learners in public schools for all nine
provinces. (2)
[10]
QUESTION 6
6.1 In 2016 and 2017 a group of friends decided to take part in the Cape Argus Pick-n-Pay
Cycle Tour as a team.
TABLE 5 below summarizes the times in which each member of the team completed the
tour in 2016 and 2017.
Jackson 26 05:33:43
Janda 29 06:11:59
Use the information in TABLE 5 above to answer the following questions:
6.1.1 Write down the total number of members belonging to the team in
2017. (2)
6.1.2 Give the names of the members who were NOT part of the team in
2016. (2)
6.1.3 Determine the modal age of the 2017 club members. (2)
6.2 The 2016 and 2017 times for the team, rounded to the nearest minute, are shown below.
6.2.1 Is the data above discrete or continuous? Motivate your answer. (3)
6.2.2 Calculate the mean time for 2017. Give your answer in hours and minutes. (5)
6.2.5 Which rider improved the most from 2016 to 2017 and by how many minutes? (3)
[24]
Discrete data
Single bar graphs: A bar graph is used to represent data that is sorted into categories.
The data is compared across the categories. Each bar shows the number of items in that
category, and there are spaces between the bars.
Multiple (double) bar graphs
Compound (stacked) bar graphs
SCATTER PLOT
A scatter plot is the most useful graph for studying the relationship (correlation) between two
variables. It shows one of the variables on the horizontal axis and the other variable on the
vertical axis. The resulting scatter plot of points will show at a glance whether a relationship
exists. You cannot have more than two sets of data on a scatter plot.
A scatter plot can show:
• positive correlation
• negative correlation
• no correlation.
• When looking for patterns, remember that the more tightly the points are clustered, the
stronger the correlation between the variables you have plotted.
• If you find a pattern that slopes from the lower left to the upper right, this tells you that as x
increases, y also increases. This means there is a “positive” correlation between the two
variables.
• If you find a pattern that slopes from the upper left to the lower right, this tells you that as x
increases, y decreases. This means there is a “negative” correlation between the two variables.
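The direction of a correlation can also be checked numerically. The sketch below calculates the correlation coefficient r (Pearson's r), which goes beyond what these notes require and is included only as an illustration: r close to 1 indicates a strong positive correlation, r close to -1 a strong negative correlation, and r close to 0 no correlation. The data sets, the function names and the 0,3 cut-off are all hypothetical assumptions.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r, a value between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r, threshold=0.3):
    """Label the direction of a correlation (the 0.3 cut-off is an arbitrary assumption)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "no clear correlation"

# Hypothetical data: hours studied compared with test marks and with hours of TV watched.
hours = [1, 2, 3, 4, 5]
marks = [40, 50, 55, 70, 80]
tv = [5, 4, 4, 2, 1]

print(describe(correlation(hours, marks)))   # positive
print(describe(correlation(hours, tv)))      # negative
```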
QUESTION 7
7.1 Two broken-line graphs representing some of the data in TABLE 1 (Question 1)
have been drawn on the grid on the ANSWER SHEET.
Draw another broken-line graph on the same grid to represent the percentage
cellphone usage for all the global regions on the ANSWER SHEET. (6)
7.2 Use the information in Question 6.2 and the graph on the ANSWER SHEET
showing the times for the riders for both 2016 and 2017. Complete the graph for
the missing data. (6)
[12]
QUESTION 7.1
[Grid: PERCENTAGES on the vertical axis; GLOBAL REGIONS A to L on the horizontal axis]
QUESTION 7.2
[Graph: Time in minutes on the vertical axis (150 to 330); Names of Riders on the horizontal axis (John, Sibu, Mike, Tumi, Cole, Joe, Pete, Ed, Stew, Piet); legend: 2016 and 2017]
QUESTION 8
8.1 In a national science olympiad the rules state that each school may enter a maximum
of three learners (participants). TABLE 6 below shows the relationship between the
number of schools entering and the maximum number of participants.
8.1.2 Each school must have ONE teacher who invigilates the writing of the
olympiad. Calculate the number of schools that entered the olympiad if a
total of 32 712 people were involved on the day the olympiad was written. (3)
8.2 Matuli, Bianca and Khotso wrote some practice tests at their school. Their percentage
marks are given in the table below.
8.2.3 The box and whisker diagram below represents the spread of Khotso's
percentage marks.
[Box and whisker diagram drawn on a scale marked 30, 40, 50, 60, 70, 80, 90]
8.2.4 Bianca stated that Matuli performed better than she did in the practice
tests.
Give TWO possible reasons to support Bianca's statement. (4)
[14]
QUESTION 9
[Percentage occupancy is the percentage of all rental units that are rented out at a given
time.]
9.1 The average daily rate in Kula remained almost the same from 2011 to 2014.
Explain your observations regarding the percentage occupancy in Kula during the
same period. (4)
9.2 Compare the relationship between the average daily rates and the percentage
occupancy in Ubud for the year to date (YTD) Sep. 2014 to YTD Sep 2015. (4)
9.3 Explain why both graphs have a gap between 2014 and YTD September 2014. (4)
[12]
ANNEXURE C
QUESTION 9
AVERAGE DAILY RATES AND OCCUPANCY FOR DIFFERENT REGIONS FROM 2010 TO SEP. 2015
[Two graphs, each plotted for 2010, 2011, 2012, 2013, 2014, YTD Sep 2014 and YTD Sep 2015: Average daily rate in USD (vertical axis from 0 to 400) and Percentage Occupancy (vertical axis from 50 to 85)]
Questions 10 and 11 must be done within the time allocated. Do not look at past
questions or answers. Answer them as if you are writing a test.
Study the correct solutions to the questions you got wrong, and work through them again.
10.1 TABLE 8 shows the types of voting stations (VSs) used during the 2016 local
government elections in South Africa.
10.1.2 State the province which has the most voting stations. (2)
10.1.3 Determine the mean number of voting stations (VSs) in South Africa. (3)
10.1.4 Write down the modal number of mobile voting stations in South
Africa. (2)
10.1.7 The bar graph on the ANSWER SHEET shows the total number of voting
stations.
On the same ANSWER SHEET, the first three bars are drawn showing
the permanent voting stations.
Fill in the remaining bar graphs showing the permanent voting stations. (6)
10.2 The TWO pie charts below show why and how people in South Africa travel.
Study the TWO pie charts above and answer the questions that follow.
10.2.1 Calculate the percentage of people whose reason for travel is sport. (2)
Calculate the number of people who travel to visit family and friends. (2)
[26]
ANSWER SHEET
QUESTION 10.1.7
Types of voting stations used during the 2016 local government elections
[Bar graph grid: Number of voting stations on the vertical axis (0 to 5000 in steps of 500); provinces on the horizontal axis: Free State, Gauteng, Mpumalanga, Western Cape, Eastern Cape, Kwazulu-Natal, Limpopo, North West, Northern Cape]
QUESTION 11 – you get 40 minutes to answer this question.
11.1 According to the SARS data for December 2017, South Africa's 148 266 millionaires
earn between R1 million and R2 million per annum.
The number of millionaires increased by 5,0065% compared to the previous year. The
total annual taxable income for ALL the millionaires was R287,24 billion.
[Source: SARS Statistics, released December 2017]
11.1.1 It was stated that the mean monthly income per millionaire is exactly
R161 000.
11.2 TABLE 9 on ANNEXURE D shows the top marginal tax rate for individuals in the
G20 countries. This table provides present and past data of the top marginal tax rates. It
was updated in January 2019.
11.2.1 Name the country that has the biggest range between 2019 and the past
top marginal tax rates. (2)
11.2.2 Use the 2019 top marginal tax rate and answer the following questions:
11.3 The Republic of South Africa (RSA) conducts household censuses to collect information.
The next census will take place in 2021.
Census information regarding household size is shown below.
HOUSEHOLD SIZE
HOUSEHOLD SIZE    CENSUS 1996    CENSUS 2001    CENSUS 2011
Five or more      36%            33%            25%
11.3.2 State which household size matches EACH of the following trends:
11.3.3 It was stated that the percentage of households with five or more persons
decreased from 2001 to 2011, therefore the number of households with
five or more persons decreased by 0,060 million.
Verify, showing ALL calculations, whether this statement is CORRECT. (5)
11.3.4 Explain why the percentages for the 1996 census do not add up to 100%. (2)
[33]
ANNEXURE D
QUESTION 11.2
COUNTRY                     2019 (%)    PREVIOUS (%)
Japan                       55,95       55,95
Netherlands                 52,00       52
Germany                     47,50       47,5
Australia                   45,00       45
China                       45,00       45
France                      45,00       45
South Africa                45,00       45
Spain                       45,00       45
United Kingdom              45,00       45
Italy                       43,00       43
South Korea                 40,00       40
Switzerland                 40,00       40
United States of America    37,00       39,6
India                       35,88       35,54
Argentina                   35,00       35
Mexico                      35,00       35
Turkey                      35,00       35
Canada                      33,00       33
Indonesia                   30,00       30
Brazil                      27,50       27,5
Singapore                   22,00       22
Russia                      13,00       13
Saudi Arabia                0,00        0