13 Bivariate data
LEARNING SEQUENCE
13.1 Overview
13.2 Bivariate data
13.3 Lines of best fit by eye
13.4 Linear regression using technology (10A)
13.5 Time series
13.6 Review
13.1 Overview
Why learn this?
Bivariate data can be collected from all kinds of places. This includes
data about the weather, data about athletic performance and data
about the profitability of a business. By learning the tools you need
to analyse bivariate data, you will be gaining skills that help you turn
numbers (data) into powerful information that can be used to make
predictions (plots).
The use for bivariate data is not limited to the classroom; in fact, many
professionals rely on bivariate data to help make decisions. Some
examples of bivariate data in the real world are:
• when a new drug is created, scientists will run drug trials in
which they collect bivariate data about how the drug works.
When the drug is approved for use, the results of the scientific
analysis help guide doctors, nurses, pharmacists and patients as
to how much of the drug to use and how often.
• manufacturers of products can use bivariate data about sales to
help make decisions about when to make products and how many
to make. For example, a beach towel manufacturer would know
that they need to produce more towels for summer, and analysis
of the data would help them decide how many to make.
By studying bivariate data you can learn how to use data to make predictions. By studying and understanding
how these predictions work, you will be able to understand the strengths and limitations of these types of
predictions.
3. Data is compared from twenty students on the number of hours spent studying for an examination and
the result of the examination. State if the number of hours spent studying is the independent or
dependent variable.
4. Match the type of correlation with the data shown on the scatter plots.
b. y B. No correlation
Number of hours spent on task 0 1.5 2 1 2 1.5 2.5 3 2 2.5
Task score (%) 20 50 60 45 80 70 75 97 85 20
Choose which data point is a possible outlier.
A. (0, 20) B. (1.5, 50) C. (1.5, 70) D. (2.5, 20) E. (2.5, 70)
6. MC Each point on the scatterplot shows the number of hours per week spent exercising by a person and their fitness level.
[Scatterplot: Exercising and fitness levels; fitness levels against hours per week spent exercising, x-axis 0 to 12]
B. The number of hours per week spent exercising is the independent variable.
C. The correlation between the number of hours per week exercising and the fitness levels is a weak positive non-linear correlation.
D. There are six people’s information collected.
E. There is an outlier.
10. Select another term for an independent variable from the
following options: a response variable or an explanatory
variable.
11. MC Select the correct difference between a seasonal pattern and a cyclical pattern in a time series plot.
A. A cyclical pattern shows upward trends, whereas a seasonal pattern shows only downward trends.
B. A cyclical pattern displays fluctuations with no regular periods between peaks, whereas a seasonal
pattern displays fluctuations that repeat at the same time each week, month, quarter or year.
C. A cyclical pattern does not show any regular fluctuations, whereas a seasonal pattern does.
D. A seasonal pattern displays fluctuations with no regular periods between peaks, whereas a cyclical
pattern displays fluctuations that repeat at the same time each week, month, quarter or year.
E. A seasonal pattern shows upward trends, whereas a cyclical pattern shows only downward trends.
[Time series graph: Data against Time, x-axis 0 to 50]
13. Use the given scatterplot and the line of best fit to determine the value of x when y = 2.
[Scatterplot with line of best fit: x-axis 0 to 7]
Scatter plots
• A scatter plot is a way of displaying bivariate data.
• A scatter plot will have:
• the independent variable placed on the x-axis with a label and scale
• the dependent variable placed on the y-axis with a label and scale
• the data points shown on the plot.
[Scatter plot: time spent on phone per day (hours) on the x-axis, 0 to 4; independent data on x-axis]
The table shows the total revenue from selling tickets for a number of different chamber music
concerts. Represent the given data on a scatterplot.
Number of tickets sold 400 200 450 350 250 300 500 400 350 250
Total revenue ($) 8000 3600 8500 7700 5800 6000 11 000 7500 6600 5600
THINK
1. Determine which is the dependent variable and which is the independent variable.
3. Use an appropriate scale on the horizontal and vertical axes.
4. Plot the points on the scatterplot.
WRITE/DRAW
1. The total revenue depends on the number of tickets being sold, so the total revenue is the dependent variable and the number of tickets is the independent variable.
[Scatterplot: total revenue ($) on the vertical axis against number of tickets sold on the horizontal axis]
Describing correlation
[Example scatter plots: linear and non-linear form; positive and negative direction]
• Data will have no correlation if the data are spread out across the plot with no clear pattern, as shown
in this example.
No correlation
State the type of correlation between the variables x and y, shown on the scatterplot.
THINK
Carefully analyse the scatterplot and comment on its form, direction and strength.
WRITE
The points on the scatterplot are close together and constantly increasing; therefore the relationship is linear.
The path is directed from the bottom left corner to the top right corner and the value of y increases as x increases. Therefore the correlation is positive.
The points are close together so the correlation can be classified as strong.
There is a linear, positive and strong relationship between x and y.
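The classification above is made by eye. As a rough numerical cross-check, the sketch below computes Pearson's correlation coefficient for the chamber music ticket data given earlier in this section. The coefficient is not introduced in this section, and the cut-offs used to label strength (0.75 and 0.5) are illustrative assumptions, not values from the text.

```python
from math import sqrt

# Ticket data from the chamber music concert table in this section.
tickets = [400, 200, 450, 350, 250, 300, 500, 400, 350, 250]
revenue = [8000, 3600, 8500, 7700, 5800, 6000, 11000, 7500, 6600, 5600]


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.75, moderate=0.5):
    """Label direction and strength; the cut-offs are assumptions for illustration."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    return direction, strength


r = pearson_r(tickets, revenue)
print(round(r, 2), describe(r))
```

A value close to 1 or −1 suggests the points lie close to a straight line. The number does not say whether a straight line is the right form, so the scatterplot still needs to be checked by eye.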
The number of hours spent on a phone per day appears to affect the average marks.
Explanation: This means that the more time spent on a phone per day, the worse a student’s marks are likely to be.
Mary sells business shirts in a department store. She always records the number of different styles of
shirt sold during the day. The table below shows her sales over one week.
Price ($) 14 18 20 21 24 25 28 30 32 35
Number of shirts sold 21 22 18 19 17 17 15 16 14 11
THINK WRITE/DRAW
a. Draw the scatterplot showing ‘Price ($)’ (independent variable) on the horizontal
DISCUSSION
How could you determine whether the change in one variable causes the change in another variable?
Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Digital documents SkillSHEET Substitution into a linear rule (doc-5405)
SkillSHEET Solving linear equations that arise when finding x- and y-intercepts (doc-5406)
SkillSHEET Transposing linear equations to standard form (doc-5407)
SkillSHEET Measuring the rise and the run (doc-5408)
SkillSHEET Determining the gradient given two points (doc-5409)
SkillSHEET Graphing linear equations using the x- and y-intercept method (doc-5410)
SkillSHEET Determining independent and dependent variables (doc-5411)
SkillSHEET Determining the type of correlation (doc-5413)
Interactivity Individual pathway interactivity: Bivariate data (int-4626)
To answer questions online and to receive immediate corrective feedback and fully worked solutions for all
questions, go to your learnON title at www.jacplus.com.au.
Fluency
For questions 1 and 2, decide which of the variables is
independent and which is dependent.
1. a. Number of hours spent studying for a Mathematics test and
the score on that test.
b. Daily amount of rainfall (in mm) and daily attendance at the
Botanical Gardens.
c. Number of hours per week spent in a gym and the annual
number of visits to the doctor.
d. The amount of computer memory taken by an essay and the
length of the essay (in words).
2. a. The cost of care in a childcare centre and attendance at the childcare centre.
b. The cost of the property (real estate) and the age of the property.
c. The entry requirements for a certain tertiary course and the number of applications for that course.
d. The heart rate of a runner and the running speed.
3. WE1 The following table shows the cost of a wedding reception at 10 different venues. Represent the data
on a scatterplot.
4. WE2 State the type of relationship between x and y for each of the following scatterplots.
[Scatterplots a to e]
6. State the type of relationship between x and y for each of the following scatterplots.
[Scatterplots a to e]
Understanding
7. WE3 Eugene is selling leather bags at the local market. During the day he keeps records of his sales. The
table below shows the number of bags sold over one weekend and their corresponding prices (to the
nearest dollar).
8. The table below shows the number of bedrooms and the price of each of 30 houses.
9. The table below shows the number of questions solved by each student on a test, and the corresponding total
score on that test.
Number of questions 2 0 7 10 5 2 6 3 9 4 8 3 6
Total score (%) 22 39 69 100 56 18 60 36 87 45 84 32 63
10. A sample of 25 drivers who had obtained a full licence within the last month
was asked to recall the approximate number of driving lessons they had taken
(to the nearest 5), and the number of accidents they had while being on P plates.
The results are summarised in the table that follows.
a. Represent these data on a scatterplot.
b. Specify the relationship suggested by the scatterplot.
c. Suggest some reasons why this scatterplot is not perfectly linear.
Number of lessons 5 20 15 25 10 35 5 15 10 20 40 25 10
Number of accidents 6 2 3 3 4 0 5 1 3 1 2 2 5
Number of lessons 5 20 40 25 30 15 35 5 30 15 20 10
Number of accidents 5 3 0 4 1 4 1 4 0 2 3 4
Reasoning
11. MC The scatterplot that best represents the relationship between the amount of water consumed daily by a
certain household for a number of days in summer and the daily temperature is:
[Options A to E: scatterplots of water usage (L) against temperature (°C)]
Problem solving
14. The table below gives the number of kicks and handballs obtained by the top 8 players in an AFL game.
Player A B C D E F G H
Number of kicks 20 27 21 19 17 18 21 22
Number of handballs 11 3 11 6 5 1 9 7
15. Each point on the scatterplot shows the time (in weeks) spent by a person on a healthy diet and the
corresponding mass lost (in kg).
Study the scatterplot and state whether each of the following statements is true or false.
a. The number of weeks that the person stays on a diet is the independent
variable.
[Scatterplot: loss in mass (kg) against time on the diet (weeks)]
[Scatterplot: Test 2 against Test 1, with labelled points including (i), (vii) and (viii)]
b. William, who got the top mark in test 1 but not in test 2.
c. Charlotte, who did better on test 1 than Mardi but not as well in test 2.
d. Dario, who did not do as well as Charlotte in both tests.
e. Edward, who got the same mark as Mardi in test 2 but did not do so well in test 1.
f. Cindy, who got the same mark as Mardi for test 1 but did better than her for test 2.
g. Georgina, who was the lowest in test 1.
h. Harrison, who had the greatest discrepancy between his two marks.
• A line of best fit is a line that follows the trend of the data in a scatter plot.
• A line of best fit is most appropriate for data with strong or moderate linear correlation.
• Drawing lines of best fit by eye is done by placing a line that:
• represents the data trend
• has an equal number of points above and below the line.
[Scatter plot with line of best fit: time spent on phone per day (hours) on the x-axis, 0 to 4]
To determine the equation of the line of best fit, follow these steps:
1. Choose two points on the line.
3. Calculate the gradient using $m = \dfrac{y_2 - y_1}{x_2 - x_1}$.
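The gradient and y-intercept calculations can be checked with a short script. The sketch below is one possible way to do it, not a method from the text: the function name `line_through` is hypothetical, and exact fractions are used so the result matches hand working. It uses the points (13, 20) and (20, 30) from this section's internet cafe example.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (m, c) for the line y = m*x + c through two points with integer coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no defined gradient")
    m = Fraction(y2 - y1, x2 - x1)  # gradient: rise over run
    c = y1 - m * x1                 # rearranged from y1 = m*x1 + c
    return m, c


m, c = line_through((13, 20), (20, 30))
print(f"y = {m}x + {c}")
```

Choosing two points that are far apart on the line reduces the effect of small reading errors on the gradient.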
The data in the table shows the cost of using the internet at a number of different internet cafes based
on hours used per month.
THINK WRITE/DRAW
a. Draw the scatterplot placing the independent variable (hours used per month) on the horizontal axis and the dependent variable
c. 1. Select two points on the line that are not too close to each other.
Let $(x_1, y_1) = (13, 20)$ and $(x_2, y_2) = (20, 30)$.
2. Calculate the gradient of the line.
$m = \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{30 - 20}{20 - 13} = \dfrac{10}{7}$
3. Write the rule for the equation of a straight line.
$y = mx + c$
5. Substitute one pair of coordinates, say (13, 20), into the equation to evaluate c.
$20 = \dfrac{10}{7}(13) + c$
$c = 20 - \dfrac{130}{7} = \dfrac{140 - 130}{7} = \dfrac{10}{7}$
6. Write the equation.
$y = \dfrac{10}{7}x + \dfrac{10}{7}$
Note: The values of c and m are the same in this example. This is not always the case.
7. Replace x with n (number of hours used) and y with C (the total monthly cost) as required.
$C = \dfrac{10}{7}n + \dfrac{10}{7}$
Interpolation vs extrapolation
Predictions made within the data use interpolation.
[Scatter plot with line of best fit: average student marks (%) against time spent on phone per day (hours)]
THINK WRITE/DRAW
a. 1. Locate 10 on the x-axis and draw a vertical line until it meets with the line of best fit. From that point, draw a horizontal line to the y-axis. Read the value of y indicated by the horizontal line.
[Graph: line of best fit, x-axis 0 to 40, y-axis 0 to 45]
The table below shows the number of boxes of tissues purchased by hay fever sufferers and the
number of days affected by hay fever during the blooming season in spring.
THINK WRITE/DRAW
a. 1. Draw the scatterplot showing the
[Scatterplot with line of best fit: boxes of tissues, T, against number of days affected by hay fever, d; the point (3, 1) is marked]
b. 1. Select two points on the line that are not too close to each other.
Let $(x_1, y_1) = (3, 1)$ and $(x_2, y_2) = (14, 5)$.
2. Calculate the gradient of the line.
$m = \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{5 - 1}{14 - 3} = \dfrac{4}{11}$
$y = \dfrac{4}{11}x + c$
4. Substitute the known values, say (3, 1), into the equation to calculate c.
$1 = \dfrac{4}{11}(3) + c$
$c = 1 - \dfrac{12}{11} = -\dfrac{1}{11}$
5. Write the equation.
$y = \dfrac{4}{11}x - \dfrac{1}{11}$
6. Replace x with d (number of days with hay fever) and y with T (total number of boxes of tissues used) as required.
$T = \dfrac{4}{11}d - \dfrac{1}{11}$
c. Interpret the meaning of the gradient of the line of best fit.
The gradient indicates an increase in sales of tissues as the number of days affected by hay fever increases. A hay fever sufferer is using on average $\dfrac{4}{11}$ (or about 0.36) of a box of tissues per day.
i. 1. Substitute the value d = 11 into the equation and evaluate.
When d = 11,
$T = \dfrac{4}{11} \times 11 - \dfrac{1}{11} = 4 - \dfrac{1}{11} = 3\dfrac{10}{11}$
2. Interpret and write the answer.
In 11 days the hay fever sufferer will need 4 boxes of tissues.
ii. 1. Substitute the value d = 15 into the equation and evaluate.
When d = 15,
$T = \dfrac{4}{11} \times 15 - \dfrac{1}{11} = \dfrac{60}{11} - \dfrac{1}{11} = 5\dfrac{4}{11}$
2. Interpret and write the answer.
In 15 days the hay fever sufferer will need about 6 boxes of tissues.
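The two predictions can be reproduced, and labelled as interpolation or extrapolation, with a short sketch. The data table for this example is not shown here, so the range of recorded days (3 to 14) is an assumption taken from the two points chosen on the line; `predict_boxes` is a hypothetical helper name.

```python
import math
from fractions import Fraction

# Line of best fit from the worked example: T = (4/11) d - 1/11
GRADIENT = Fraction(4, 11)
INTERCEPT = Fraction(-1, 11)

# Assumed range of the recorded data (days affected by hay fever).
D_MIN, D_MAX = 3, 14


def predict_boxes(d):
    """Return the predicted number of boxes and the kind of prediction."""
    t = GRADIENT * d + INTERCEPT
    kind = "interpolation" if D_MIN <= d <= D_MAX else "extrapolation"
    return t, kind


for days in (11, 15):
    t, kind = predict_boxes(days)
    # Round up because tissues are bought in whole boxes.
    print(days, t, math.ceil(t), kind)
```

Under the assumed range, d = 11 lies inside the data (interpolation) and d = 15 lies outside it (extrapolation), so the second prediction relies on the trend continuing.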
DISCUSSION
Why is extrapolation not considered to be reliable?
Fluency
1. WE4 The data in the table shows the distances travelled by 10 cars
and the amount of petrol used for their journeys (to the nearest
litre).
Distance travelled (km), d 52 36 83 12 44 67 74 23 56 95
Petrol used (L), P 7 5 9 2 7 9 12 3 8 14
a. Construct a scatterplot of the data and draw the line of best fit.
b. Determine the equation of the line of best fit in terms of the
variables d (distance travelled) and P (petrol used).
2. WE5 Use the given scatterplot and line of best fit to predict:
[Scatterplot with line of best fit: x-axis 0 to 90, y-axis 0 to 70]
4. A sample of ten Year 10 students who have part-time jobs was randomly selected. Each student was asked
to state the average number of hours they work per week and their average weekly earnings (to the nearest
dollar). The results are summarised in the table below.
Hours worked, h 4 8 15 18 10 5 12 16 14 6
Weekly earnings ($), E 23 47 93 122 56 33 74 110 78 35
a. Construct a scatterplot of the data and draw the line of best fit.
b. Write the equation of the line of best fit, in terms of variables h (hours worked) and E (weekly earnings).
c. Interpret the meaning of the gradient.
Understanding
5. WE6 The following table shows the average weekly expenditure on food for households of various sizes.
a. Construct a scatterplot of the data and draw in the line of best fit.
b. Determine the equation of the line of best fit. Write it in terms of
variables n (for the number of people in a household) and C (weekly
cost of food).
c. Interpret the meaning of the gradient.
d. Use the equation of the line of best fit to predict the weekly food
expenditure for a family of:
i. 8 ii. 9 iii. 10.
6. The number of hours spent studying, and the percentage marks obtained by a group of students on a test are
shown in this table.
Hours spent studying 45 30 90 60 105 65 90 80 55 75
Marks obtained 40 35 75 65 90 50 90 80 45 65
a. State the values for marks obtained that can be used for interpolation.
b. State the values for hours spent studying that can be used for interpolation.
a. Construct a scatterplot of the data. Suggest the type of correlation shown by the scatterplot.
b. Draw in the line of best fit and determine its equation. Write it in terms of the variables t (gestation time)
and M (birth mass).
c. Determine what the value of the gradient represents.
d. Although full term of gestation is considered to be 40 weeks, some pregnancies last longer. Use the
equation obtained in part b to predict the birth mass of babies born after 41 and 42 weeks of gestation.
e. Many babies are born prematurely. Using the equation obtained in part b, predict the birth mass of a baby
whose gestation time was 30 weeks.
f. Calculate their gestation time (to the nearest week), if the birth mass of the baby was 2.390 kg.
Reasoning
8. MC Consider the scatterplot shown.
[Scatterplot with line of best fit: x-axis 0 to 70]
The line of best fit on the scatterplot is used to predict the values of y when x = 15, x = 40 and x = 60.
a. Interpolation would be used to predict the value of y when the value of x is:
D. x = 40 and x = 60 E. x = 60
9. MC The scatterplot below is used to predict the value of y when x = 300. This prediction is:
[Scatterplot: x-axis 0 to 700, y-axis 0 to 500]
Day number 1 2 3 4 5 8 9 10 11 12 15 16
Mass (g) 2.5 3.7 4.2 5.0 6.1 8.4 9.9 11.2 11.6 12.8 16.1 17.3
Measurements on days 6, 7, 13 and 14 are missing, since these were 2 consecutive weekends and, hence,
Rachel did not have a chance to measure her crystal, which is kept in the school laboratory.
a. Construct the scatterplot of the data and draw in the line of best fit.
b. Determine the equation of the line of best fit. Write the equation,
using variables d (day of the experiment) and M (mass of the crystal).
c. Interpret the meaning of the gradient.
d. For her report, Rachel would like to fill in the missing measurements
(that is, the mass of the crystal on days 6, 7, 13 and 14). Use the
equation of the line of best fit to help Rachel determine these
measurements. Explain whether this is an example of interpolation
or extrapolation.
e. Rachel needed to continue her experiment for 2 more days, but she
fell ill and had to miss school. Help Rachel to predict the mass of the
crystal on those two days (that is, days 17 and 18), using the equation
of the line of best fit. Explain whether these predictions are reliable.
Problem solving
11. Ari was given a baby rabbit for his birthday. To monitor the rabbit’s growth, Ari decided to measure it once
a week.
The table below shows the length of the rabbit for various weeks.
Week number, n 1 2 3 4 6 8 10 13 14 17 20
Length (cm), l 20 21 23 24 25 30 32 35 36 37 39
Explain whether it is possible for him to receive an A+.
The following data shows the amount of time (hours) and the amount of distance walked (km) on a
bushwalk. Display the data on a scatter plot using technology.
THINK
1. Determine which data will go in which column by identifying the independent and dependent variable.
2. Input the data into the spreadsheet or calculator.
WRITE
1. Time is the independent variable: it will go in the first column. Distance is the dependent variable: it will go in the second column.
The following data shows the amount of time (hours) and the amount of distance walked (km) on a
bushwalk. Determine the equation of the regression line and display the regression line using
a spreadsheet.
• 3. Linear Regression (mx + b)
THINK WRITE
a. 1. Use the spreadsheet or CAS pages set up in Worked example 8. See Worked example 8.
For a spreadsheet, use the function FORECAST. Note the order: B2:B5 first and A2:A5 second.
OR
For a CAS, start on the graph page and use the Trace tool: select Trace, type the value 2.2 and press enter.
Note: the scatter plot must be turned off.
Distance = 4.93 km

For a spreadsheet, use the function FORECAST. Note the order: B2:B5 first and A2:A5 second.
OR
For a CAS, start on the graph page and use the Trace tool: select Trace, type the value 6.4 and press enter.
Note: the scatter plot must be turned off.
Distance = 11.07 km
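To show what a spreadsheet's FORECAST function is doing, the sketch below fits a least-squares regression line and evaluates it at a chosen x-value. It is an illustration, not the spreadsheet's actual implementation. The bushwalk data table is not reproduced here, so the `times` and `distances` values are hypothetical and will not give the answers above. The argument order of `forecast` mirrors the note about listing the y-values (column B) before the x-values (column A).

```python
def linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(x, known_ys, known_xs):
    """Predict y at x from the regression line; y-values are passed first."""
    slope, intercept = linear_regression(known_xs, known_ys)
    return slope * x + intercept


# Hypothetical data for illustration only (not the data from the worked example).
times = [1, 2, 3, 4]                 # hours (independent variable, column A)
distances = [2.0, 4.1, 5.9, 8.0]     # km (dependent variable, column B)

slope, intercept = linear_regression(times, distances)
print(slope, intercept, forecast(2.5, distances, times))
```

Swapping the two lists fits a line predicting time from distance instead, which is why the order of the ranges matters.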
Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Fluency
1. WE 7 The following data shows the amount of time athletes spent training in preparation for a marathon and
their finishing position in the race. Display the data on a scatter plot using technology.
2. WE 8 The following data shows the number of visitors to a store in a day and the profit of the store that day.
Determine the equation of the regression line and display the regression line using either a spreadsheet or a
CAS calculator.
Number of visitors (x) 80 85 94 101
Profit, dollars (y) 152 164 180 200
4. The following data shows how far away students live from school in kilometres and the hours those students
spend in a car per week.
5. The following data shows how many thousands of bees are in a hive and the amount of honey produced in
that hive per year:
Understanding
6. The following data shows the temperature on certain days of the year. The days have been numbered like
this: 1 January is 1, 2 January is 2 and so on for 365 days. Assume it is a non-leap year.
7. Chantal is a big fan of the Dugongs baseball team. The following data shows the number of games Chantal
watched per year and the games won by the Dugongs per year.
9. MC Sally Miles is a world-famous pop star. By analysing Sally’s tour data, the equation for a regression line
is found that relates numbers of fans at a concert (x) to Sally’s earnings (y). The regression line equation is
y = 22x + 25144. In the equation of the regression line, the number 22 represents:
dependent variable. Select which of the following would NOT have a valid regression line.
A. Number of growing days and height of a sunflower.
B. Average top speed of cars and years since cars were invented.
C. Number of ice creams purchased and the price of ice cream.
D. Amount of cheese eaten per capita and the number of injuries in an AFL season.
E. Amount of cheese eaten per capita and the price of cheese.
Reasoning
11. Two friends, Yousef and Gavin, were having an eating competition. In the competition they both ate one,
then two, then three apples and recorded their time. These were the results:
Yousef
Number of apples eaten (x) 1 2 3
Time taken, seconds (y) 46 67 124
Gavin
Number of apples eaten (x) 1 2 3
Time taken, seconds (y) 38 75 112
a. Use technology to determine the equations of the regression lines for each set of data.
b. Identify the gradients for each set of data.
c. Compare the gradients. Explain what this information shows.
d. Explain who won the apple-eating contest using the data.
12. The following data shows the time it took for five people to complete one lap of a BMX track.
Problem solving
14. At the school athletics carnival Mr. Wall was in charge of recording the student year levels and jump heights
for the winning jumps.
Mr Wall knows the regression line for this data is y = 4.91x + 79.3. Calculate the missing jump height.
x 1 2 3 4
y 3 5 9 11
b. Now determine four different data points that still give a regression line equation of y = 8x + 3.
[Time series graph: temperature (°C) against time (hours)]
[Time series graph: data against time (days)]
[Time series graph: quarterly data, Q1 2003 to Q4 2005]
[Time series graph: software products sold per quarter, Q1 2003 to Q4 2005]
Random: No regular pattern and caused by unpredictable events.
[Time series graph: profits against time]
Classify the trend suggested by the time series graph shown as being linear or non-linear, and
upward, downward or no trend.
[Time series graph: Data against Time]
THINK
Carefully analyse the given graph and comment on whether the graph resembles a straight line or not and whether the values of y increase or decrease over time.
WRITE
The time series graph does not resemble a straight line and overall the level of the variable, y, decreases over time. The time series graph suggests a non-linear downward trend.
The data below show the average daily mass of a person (to the nearest 100 g), recorded over the
28-day period.
63.6, 63.8, 63.5, 63.7, 63.2, 63.0, 62.8, 63.3, 63.1, 62.7, 62.6, 62.5, 62.9, 63.0,
63.1, 62.9, 62.6, 62.8, 63.0, 62.6, 62.5, 62.1, 61.8, 62.2, 62.0, 61.7, 61.5, 61.2
a. Plot these masses as a time series graph.
b. Comment on the trend.
[Time series graph: mass (kg) against time (days), days 0 to 28]
THINK
b. Carefully analyse the given graph and comment on whether the graph resembles a straight line or not and whether the values of y (in this case, mass) increase or decrease over time.
WRITE
b. The graph resembles a straight line that slopes downwards from left to right (that is, mass decreases with increase in time). Although a person’s mass fluctuates daily, the time series graph suggests a downward trend. That is, overall, the person’s mass has decreased over the 28-day period.
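The downward trend can also be checked numerically. The sketch below fits a least-squares line to the 28 recorded masses, numbering the days 1 to 28 (an assumption about how the days are labelled), and reports the direction of the trend from the sign of the gradient. It is a cross-check on the graph, not a replacement for it.

```python
# Daily masses (kg) from the worked example, in recording order.
masses = [
    63.6, 63.8, 63.5, 63.7, 63.2, 63.0, 62.8, 63.3, 63.1, 62.7, 62.6, 62.5, 62.9, 63.0,
    63.1, 62.9, 62.6, 62.8, 63.0, 62.6, 62.5, 62.1, 61.8, 62.2, 62.0, 61.7, 61.5, 61.2,
]
days = list(range(1, len(masses) + 1))  # assumed day numbering: 1, 2, ..., 28


def trend_gradient(ts, ys):
    """Return the gradient of the least-squares line through (t, y) pairs."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    return sty / stt


gradient = trend_gradient(days, masses)
direction = "downward" if gradient < 0 else "upward" if gradient > 0 else "no trend"
print(f"gradient = {gradient:.3f} kg per day, trend: {direction}")
```

A negative gradient agrees with the conclusion that mass decreased overall, even though individual days go up as well as down.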
the cost of rent in 5 years’ time; that is, in the 15th year.
2. Locate the 15th year on the time axis and draw a vertical line until it meets with the line of best fit. From the trend line (line of best fit) draw a horizontal line to the cost axis.
[Graph: cost of rent ($) against time (years), with the trend line extended to year 15]
3. Read the cost from the vertical axis. Cost of rent = $260
4. Write the answer. Assuming that the cost of rent will continue to
DISCUSSION
Why are predictions in the future appropriate for time series even though they involve extrapolation?
Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Video eLesson Fluctuations and cycles (eles-0181)
Interactivity Individual pathway interactivity: Time series (int-4628)
Fluency
1. WE10 For questions 1 and 2, classify the trend suggested by each time series graph as being linear or
non-linear, and upward, downward or stationary in the mean (no trend).
[Time series graphs a to d: Data against Time]
2. [Time series graphs a to d: Data against Time]
3. WE11 The data below show the average daily temperatures recorded in June.
17.6, 17.4, 18.0, 17.2, 17.5, 16.9, 16.3, 17.1, 16.9, 16.2, 16.0, 16.6, 16.1, 15.4, 15.1,
15.5, 16.0, 16.0, 15.4, 15.2, 15.0, 15.5, 15.1, 14.8, 15.3, 14.9, 14.6, 14.4, 15.0, 14.2
a. Plot these temperatures as a time series graph.
b. Comment on the trend.
5. The table below shows the total monthly revenue (in thousands of dollars) obtained by the owners of a large
reception hall. The revenue comes from rent and catering for various functions over a period of 3 years.
Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.
2007 60 65 40 45 40 50 45 50 55 50 55 70
2008 70 65 60 65 55 60 60 65 70 75 80 85
2009 80 70 65 70 60 65 70 75 80 85 90 100
[Time series graphs: camp sites and motel rooms; vertical axis 10 to 90]
a. Describe each graph, discussing general trend, peaks and troughs and so on. Explain particular features
of the graphs and give possible reasons.
b. Compare the two graphs and write a short paragraph commenting on any similarities and differences
between them.
predict the enrolment for the course in 5 years’ time; that is, in the 15th year.
[Time series graph: enrolment against time (years), years 0 to 10]
Reasoning
8. In June a new childcare centre was opened. The number of children attending full time (according to the
enrolment at the beginning of each month) during the first year of operation is shown in the table.
June July Aug. Sept. Oct. Nov. Dec. Jan. Feb. Mar. Apr. May
6 8 7 9 10 9 12 10 11 13 12 14
9. The graph shows the monthly sales of a certain book since its publication. Explain in your own words why
linear trend forecasting of the future sales of this book is not appropriate.
[Time series graph: Sales against Time]
10. In the world of investing this phrase is commonly used when talking about investments:
“Past performance is not an indicator of future returns.”
a. Explain what this phrase means.
b. Explain why this phrase is true using the term extrapolation.
a. Melita wants to convert the time from minutes into seconds. She starts by converting 1.5 minutes to
150 seconds. Explain what she did wrong and find the correct number in seconds.
b. Copy and complete the table, changing the time in minutes to time in seconds.
c. Draw a scatter plot using seconds as the time scale.
d. Draw a line of best fit and use it to predict the time, in seconds, when the water will reach 20°C.
e. Convert your answer from part d back to minutes.
12. The table below gives the quarterly sales figures for a second-hand car dealer over a three-year period.
Year Q1 Q2 Q3 Q4
2012 75 65 92 99
2013 91 79 115 114
2014 93 85 136 118
a. Represent his sales from Summer 2020/21 to Spring 2022 on a scatter plot.
b. Describe the trend and the patterns in the data.
BIVARIATE DATA

Properties of bivariate data
• Bivariate data can be displayed, analysed and used to make predictions.
• Types of variables:
• Independent (experimental or explanatory variable): not impacted by the other variable.
• Dependent (response variable): impacted by the other variable.

Representing the data
• Scatter plots can be created by hand or using technology (CAS or Excel).
• The independent variable is placed on the x-axis and the dependent variable on the y-axis.
[Scatter plot: average student marks (%) against time spent on phone per day (hours); dependent data on y-axis, independent data on x-axis, data points plotted]

Correlations from scatter plots
• Correlation is a way of describing a connection between variables in a bivariate data set.
• Correlation between the two variables will have:
• a type (linear or non-linear)
• a direction (positive or negative)
• a strength (strong, moderate or weak).
• Correlations can be used to make conclusions.
• There is no correlation if the data are spread out across the plot with no clear pattern.

Time series scatter plots
• time is the independent variable.
• Describe time series by:
• trends
• patterns.
• Time series patterns can be:
• seasonal
• cyclical
• random.
[Example time series graphs: temperature (°C) against hours; houses sold per quarter, 2003 to 2005 (cycle peaks every 12 months); quarterly data, 2003 to 2005; profits against time]

Lines of best fit
• A line that follows the trend of the data in a scatter plot.
• It is most appropriate for data with strong or moderate linear correlation.
• Can be sketched as a line of best fit by eye or created using technology.
• The equation for the line can be found by using the gradient and equation of the straight line.
• The line can be used to make predictions.
• Regression lines are only valid if

Interpolation and extrapolation
• Interpolation and extrapolation can be used to make predictions.
• Interpolation:
• is more reliable from a large number of data
• used if the prediction sits within the given data.
• Extrapolation:
• assumes the trend will continue
• used if the prediction sits outside the given data.
[Scatter plots with lines of best fit: average student marks (%) against time spent on phone per day (hours), showing predictions made within the data and predictions using extrapolation]
13.2 I can recognise the independent and dependent variables in bivariate data.
I can describe the correlation between two variables in a bivariate data set.
I can display a scatter plot with its regression line using technology.
13.5 I can describe time series data using trends and patterns.
I can draw a line of best fit by eye and use it to make predictions for a
time series.
13.6.3 Project
Collecting, recording and analysing data over time
Time 8 am 9 am 10 am 11 am 12 pm 1 pm 2 pm 3 pm 4 pm 5 pm
Pulse rate
3. Take your measurements at the selected time intervals and record them in the table.
4. Use your data to plot the time series. You can use software such as Excel or draw the scatterplot by hand.
5. Describe the graph and comment on its trend.
6. If appropriate, draw a line of best fit and predict the next few data values.
7. Take the actual measurements during the hours you have made predictions for. Compare the predictions
with the actual measurements. Were your predictions good? Give reasons.
Here are some suitable subjects for data observation and recording:
• minimum and maximum temperatures each day for 2 weeks (use the TV news or online data as
resources)
• the value of a stock on the share market (e.g. Telstra, Wesfarmers and Rio Tinto)
• your pulse over 12 hours (ask your teacher how to do this or check on the internet)
• the value of sales each day at the school canteen
• the number of students absent each day
• the position of a song in the Top 40 over a number of weeks
• petrol prices each day for 2 weeks
• other measurements (check with your teacher)
• world population statistics over time.
Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Interactivities Crossword (int-2887)
Sudoku puzzle (int-3600)
Fluency
1. As preparation for a Mathematics test, a group of 20 students was given a revision sheet containing 60
questions. The table below shows the number of questions from the revision sheet successfully
completed by each student and the mark, out of 100, of that student on the test.
Number of questions 9 12 37 60 55 40 10 25 50 48 60
Test result 18 21 52 95 100 67 15 50 97 85 89
Number of questions 50 48 35 29 19 44 49 20 16 58 52
Test result 97 85 62 54 30 70 82 37 28 99 80
2. Use the line of best fit shown on the graph to answer the following questions.
a. Predict the value of y, when the value of x is:
i. 10 ii. 35.
b. Predict the value of x, when the value of y is:
i. 15 ii. 30.
c. Determine the equation of a line of best fit if it is known that it passes through the points (5, 5) and (20, 27).
d. Use the equation of the line to algebraically verify the values obtained from the graph in parts a and b.
[Graph: line of best fit; x-axis from 0 to 40, y-axis from 0 to 50]
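Parts c and d can be checked with a short calculation: the gradient comes from the rise over the run between the two given points, and the intercept follows by substituting one point. The Python sketch below uses exact fractions; the helper names `line_through`, `predict_y` and `predict_x` are illustrative only.

```python
from fractions import Fraction

def line_through(p, q):
    """Return (gradient, y-intercept) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    gradient = Fraction(y2 - y1, x2 - x1)   # rise over run
    intercept = y1 - gradient * x1          # from y = mx + c at point p
    return gradient, intercept

m, c = line_through((5, 5), (20, 27))

def predict_y(x):
    """Value of y on the line for a given x."""
    return m * x + c

def predict_x(y):
    """Value of x on the line for a given y."""
    return (y - c) / m

print(m, c)                                     # 22/15 -7/3
print(float(predict_y(10)), float(predict_y(35)))
print(float(predict_x(15)), float(predict_x(30)))
```

Values read from a graph are estimates, so small differences between the algebraic results and the graph readings are expected.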
b. Explain why it is appropriate to draw in a line of best fit.
c. Draw a line of best fit and use it to predict the number of occupants in the nursing home in 3 years' time.
d. State the assumptions that have been made when predicting figures for part c.
[Graph: time series plotted against Time (Year)]
land.
Land size (m²) Sale price ($'000)
632 36
1560 58
800 40
1190 44
770 41
1250 52
1090 43
1780 75
1740 72
920 43
a. Construct a scatterplot and determine the equation of the line of best fit.
b. State what the gradient represents.
c. Using the line of best fit, predict the approximate sale price, to the nearest thousand dollars, for a
block of land with an area of 1600 m².
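When technology is used for part a, the line of best fit is usually a least-squares regression line. The sketch below assumes that method (the question itself does not specify one) and applies it to the ten data pairs in the table; `least_squares` is an illustrative helper, not a function from any particular calculator or spreadsheet.

```python
# Data copied from the table above.
land_size = [632, 1560, 800, 1190, 770, 1250, 1090, 1780, 1740, 920]   # m^2
sale_price = [36, 58, 40, 44, 41, 52, 43, 75, 72, 43]                  # $'000

def least_squares(xs, ys):
    """Return (gradient, y-intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

gradient, intercept = least_squares(land_size, sale_price)

# Part c: predicted sale price (in $'000) for a 1600 m^2 block.
predicted = gradient * 1600 + intercept
print(round(gradient, 4), round(intercept, 2), round(predicted))
```

Because 1600 m² lies within the range of land sizes in the table, this prediction is an interpolation. Under the least-squares assumption it comes to about $64 000, and the gradient is the increase in sale price (in thousands of dollars) per extra square metre.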
5. The table below shows, for fifteen students, the amount of pocket money they receive and spend at the
school canteen in an average week.
a. Construct a scatterplot and determine the equation of the line of best fit.
b. State what the gradient represents.
d. Using your line of best fit, predict the amount of money spent at the canteen by a student who
receives $100 pocket money each week. Explain if this seems reasonable.
Training (hours) 11 11 2 8 4 16 11 16 5 3
Number of pirouettes 15 13 3 12 7 17 13 16 8 5
7. Use the information in the data table to answer the following questions.
a. Use technology to determine the equation of the line of best fit for the following data.
b. Use technology to predict the value of the number of hours of television watched by a person
aged 15.
Problem solving
8. Describe the trends present in the following time series data that shows the mean monthly daily hours
of sunshine in Melbourne from January to December.
Month 1 2 3 4 5 6 7 8 9 10 11 12
Daily hours of sunshine 8.7 8.0 7.5 6.4 4.8 4.0 4.5 5.5 6.3 7.3 7.5 8.3
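A quick way to support a description of this time series is to locate the highest and lowest values and look at the month-to-month changes. The Python sketch below does this for the table as printed; the variable names are illustrative.

```python
# Data copied from the table above.
months = list(range(1, 13))
sunshine = [8.7, 8.0, 7.5, 6.4, 4.8, 4.0, 4.5, 5.5, 6.3, 7.3, 7.5, 8.3]

# Months with the most and the fewest daily hours of sunshine.
peak_month = max(months, key=lambda m: sunshine[m - 1])
trough_month = min(months, key=lambda m: sunshine[m - 1])

# Change from each month to the next: negative while the hours fall,
# positive while they rise.
changes = [round(b - a, 1) for a, b in zip(sunshine, sunshine[1:])]

print(peak_month, trough_month)
print(changes)
```

The hours fall every month from month 1 to the trough in month 6 and rise every month after that, so the high values sit at the two ends of the year.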
9. The existence of the following situations is often considered an obstacle to making estimates from data.
a. Outlier.
b. Extrapolation.
c. Small range of data.
d. Small number of data points.
Explain why each of these situations is considered an obstacle to making estimates of data and how
each might be overcome.
Using the data, estimate the distance a person 1.8 m tall can achieve when attempting the splits. Write a
detailed analysis of your result. Include:
• an explanation of the method(s) used
• any plots or formula generated
• comments on validity of the estimate
• any ways the validity of the estimate could be improved.
To test your understanding and knowledge of this topic, go to your learnON title at
www.jacplus.com.au and complete the post-test.
Below is a full list of rich resources available online for this topic. These resources are designed to bring ideas to life,
to promote deep and lasting learning and to support the different learning needs of each individual.
Solutions
Download a copy of the fully worked solutions to every question in this topic (sol-0747)
Digital documents
13.2 SkillSHEET Substitution into a linear rule (doc-5405)
SkillSHEET Solving linear equations that arise when finding x- and y-intercepts (doc-5406)
SkillSHEET Transposing linear equations to standard form (doc-5407)
SkillSHEET Measuring the rise and the run (doc-5408)
SkillSHEET Determining the gradient given two points (doc-5409)
SkillSHEET Graphing linear equations using the x- and y-intercept method (doc-5410)
SkillSHEET Determining independent and dependent variables (doc-5411)
SkillSHEET Determining the type of relationship (doc-5413)
Video eLessons
13.2 Bivariate data (eles-4965)
Correlation (eles-4966)
Drawing conclusions from correlation (eles-4967)
13.3 Lines of best fit by eye (eles-4968)
Predictions using lines of best fit (eles-4969)
13.4 Scatter plots using technology (eles-4970)
Regression lines using technology (eles-4971)
Using regression lines to make predictions (eles-4972)
13.5 Describing time series (eles-4973)
Time series lines of best fit by eye (eles-4974)
Fluctuations and cycles (eles-0181)
Interactivities
13.2 Individual pathway interactivity: Bivariate data (int-4626)
13.3 Individual pathway interactivity: Lines of best fit (int-4627)
Lines of best fit (int-6180)
Interpolation and extrapolation (int-6181)
13.5 Individual pathway interactivity: Time series (int-4628)
13.6 Crossword (int-2887)
Sudoku puzzle (int-3600)
3. Independent variable
4. a. B b. C c. A
5. D
6. B
7. Interpolation
8. The gradient of the line is 16/11.
9. Independent variable
10. Explanatory variable
11. B
12. C
13. x = 6
14. a. See figure at the bottom of the page.*
b. The number of COVID-19 cases started rising in March and peaked in April, then started to decline
until June. There was an increase in cases in July and the cases reached peak again in August. Cases
then started to decline again until December.
15. a. y = 14 b. x = 12.5
Exercise 13.2 Bivariate data
1.    Independent             Dependent
a.    Number of hours         Test results
b.    Rainfall                Attendance
c.    Hours in gym            Visits to the doctor
d.    Lengths of essay        Memory taken
2.    Independent             Dependent
a.    Cost of care            Attendance
b.    Age of property         Cost of property
c.    Number of applicants    Cut-off OP score
d.    Running speed           Heart rate
3. [Graph: Cost ($1000) against Number of guests]
4. a. Perfectly linear, positive
b. No correlation
c. Non-linear, negative, moderate
d. Strong, positive, linear
e. No correlation
5. a. Non-linear, positive, strong
b. Strong, negative, negative
c. Non-linear, moderate, negative
d. Weak, negative, linear
e. Non-linear, moderate, positive
6. a. Positive, moderate, linear
b. Non-linear, strong, negative
c. Strong, negative, linear
d. Weak, positive, linear
e. Non-linear, moderate, positive
*14. a. [Graph: New COVID-19 cases (y-axis, 0 to 400) against Month (x-axis)]
7. a. [Graph: scatter plot with Cost ($) on the x-axis]
b. Negative, linear, moderate. The price of the bag appeared to affect the numbers sold; that is, the more
expensive the bag, the fewer sold.
8. a. [Graph: scatter plot with Price ($1000) on the y-axis]
9. a. [Graph: scatter plot with Number of questions completed on the x-axis]
b. Strong, positive, linear correlation
c. Various answers; some students are of different ability levels and they may have attempted the
questions but had incorrect answers.
10. a. [Graph: Number of accidents against Number of lessons]
b. Weak, negative, linear relation
c. Various answers; some drivers are better than others, live in lower traffic areas, traffic conditions etc.
11. B
12. C
13. D
*14. a. [Graph: Number of handballs against Number of kicks, with points labelled A to H]
[Graph: scatter plot with Distance travelled (km), d, on the x-axis]
b. Using (1, 75) and (5, 150), the equation is C = 18.75n + 56.25
c. On average, weekly cost of food increases by $18.75 for
3. a. i. 510 ii. 315 iii. 125
b. i. 36.5 ii. 26 iii. 8
y = −13x + 595
[Graphs: axis residue from two plots; the recoverable axis labels are Earnings ($) and Time (weeks)]
[Graphs: two plots with Age (a) on the x-axis]
11. a. See figure at the bottom of the page.*
b. L = 1.07n + 18.9
c. 24.25 cm; 26.39 cm; 28.53 cm; 30.67 cm; 31.74 cm;
*11. a. [Graph: Length (cm), L, against Week, n]
1. Sample responses can be found in the worked solutions in the online resources.
2. Sample responses can be found in the worked solutions in the online resources.
3. a. $174 b. $418
c. a. Reliable, but b. not reliable.
4. a. Sample responses can be found in the worked solutions in the online resources.
b. Sample responses can be found in the worked solutions in the online resources.
c. 3.8 hours d. 2.5 hours
e. c. not reliable, d. reliable
5. a. Sample responses can be found in the worked solutions in the online resources.
b. Sample responses can be found in the worked solutions in the online resources.
c. 26 kg d. 19 kg
e. c. not reliable, d. reliable
6. a. y = 0.553x + 31 b. 72.4°C
c. Not possible. Extrapolation not reliable.
7. a. y = 0.55x + 5.3 b. 14
c. No because no connection.
8. a. Sample responses can be found in the worked solutions in the online resources.
b. Yard size independent, price dependent
c. y = 0.173x + 49.1 d. $50
in the online resources.
13. a. y = 0.0508x + 2.96
b. No correlation
c. y = −0.913x + 4.32
d. Correlation is linear, negative and moderate/strong.
e. Regression line from part c.
14. 128 cm
15. a. y = 2.8x
b. Gradient doubles.
c. Add 5
16. a. Sample responses can be found in the worked solutions in the online resources.
b. Sample responses can be found in the worked solutions in the online resources.
Exercise 13.5 Time series
1. a. Linear, downward
b. Non-linear, upward
c. Non-linear, stationary in the mean
d. Linear, upward
2. a. Non-linear, downward
b. Non-linear, downward
c. Non-linear, downward
d. Linear, upward
3. a. See figure at the bottom of the page.*
b. Linear downward trend
[Graph: time series plotted against Day, from day 0 to day 30]
[Graph: Sales (× $1000) against Quarter year, 2006 to 2009]
b. Sheepskin products more popular in the third quarter (presumably winter); discount sales, increase
in sales, and so on.
c. No trend.
5. a. See figure at the bottom of the page.*
b. General upward trend with peaks around December and troughs around April.
c. Peaks around Christmas where people have lots of parties, troughs around April where weather gets
colder and people less inclined to go out.
d. Yes. Peaks in December, troughs in April.
6. a. Peaks around Christmas holidays and a minor peak at Easter. No camping in colder months.
b. Sample responses can be found in the worked solutions in the online resources.
[Graph: Enrolment against Week]
8. a. [Graph: Number of children against Time (month), with the points (1, 7) and (8, 11) marked]
b. Yes, the graph shows an upward trend.
c. y = (4/7)x + 45/7
d. i. 15 ii. 18
e. The assumption made was that business will continue on a linear upward trend.
9. The trend is non-linear, therefore unable to forecast future sales.
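The line of best fit in answer 8c reads as y = (4/7)x + 45/7 once the fraction is put back together, and it can be checked against the two points marked on the graph for 8a, (1, 7) and (8, 11). The Python sketch below performs that check with exact fractions and then extends the line beyond the data; the function name `forecast` is illustrative, and the month used for the forecast is an assumption because the question itself is not reproduced here.

```python
from fractions import Fraction

# Equation of the line of best fit as reconstructed from answer 8c.
gradient = Fraction(4, 7)
intercept = Fraction(45, 7)

def forecast(month):
    """Value predicted by the line for a given month number."""
    return gradient * month + intercept

# The line should pass through both marked points.
print(forecast(1), forecast(8))   # 7 11

# Extending the line past the last plotted month is extrapolation:
# it assumes the linear upward trend continues (answer 8e).
print(forecast(15))               # 15
```

Forecasts made this way are only as good as the assumption stated in answer 8e that the trend stays linear.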
*5. a. [Graph: Revenue ($1000) against Month, 2007 to 2009]
b. Sample responses can be found in the worked solutions in the online resources.
c. Sample responses can be found in the worked solutions in the online resources.
d. Approximately 920 seconds.
e. Approximately 15.3 minutes.
12. a. See bottom of the page*
b. Secondhand car sales per quarter have shown a general upward trend but with some major fluctuations.
c. More cars are sold in the third and fourth quarters compared to the first and second quarters.
13. a. Sample responses can be found in the worked solutions in the online resources.
b. Trend: non-linear, increasing; Pattern: seasonal
Project
1. Sample responses can be found in the worked solutions in the online resources. Students could choose
any subject given in the list that can be observed and measured for one day or over the period of a week
or more.
2. Sample responses can be found in the worked solutions in the online resources. Students need to create
a data table for their recording. Students should use appropriate regular time intervals.
3. Sample responses can be found in the worked solutions in the online resources. For a selected subject,
students need to take their measurements at the selected time intervals and record them in the table.
4. Sample responses can be found in the worked solutions in the online resources. Students could use
Excel or CAS to plot the time series.
6. Sample responses can be found in the worked solutions in the online resources. Students should draw
a line of best fit and predict the next few data values.
7. Sample responses can be found in the worked solutions in the online resources. Students should take
the actual measurements during the hours they have made predictions for and then compare the
predictions with the actual measurements. Also comment on the accuracy of your predictions.
Exercise 13.6 Review questions
1. a. Number of questions: independent; test result: dependent
b. [Graph: Test result against Number of questions]
c. Strong, positive, linear correlation; the larger the number of completed revision questions, the higher
the mark on the test.
d. Different abilities of the students
2. a. i. 12.5 ii. 49
b. i. 12 ii. 22.5
*12. a. [Graph: Cars sold against Quarter year, Q1 2012 to Q4 2014]
square metre.
c. $64 000
received for pocket money.
c. $18
becomes large is unrealistic.
7. a. y = 3.31x + 3.05
b. Approximately 53 hours.
8. Overall the data appears to be following a seasonal trend,
with peaks at either end of the year and a trough in
the middle.
9. a. Outliers can unfairly skew data and as such dramatically
alter the line of best fit. Identify and remove any outliers
from the data before determining the line of best fit.
b. Extrapolation involves making estimates outside the
data range and this is considered unreliable. When
extrapolation is required, consider the data and the
likelihood that the data would remain linear if extended.
When giving results, make comment on the validity of
the estimation.
c. A small range may not give a fair indication if a data
set shows a strong linear correlation. Try to increase the
range of the data set by taking more measurements or
undertaking more research.
d. A small number of data points may not be able to
establish with confidence the existence of a strong linear
correlation. Try to increase the number of data points
by taking more measurements or undertaking more
research.
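The effect described in answer 9a can be seen numerically. The sketch below fits a least-squares line to a small data set with and without one outlier. The data are made up purely for illustration and do not come from this topic, and `least_squares` is an illustrative helper.

```python
def least_squares(points):
    """Return (gradient, y-intercept) of the least-squares line for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x

# Hypothetical data lying exactly on y = 2x + 1.
clean = [(1, 3), (2, 5), (3, 7), (4, 9), (5, 11)]
# The same data with one hypothetical outlier added.
with_outlier = clean + [(6, 30)]

gradient_clean, intercept_clean = least_squares(clean)
gradient_outlier, intercept_outlier = least_squares(with_outlier)

print(gradient_clean, intercept_clean)
print(round(gradient_outlier, 2), round(intercept_outlier, 2))
```

A single point far from the trend more than doubles the gradient here, which is why outliers should be identified and considered before a line of best fit is used for estimates.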