C 13 Bi Variate Data

The document provides an overview of bivariate data, explaining its significance in various fields such as science and business for making predictions based on two variables. It covers concepts like scatter plots, correlation types, and the relationship between independent and dependent variables, along with examples and exercises for practical understanding. Additionally, it emphasizes the importance of analyzing bivariate data to draw conclusions about correlations, while cautioning against inferring causation.

Uploaded by

ashleywongwuiyin

13 Bivariate data

LEARNING SEQUENCE
13.1 Overview
13.2 Bivariate data
13.3 Lines of best fit by eye
13.4 Linear regression using technology (10A)
13.5 Time series
13.6 Review
13.1 Overview
Why learn this?
Bivariate data can be collected from all kinds of places. This includes
data about the weather, data about athletic performance and data
about the profitability of a business. By learning the tools you need
to analyse bivariate data, you will be gaining skills that help you turn
numbers (data) into powerful information that can be used to make
predictions (plots).
The use for bivariate data is not limited to the classroom; in fact, many
professionals rely on bivariate data to help make decisions. Some
examples of bivariate data in the real world are:
• when a new drug is created, scientists will run drug trials in
which they collect bivariate data about how the drug works.
When the drug is approved for use, the results of the scientific
analysis help guide doctors, nurses, pharmacists and patients as
to how much of the drug to use and how often.
• manufacturers of products can use bivariate data about sales to
help make decisions about when to make products and how many
to make. For example, a beach towel manufacturer would know
that they need to produce more towels for summer, and analysis
of the data would help them decide how many to make.

By studying bivariate data you can learn how to use data to make predictions. By studying and understanding
how these predictions work, you will be able to understand the strengths and limitations of these types of
predictions.

Where to get help


Go to your learnON title at www.jacplus.com.au to access the following digital resources. The Online
Resources Summary at the end of this topic provides a full list of what’s available to help you learn the
concepts covered in this topic.

• Fully worked solutions to every question
• Video eLessons
• Interactivities
• eWorkbook
• Digital documents

830 Jacaranda Maths Quest 10 + 10A


Exercise 13.1 Pre-test
Complete this pre-test in your learnON title at www.jacplus.com.au and receive automatic marks,
immediate corrective feedback and fully worked solutions.
1. MC Choose the graph that shows whether there is a relationship between two variables, with
each data value shown as a point on a Cartesian plane.
A. Box plot B. Scatterplot C. Dot plot D. Ogive E. Histogram

2. MC Select which of the following statements is incorrect.


A. Bivariate data are data with two variables.
B. Correlation describes the strength, the direction and the form of the relationship between
two variables.
C. The independent variable is placed on the y-axis and the dependent variable on the x-axis.
D. The dependent variable is the one whose value depends on the other variable.
E. The independent variable takes on values that do not depend on the value of the other variable.

3. Data is compared from twenty students on the number of hours spent studying for an examination and
the result of the examination. State if the number of hours spent studying is the independent or
dependent variable.

4. Match the type of correlation with the data shown on the scatter plots.

Scatter plot Type of correlation


a. y A. Strong negative linear correlation

b. y B. No correlation

c. y C. Weak positive linear correlation

TOPIC 13 Bivariate data 831


5. MC The table below shows the number of hours spent doing a problem-solving task for a subject and
the corresponding total score for the task.

Number of hours spent on task 0 1.5 2 1 2 1.5 2.5 3 2 2.5
Task score (%) 20 50 60 45 80 70 75 97 85 20
Choose which data point is a possible outlier.
A. (0, 20) B. (1.5, 50) C. (1.5, 70) D. (2.5, 20) E. (2.5, 70)

6. MC Each point on the scatterplot shows the number of hours per week spent exercising by a
person and their fitness level.

[Scatterplot ‘Exercising and fitness levels’: Fitness levels (y-axis, 0 to 3.5) against Number of hours per week exercising (x-axis, 0 to 3.5)]

Choose the statement that best describes the scatterplot.
A. The more time exercising the worse the fitness level.
B. The number of hours per week spent exercising is the independent variable.
C. The correlation between the number of hours per week exercising and the fitness levels is a
weak positive non-linear correlation.
D. There are six people’s information collected.
E. There is an outlier.

7. Select a term that describes a line of best fit being used to predict a value of a variable from
within a given range from the following options: extrapolation, interpolation or regression.

8. Determine the gradient of the line of best fit shown in the scatter plot.

[Scatter plot with a line of best fit passing through (1, 4) and (12, 20); x-axis 0 to 12, y-axis 0 to 20]

9. In time series data, explain whether time is the independent or dependent variable.

10. Select another term for an independent variable from the following options: a response
variable or an explanatory variable.

11. MC Select the correct difference between a seasonal pattern and a cyclical pattern in a time series plot.
A. A cyclical pattern shows upward trends, whereas a seasonal pattern shows only downward trends.
B. A cyclical pattern displays fluctuations with no regular periods between peaks, whereas a seasonal
pattern displays fluctuations that repeat at the same time each week, month, quarter or year.
C. A cyclical pattern does not show any regular fluctuations, whereas a seasonal pattern does.
D. A seasonal pattern displays fluctuations with no regular periods between peaks, whereas a cyclical
pattern displays fluctuations that repeat at the same time each week, month, quarter or year.
E. A seasonal pattern shows upward trends, whereas a cyclical pattern shows only downward trends.



12. MC The time series plot shown can be classified as:
A. upward trend
B. downward trend
C. seasonal pattern
D. cyclical pattern
E. random pattern

[Time series plot: Data (y-axis, 0 to 10) against Time (x-axis, 0 to 50)]

13. Use the given scatterplot and the line of best fit to determine the value of x when y = 2.

[Scatterplot with line of best fit; x-axis 0 to 7, y-axis 0 to 7]



14. The table below shows the number of new COVID-19 cases per month reported in Australia in 2020.

Month March April May June July


New COVID-19 cases 11 304 14 9 86
Month August September October November December
New COVID-19 cases 377 73 18 5 8

a. Plot the time series.


b. Interpret the trend in the data from March to December.

15. Use the given scatterplot and line of best fit to predict:
a. the value of y when x = 4
b. the value of x when y = 1.

[Scatterplot with line of best fit; x-axis 0 to 15, y-axis 0 to 18]

13.2 Bivariate data


LEARNING INTENTION
At the end of this subtopic you should be able to:
• recognise the independent and dependent variables in bivariate data
• represent bivariate data using a scatter plot
• describe the correlation between two variables in a bivariate data set
• draw conclusions about the correlation between two variables in a bivariate data set.

13.2.1 Bivariate data


eles-4965
• Bivariate data are data with two variables (the prefix ‘bi’ means ‘two’).
• For example, bivariate data could be used to investigate the question: ‘How are student marks affected by
phone use?’
• In bivariate data, one variable will be the independent variable (also known as the experimental variable
or explanatory variable). This variable is not impacted by the other variable.
• In the example the independent variable is phone use per day.
• In bivariate data one variable will be the dependent variable (also known as the response variable). This
variable is impacted by the other variable.
• In the example the dependent variable is average student marks.

Scatter plots
• A scatter plot is a way of displaying bivariate data.
• A scatter plot will have:
• the independent variable placed on the x-axis with a label and scale
• the dependent variable placed on the y-axis with a label and scale
• the data points shown on the plot.



Features of a scatter plot

[Figure: scatter plot of Average student marks (%) on the y-axis (dependent data, 0 to 100) against Time spent on phone per day (hours) on the x-axis (independent data, 0 to 4), with the data points plotted]

WORKED EXAMPLE 1 Representing bivariate data on a scatter plot

The table shows the total revenue from selling tickets for a number of different chamber music
concerts. Represent the given data on a scatterplot.

Number of tickets sold 400 200 450 350 250 300 500 400 350 250
Total revenue ($) 8000 3600 8500 7700 5800 6000 11 000 7500 6600 5600

THINK
1. Determine which is the dependent variable and which is the independent variable.
2. Draw a set of axes. Label the title of the graph. Label the horizontal axis ‘Number of tickets sold’ and the vertical axis ‘Total revenue ($)’.
3. Use an appropriate scale on the horizontal and vertical axes.
4. Plot the points on the scatterplot.

WRITE/DRAW
The total revenue depends on the number of tickets being sold, so the total revenue is the dependent
variable and the number of tickets is the independent variable.

[Scatterplot ‘Total revenue from selling tickets’: Total revenue ($) (y-axis, 3000 to 11 000) against Number of tickets sold (x-axis, 200 to 500)]
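
Steps 1 to 3 can also be tried in code before plotting by hand. The following is a minimal Python sketch, not part of the worked example: it pairs the two rows of the table into (x, y) points and reports the range of values each axis needs to cover. The variable names are illustrative assumptions.

```python
# Data from the table in Worked example 1.
tickets_sold = [400, 200, 450, 350, 250, 300, 500, 400, 350, 250]   # independent variable (x)
total_revenue = [8000, 3600, 8500, 7700, 5800, 6000, 11000, 7500, 6600, 5600]  # dependent variable (y)

# Each concert becomes one (x, y) point on the scatterplot.
points = list(zip(tickets_sold, total_revenue))

# The smallest and largest values suggest where each axis scale should start and end.
x_range = (min(tickets_sold), max(tickets_sold))
y_range = (min(total_revenue), max(total_revenue))

print("First three points:", points[:3])
print("x-axis must cover", x_range)
print("y-axis must cover", y_range)
```

The reported ranges agree with the scales used in the worked example, where the horizontal axis runs from 200 to 500 and the vertical axis from 3000 to 11 000.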



13.2.2 Correlation
eles-4966
• Correlation is a way of describing a connection between variables in a bivariate data set.

Describing correlation

Correlation between two variables will have:


• a type (linear or non-linear)

Linear Non-linear

• a direction (positive or negative)

Positive Negative

• a strength (strong, moderate or weak).

Strong Moderate Weak

• Data will have no correlation if the data are spread out across the plot with no clear pattern, as shown
in this example.

No correlation



WORKED EXAMPLE 2 Describing the correlation

State the type of correlation between the variables x and y, shown on the scatterplot.

THINK
Carefully analyse the scatterplot and comment on its form, direction and strength.

WRITE
The points on the scatterplot are close together and constantly increasing, therefore the relationship
is linear.
The path is directed from the bottom left corner to the top right corner and the value of y increases as x
increases. Therefore the correlation is positive.
The points are close together so the correlation can be classified as strong.
There is a linear, positive and strong relationship between x and y.

13.2.3 Drawing conclusions from correlation


eles-4967
• When drawing a conclusion from a scatter plot, state how the independent variable appears to affect the
dependent variable and explain what that means.
• For the example of comparing time spent on phone to average marks, a good conclusion for the graph
shown in section 13.2.1 would be:

The number of hours spent on a phone per day (independent variable) appears to affect the average marks (dependent variable).
Explanation: This means that the more time spent on a phone per day, the worse a student’s marks are likely to be.

• Based on scatter plots, it is possible to draw conclusions about correlation but not causation.
For example, for the graph of distance run against amount of water drunk:
• it is correct to say that the amount of water drunk appears to affect the distance run
• it is incorrect to say that drinking more water causes a person to run further, because someone might only be
able to run 1 km and even if they drink 3 litres of water, they will still only be able to run 1 km.

[Scatterplot: Distance run (km) (y-axis, 0 to 15) against Amount of water drunk (L) (x-axis, 0 to 3.0)]



WORKED EXAMPLE 3 Stating conclusions from bivariate data

Mary sells business shirts in a department store. She always records the number of different styles of
shirt sold during the day. The table below shows her sales over one week.
Price ($) 14 18 20 21 24 25 28 30 32 35
Number of shirts sold 21 22 18 19 17 17 15 16 14 11

a. Construct a scatterplot of the data.
b. State the type of correlation between the two variables and, hence, draw a corresponding conclusion.

THINK
a. Draw the scatterplot showing ‘Price ($)’ (independent variable) on the horizontal axis and ‘Number of shirts sold’ (dependent variable) on the vertical axis.
b. 1. Carefully analyse the scatterplot and comment on its form, direction and strength.
2. Draw a conclusion corresponding to the analysis of the scatterplot.

WRITE/DRAW
a. [Scatterplot: Number of shirts sold (y-axis, 10 to 28) against Price ($) (x-axis, 0 to 35)]
b. The points on the plot form a path that resembles a straight, narrow band, directed from the top
left corner to the bottom right corner. The points are close to forming a straight line. There is a
linear, negative and strong correlation between the two variables.
The price of the shirt appears to affect the number sold; that is, the more expensive the shirt the
fewer sold.

TI | THINK
a–b.
1. In a new document, on a Lists & Spreadsheet page, label column A as ‘price’ and label column B as ‘sold’. Enter the values from the question.
2. Open a Data & Statistics page. Press TAB to locate the label of the horizontal axis and select the variable ‘price’. Press TAB again to locate the label of the vertical axis and select the variable ‘sold’.
3. To change the colour of the scatterplot, place the pointer over one of the data points. Then press CTRL MENU. Press:
• Colour
• Fill Colour
Select a colour from the palette for the scatterplot. Press ENTER.

TI | DISPLAY/WRITE
a–b. The scatterplot is shown, using a suitable scale for both axes. The points are close to forming a
straight line. There is a strong negative, linear correlation between the two variables. The trend
indicates that the price of a shirt appears to affect the number sold; that is, the more expensive the
shirt, the fewer are sold.

CASIO | THINK
a–b.
1. On the Statistic screen, label list1 as ‘Price’ and list 2 as ‘Shirts’, then enter the values from the question. Press EXE after entering each value.
2. Tap:
• SetGraph
• Setting...
Set values as shown in the screenshot, then tap Set.
3. Tap the graphing icon and the scatterplot will appear.

CASIO | DISPLAY/WRITE
a–b. The scatterplot is shown, using a suitable scale for both axes. The points are close to forming a
straight line. There is a strong negative linear correlation between the two variables. The trend
indicates that the price of a shirt appears to affect the number sold; that is, the more expensive the
shirt, the fewer are sold.

DISCUSSION
How could you determine whether the change in one variable causes the change in another variable?

Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Digital documents SkillSHEET Substitution into a linear rule (doc-5405)
SkillSHEET Solving linear equations that arise when finding x- and y-intercepts (doc-5406)
SkillSHEET Transposing linear equations to standard form (doc-5407)
SkillSHEET Measuring the rise and the run (doc-5408)
SkillSHEET Determining the gradient given two points (doc-5409)
SkillSHEET Graphing linear equations using the x- and y-intercept method (doc-5410)
SkillSHEET Determining independent and dependent variables (doc-5411)
SkillSHEET Determining the type of correlation (doc-5413)
Interactivity Individual pathway interactivity: Bivariate data (int-4626)



Exercise 13.2 Bivariate data
Individual pathways
PRACTISE CONSOLIDATE MASTER
1, 4, 5, 8, 11, 14 2, 6, 9, 12, 15 3, 7, 10, 13, 16

To answer questions online and to receive immediate corrective feedback and fully worked solutions for all
questions, go to your learnON title at www.jacplus.com.au.

Fluency
For questions 1 and 2, decide which of the variables is
independent and which is dependent.
1. a. Number of hours spent studying for a Mathematics test and
the score on that test.
b. Daily amount of rainfall (in mm) and daily attendance at the
Botanical Gardens.
c. Number of hours per week spent in a gym and the annual
number of visits to the doctor.
d. The amount of computer memory taken by an essay and the
length of the essay (in words).

2. a. The cost of care in a childcare centre and attendance at the childcare centre.
b. The cost of the property (real estate) and the age of the property.
c. The entry requirements for a certain tertiary course and the number of applications for that course.
d. The heart rate of a runner and the running speed.

3. WE1 The following table shows the cost of a wedding reception at 10 different venues. Represent the data
on a scatterplot.

No. of guests 30 40 50 60 70 80 90 100 110 120
Total cost (× $1000) 1.5 1.8 2.4 2.3 2.9 4 4.3 4.5 4.6 4.6

4. WE2 State the type of relationship between x and y for each of the following scatterplots.
[Scatterplots a to e]



5. State the type of relationship between x and y for each of the following scatterplots.
[Scatterplots a to e]

6. State the type of relationship between x and y for each of the following scatterplots.
[Scatterplots a to e]

Understanding
7. WE3 Eugene is selling leather bags at the local market. During the day he keeps records of his sales. The
table below shows the number of bags sold over one weekend and their corresponding prices (to the
nearest dollar).

Price ($) of a bag 30 35 40 45 50 55 60 65 70 75 80


Number of bags sold 10 12 8 6 4 3 4 2 2 1 1

a. Construct a scatterplot of the data.


b. State the type of correlation between the two variables and, hence, draw a corresponding conclusion.

8. The table below shows the number of bedrooms and the price of each of 30 houses.

Number of bedrooms, Price ($’000) | Number of bedrooms, Price ($’000) | Number of bedrooms, Price ($’000)
2 180 3 279 3 243
2 160 2 195 3 198
3 240 6 408 3 237
2 200 4 362 2 226
2 155 2 205 4 359
4 306 7 420 4 316
3 297 5 369 2 200
5 383 1 195 2 158
2 212 3 265 1 149
4 349 2 174 3 286



a. Construct a scatterplot of the data.
b. State the type of correlation between the number of bedrooms and the price of the house and, hence, draw
a corresponding conclusion.
c. Suggest other factors that could contribute to the price of the house.

9. The table below shows the number of questions solved by each student on a test, and the corresponding total
score on that test.
Number of questions 2 0 7 10 5 2 6 3 9 4 8 3 6
Total score (%) 22 39 69 100 56 18 60 36 87 45 84 32 63

a. Construct a scatterplot of the data.


b. Suggest the type of correlation shown in the scatterplot.
c. Give a possible explanation as to why the scatterplot is not perfectly linear.

10. A sample of 25 drivers who had obtained a full licence within the last month
was asked to recall the approximate number of driving lessons they had taken
(to the nearest 5), and the number of accidents they had while being on P plates.
The results are summarised in the table that follows.
a. Represent these data on a scatterplot.
b. Specify the relationship suggested by the scatterplot.
c. Suggest some reasons why this scatterplot is not perfectly linear.

Number of lessons 5 20 15 25 10 35 5 15 10 20 40 25 10
Number of accidents 6 2 3 3 4 0 5 1 3 1 2 2 5
Number of lessons 5 20 40 25 30 15 35 5 30 15 20 10
Number of accidents 5 3 0 4 1 4 1 4 0 2 3 4

Reasoning
11. MC The scatterplot that best represents the relationship between the amount of water consumed daily by a
certain household for a number of days in summer and the daily temperature is:
[Five scatterplots, A to E, each with Water usage (L) and Temperature (°C) as the axis labels]



12. MC The scatterplot shows the number of sides and the sum of interior angles for a number of polygons.

[Scatterplot: Sum of angles (°) (y-axis, 200 to 1300) against Number of sides (x-axis, 3 to 10)]

Select the statement that is NOT true of the following statements.
A. The correlation between the number of sides and the angle sum of the polygon is perfectly linear.
B. The increase in the number of sides causes the increase in the size of the angle sum.
C. The number of sides depends on the sum of the angles.
D. The correlation between the two variables is positive.
E. There is a strong correlation between the two variables.

13. MC After studying a scatterplot, it was concluded that there was evidence that the greater the level
of one variable, the smaller the level of the other variable. The scatterplot must have shown a:
A. strong, positive correlation
B. strong, negative correlation
C. moderate, positive correlation
D. moderate, negative correlation
E. weak, negative correlation

Problem solving
14. The table below gives the number of kicks and handballs obtained by the top 8 players in an AFL game.

Player A B C D E F G H
Number of kicks 20 27 21 19 17 18 21 22
Number of handballs 11 3 11 6 5 1 9 7

a. Represent this information on a scatterplot by using the x-axis as the number of kicks and the
y-axis as the number of handballs.
b. State whether the scatterplot supports the claim that the more kicks a player obtains, the more
handballs they give.

15. Each point on the scatterplot shows the time (in weeks) spent by a person on a healthy diet and the
corresponding mass lost (in kg).

[Scatterplot: Loss in mass (kg) (y-axis) against Number of weeks (x-axis)]

Study the scatterplot and state whether each of the following statements is true or false.
a. The number of weeks that the person stays on a diet is the independent variable.
b. The y-coordinates of the points represent the time spent by a person on a diet.
c. There is evidence to suggest that the longer the person stays on a diet, the greater the loss in mass.
d. The time spent on a diet is the only factor that contributes to the loss in mass.
e. The correlation between the number of weeks on a diet and the number of kilograms lost is positive.



16. The scatterplot shown gives the marks obtained by students in two mathematics tests. Mardi’s score in
the tests is represented by M. Determine which point represents each of the following students.

[Scatterplot: Test 2 (y-axis) against Test 1 (x-axis), showing the point M and the points labelled (i) to (viii)]

a. Mandy, who got the highest mark in both tests.
b. William, who got the top mark in test 1 but not in test 2.
c. Charlotte, who did better on test 1 than Mardi but not as well in test 2.
d. Dario, who did not do as well as Charlotte in both tests.
e. Edward, who got the same mark as Mardi in test 2 but did not do so well in test 1.
f. Cindy, who got the same mark as Mardi for test 1 but did better than her for test 2.
g. Georgina, who was the lowest in test 1.
h. Harrison, who had the greatest discrepancy between his two marks.

13.3 Lines of best fit by eye


LEARNING INTENTION
At the end of this subtopic you should be able to:
• draw a line of best fit by eye
• determine the equation of the line of best fit
• use the line of best fit to make interpolation or extrapolation predictions.

13.3.1 Lines of best fit by eye


eles-4968
• A line of best fit is a line that follows the trend of the data in a scatter plot.
• A line of best fit is most appropriate for data with strong or moderate linear correlation.
• Drawing lines of best fit by eye is done by placing a line that:
• represents the data trend
• has an equal number of points above and below the line.

[Scatter plot with a line of best fit: Average student marks (%) (y-axis, 0 to 100) against Time spent on phone per day (hours) (x-axis, 0 to 4)]

Determine the equation of a line of best fit by eye

To determine the equation of the line of best fit, follow these steps:
1. Choose two points on the line.
2. Write the points in the coordinate form $(x_1, y_1)$ and $(x_2, y_2)$.
Note: It is best to use two data points on the line if possible.
3. Calculate the gradient using $m = \dfrac{y_2 - y_1}{x_2 - x_1}$.
4. Write the equation in the form $y = mx + c$ using the m found in step 3.
5. Substitute one coordinate into the equation and rearrange to find c.
6. Write the final equation, replacing x and y if needed.
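
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `line_through` and the two sample points are hypothetical, and exact fractions are used so that the gradient is not rounded.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (m, c) for the line y = m*x + c through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no gradient")
    m = Fraction(y2 - y1, x2 - x1)   # step 3: gradient from rise over run
    c = y1 - m * x1                  # step 5: substitute (x1, y1) and rearrange for c
    return m, c

# Hypothetical points, chosen only for illustration.
m, c = line_through((2, 5), (6, 13))
print(f"y = {m}x + {c}")
```

Keeping m and c as fractions mirrors the way the equation is written by hand, where the gradient is left as an exact fraction rather than a decimal.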



WORKED EXAMPLE 4 Determining the equation for a line of best fit

The data in the table shows the cost of using the internet at a number of different internet cafes based
on hours used per month.

Hours used per month 10 12 20 18 10 13 15 17 14 11


Total monthly cost ($) 15 18 30 32 18 20 22 23 22 18

a. Construct a scatterplot of the data.


b. Draw the line of best fit by eye.
c. Determine the equation of the line of best fit in terms of the variables n (number of hours) and
C (monthly cost).

THINK: a. Draw the scatterplot placing the independent variable (hours used per month) on the horizontal axis and the dependent variable (total monthly cost) on the vertical axis. Label the axes.
WRITE/DRAW: a. [Scatterplot: Total monthly cost ($) (y-axis, 14 to 32) against Hours used per month (x-axis, 10 to 20)]

THINK: b. 1. Carefully analyse the scatterplot.
2. Position the line of best fit so there is approximately an equal number of data points on either side of the line and so that all points are close to the line.
Note: With the line of best fit, there is no single definite solution.
WRITE/DRAW: b. [The same scatterplot with a line of best fit drawn through (13, 20) and (20, 30)]

THINK: c. 1. Select two points on the line that are not too close to each other.
WRITE: c. Let $(x_1, y_1) = (13, 20)$ and $(x_2, y_2) = (20, 30)$.

THINK: 2. Calculate the gradient of the line.
WRITE: $m = \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{30 - 20}{20 - 13} = \dfrac{10}{7}$

THINK: 3. Write the rule for the equation of a straight line.
WRITE: $y = mx + c$

THINK: 4. Substitute the known values into the equation.
WRITE: $y = \dfrac{10}{7}x + c$

THINK: 5. Substitute one pair of coordinates, say (13, 20), into the equation to evaluate c.
WRITE: $20 = \dfrac{10}{7}(13) + c$
$c = 20 - \dfrac{130}{7} = \dfrac{140 - 130}{7} = \dfrac{10}{7}$

THINK: 6. Write the equation.
Note: The values of c and m are the same in this example. This is not always the case.
WRITE: $y = \dfrac{10}{7}x + \dfrac{10}{7}$

THINK: 7. Replace x with n (number of hours used) and y with C (the total monthly cost) as required.
WRITE: $C = \dfrac{10}{7}n + \dfrac{10}{7}$
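
Part b asks for approximately the same number of data points on either side of the line. As a rough check, and not part of the worked example itself, the Python sketch below counts how many of the ten data points lie above, below and on the line C = (10/7)n + 10/7. The variable names are illustrative assumptions.

```python
from fractions import Fraction

# Data from the table in Worked example 4.
hours = [10, 12, 20, 18, 10, 13, 15, 17, 14, 11]
cost = [15, 18, 30, 32, 18, 20, 22, 23, 22, 18]

# Gradient and intercept of the line of best fit found in part c.
m = Fraction(10, 7)
c = Fraction(10, 7)

above = below = on_line = 0
for n, actual in zip(hours, cost):
    predicted = m * n + c          # value of C on the line for this n
    if actual > predicted:
        above += 1
    elif actual < predicted:
        below += 1
    else:
        on_line += 1

print(f"above: {above}, below: {below}, on the line: {on_line}")
```

For this line the counts are four above, four below and two on the line, which is consistent with the 'equal number of points on either side' guideline. A different line drawn by eye could balance the points equally well.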

13.3.2 Predictions using lines of best fit


eles-4969
• Predictions can be made by using the line of best fit.
• To make a prediction, use one coordinate then determine the other coordinate by using:
• the line of best fit
• the equation of the line of best fit.

• Predictions will be made using:
• interpolation if the prediction sits within the given data
• extrapolation if the prediction sits outside the given data.
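
The distinction between the two kinds of prediction can be expressed as a simple range check. The Python sketch below is a minimal illustration: the function name `prediction_type` and the sample values are hypothetical, and 'within the given data' is taken to mean between the smallest and largest observed values of the independent variable.

```python
def prediction_type(x, observed_x):
    """Classify a prediction at x as interpolation or extrapolation."""
    if min(observed_x) <= x <= max(observed_x):
        return "interpolation"
    return "extrapolation"

# Hypothetical observed values of the independent variable (hours on phone per day).
observed = [0.5, 1, 2, 3, 4]

print(prediction_type(2.5, observed))
print(prediction_type(5, observed))
```

Here a prediction at 2.5 hours falls inside the observed range, while a prediction at 5 hours falls outside it.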

Interpolation vs extrapolation

[Figure: scatter plot with a line of best fit, Average student marks (%) (y-axis, 0 to 100) against Time spent on phone per day (hours) (x-axis, 0 to 5). Predictions made within the data use interpolation. Predictions made outside the data use extrapolation.]

• Predictions will be reliable if they are made:
• using interpolation
• from data with a strong correlation
• from a large number of data.



WORKED EXAMPLE 5 Making predictions using the line of best fit

Use the given scatterplot and line of best fit to predict:
a. the value of y when x = 10
b. the value of x when y = 10.

[Scatterplot with line of best fit; x-axis 0 to 40, y-axis 0 to 45]

THINK: a. 1. Locate 10 on the x-axis and draw a vertical line until it meets with the line of best fit. From that point, draw a horizontal line to the y-axis. Read the value of y indicated by the horizontal line.
2. Write the answer.
WRITE/DRAW: a. When x = 10, y is predicted to be 35.

THINK: b. 1. Locate 10 on the y-axis and draw a horizontal line until it meets with the line of best fit. From that point draw a vertical line to the x-axis. Read the value of x indicated by the vertical line.
2. Write the answer.
WRITE/DRAW: b. When y = 10, x is predicted to be 27.



WORKED EXAMPLE 6 Interpreting meaning and making predictions

The table below shows the number of boxes of tissues purchased by hay fever sufferers and the
number of days affected by hay fever during the blooming season in spring.

Number of days affected by hay fever (d) 3 12 14 7 9 5 6 4 10 8
Total number of boxes of tissues purchased (T) 1 4 5 2 3 2 2 2 3 3

a. Construct a scatterplot of the data and draw a line of best fit.


b. Determine the equation of the line of best fit.
c. Interpret the meaning of the gradient.
d. Use the equation of the line of best fit to predict the number of boxes of tissues purchased by people
suffering from hay fever over a period of:
i. 11 days ii. 15 days.

THINK WRITE/DRAW
a. 1. Draw the scatterplot showing the a. T

Total no. boxes of tissues purchased


independent variable (number of days
affected by hay fever) on the horizontal axis 5
and the dependent variable (total number of
4
boxes of tissues purchased) on the
vertical axis.
3

0 3 4 5 6 7 8 9 10 11 12 13 14 d
No. days affected by hay fever

2. Position the line of best fit on the scatterplot T


Total no. boxes of tissues purchased

so there is approximately an equal number


of data points on either side of the line. 5
(14,5)
4

1
(3,1)

0 3 4 5 6 7 8 9 10 11 12 13 14 d
No. days affected by hay fever

b. 1. Select two points on the line that are not too close to each other.
Let (x1, y1) = (3, 1) and (x2, y2) = (14, 5).
2. Calculate the gradient of the line.
m = (y2 − y1)/(x2 − x1)
= (5 − 1)/(14 − 3)
= 4/11
3. Write the rule for the equation of a straight line.
y = mx + c
y = (4/11)x + c
4. Substitute the known values, say (3, 1), into the equation to calculate c.
1 = (4/11)(3) + c
c = 1 − 12/11
= −1/11
5. Write the equation.
y = (4/11)x − 1/11
6. Replace x with d (number of days with hay fever) and y with T (total number of boxes of tissues used) as required.
T = (4/11)d − 1/11
c. Interpret the meaning of the gradient of the line of best fit.
The gradient indicates an increase in sales of tissues as the number of days affected by hay fever increases. A hay fever sufferer is using on average 4/11 (or about 0.36) of a box of tissues per day.

d. i. 1. Substitute the value d = 11 into the equation and evaluate.
When d = 11,
T = (4/11) × 11 − 1/11
= 4 − 1/11
= 3 10/11
2. Interpret and write the answer.
In 11 days the hay fever sufferer will need 4 boxes of tissues.
ii. 1. Substitute the value d = 15 into the equation and evaluate.
When d = 15,
T = (4/11) × 15 − 1/11
= 60/11 − 1/11
= 5 4/11
2. Interpret and write the answer.
In 15 days the hay fever sufferer will need about 6 boxes of tissues.
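
The arithmetic in parts b and d of Worked example 6 can be checked with a short script. This is only a sketch: the function names line_through and predict_boxes are illustrative and do not come from the text, and exact fractions are used so the results match the working above.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (gradient, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = Fraction(y2 - y1, x2 - x1)  # gradient = rise / run
    c = y1 - m * x1                 # rearranged from y = mx + c at (x1, y1)
    return m, c


# The two points chosen on the line of best fit in Worked example 6.
m, c = line_through((3, 1), (14, 5))


def predict_boxes(d):
    """Predicted total boxes of tissues T after d days of hay fever."""
    return m * d + c


print(m, c)               # 4/11 -1/11
print(predict_boxes(11))  # 43/11, that is 3 10/11
print(predict_boxes(15))  # 59/11, that is 5 4/11
```

Since tissues are bought in whole boxes, both predictions are rounded up when the answer is interpreted (4 boxes and about 6 boxes).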

DISCUSSION
Why is extrapolation not considered to be reliable?



Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Interactivities Individual pathway interactivity: Lines of best fit (int-4627)
Lines of best fit (int-6180)
Interpolation and extrapolation (int-6181)

Exercise 13.3 Lines of best fit by eye


Individual pathways
PRACTISE CONSOLIDATE MASTER
1, 2, 5, 8, 11 3, 6, 9, 12 4, 7, 10, 13

To answer questions online and to receive immediate corrective feedback and fully worked solutions for all
questions, go to your learnON title at www.jacplus.com.au.

Fluency
1. WE4 The data in the table shows the distances travelled by 10 cars
and the amount of petrol used for their journeys (to the nearest
litre).

Distance travelled (km), d 52 36 83 12 44 67 74 23 56 95
Petrol used (L), P 7 5 9 2 7 9 12 3 8 14

a. Construct a scatterplot of the data and draw the line of best fit.
b. Determine the equation of the line of best fit in terms of the
variables d (distance travelled) and P (petrol used).

2. WE5 Use the given scatterplot and line of best fit to predict:

[Graph: scatterplot with line of best fit; x-axis from 0 to 90, y-axis from 0 to 70]

a. the value of y when x = 45


b. the value of x when y = 15.



3. Analyse the following graph.
[Graph: scatterplot with line of best fit; x-axis from 0 to 45, y-axis from 0 to 600]
a. Use the line of best fit to estimate the value of y when the value of x is:
i. 7 ii. 22 iii. 36.
b. Use the line of best fit to estimate the value of x when the value of y is:
i. 120 ii. 260 iii. 480.
c. Determine the equation of the line of best fit, if it is known that it passes through the points (5, 530)
and (40, 75).
d. Use the equation of the line to verify the values obtained from the graph in parts a and b.

4. A sample of ten Year 10 students who have part-time jobs was randomly selected. Each student was asked
to state the average number of hours they work per week and their average weekly earnings (to the nearest
dollar). The results are summarised in the table below.

Hours worked, h 4 8 15 18 10 5 12 16 14 6
Weekly earnings ($), E 23 47 93 122 56 33 74 110 78 35

a. Construct a scatterplot of the data and draw the line of best fit.
b. Write the equation of the line of best fit, in terms of variables h (hours worked) and E (weekly earnings).
c. Interpret the meaning of the gradient.

Understanding
5. WE6 The following table shows the average weekly expenditure on food for households of various sizes.

Number of people in a household 1 2 4 7 5 4 3 5
Cost of food ($ per week) 70 100 150 165 150 140 120 155
Number of people in a household 2 4 6 5 3 1 4
Cost of food ($ per week) 90 160 160 160 125 75 135

a. Construct a scatterplot of the data and draw in the line of best fit.
b. Determine the equation of the line of best fit. Write it in terms of
variables n (for the number of people in a household) and C (weekly
cost of food).
c. Interpret the meaning of the gradient.
d. Use the equation of the line of best fit to predict the weekly food
expenditure for a family of:
i. 8 ii. 9 iii. 10.

6. The number of hours spent studying, and the percentage marks obtained by a group of students on a test are
shown in this table.
Hours spent studying 45 30 90 60 105 65 90 80 55 75
Marks obtained 40 35 75 65 90 50 90 80 45 65

a. State the values for marks obtained that can be used for interpolation.
b. State the values for hours spent studying that can be used for interpolation.



7. The following table shows the gestation time and the birth mass of 10 babies.

Gestation time (weeks) 31 32 33 34 35 36 37 38 39 40
Birth mass (kg) 1.080 1.470 1.820 2.060 2.230 2.540 2.750 3.110 3.080 3.370

a. Construct a scatterplot of the data. Suggest the type of correlation shown by the scatterplot.
b. Draw in the line of best fit and determine its equation. Write it in terms of the variables t (gestation time)
and M (birth mass).
c. Determine what the value of the gradient represents.
d. Although full term of gestation is considered to be 40 weeks, some pregnancies last longer. Use the
equation obtained in part b to predict the birth mass of babies born after 41 and 42 weeks of gestation.
e. Many babies are born prematurely. Using the equation obtained in part b, predict the birth mass of a baby
whose gestation time was 30 weeks.
f. Calculate their gestation time (to the nearest week), if the birth mass of the baby was 2.390 kg.

Reasoning
8. MC Consider the scatterplot shown.
[Graph: scatterplot with line of best fit; x-axis from 0 to 70]
The line of best fit on the scatterplot is used to predict the values of y when x = 15, x = 40 and x = 60.
a. Interpolation would be used to predict the value of y when the value of x is:
A. 15 and 40 B. 15 and 60 C. 15 only D. 40 only E. 60 only
b. The prediction of the y-value(s) can be considered reliable when:
A. x = 15 and x = 40 B. x = 15, x = 40 and x = 60 C. x = 40 D. x = 40 and x = 60 E. x = 60

9. MC The scatterplot below is used to predict the value of y when x = 300. This prediction is:
[Graph: scatterplot; x-axis from 0 to 700, y-axis from 0 to 500]

A. reliable, because it is obtained using interpolation


B. not reliable, because it is obtained using extrapolation
C. not reliable, because only x-values can be predicted with confidence
D. reliable because the scatterplot contains a large number of points
E. not reliable, because there is no correlation between x and y



10. As a part of her project, Rachel is growing a crystal. Every day she measures the crystal’s mass using special
laboratory scales and records it. The table below shows the results of her experiment.

Day number 1 2 3 4 5 8 9 10 11 12 15 16
Mass (g) 2.5 3.7 4.2 5.0 6.1 8.4 9.9 11.2 11.6 12.8 16.1 17.3

Measurements on days 6, 7, 13 and 14 are missing, since these were 2 consecutive weekends and, hence,
Rachel did not have a chance to measure her crystal, which is kept in the school laboratory.
a. Construct the scatterplot of the data and draw in the line of best fit.
b. Determine the equation of the line of best fit. Write the equation,
using variables d (day of the experiment) and M (mass of the crystal).
c. Interpret the meaning of the gradient.
d. For her report, Rachel would like to fill in the missing measurements
(that is, the mass of the crystal on days 6, 7, 13 and 14). Use the
equation of the line of best fit to help Rachel determine these
measurements. Explain whether this is an example of interpolation
or extrapolation.
e. Rachel needed to continue her experiment for 2 more days, but she
fell ill and had to miss school. Help Rachel to predict the mass of the
crystal on those two days (that is, days 17 and 18), using the equation
of the line of best fit. Explain whether these predictions are reliable.

Problem solving
11. Ari was given a baby rabbit for his birthday. To monitor the rabbit’s growth, Ari decided to measure it once
a week.

The table below shows the length of the rabbit for various weeks.

Week number, n 1 2 3 4 6 8 10 13 14 17 20
Length (cm), l 20 21 23 24 25 30 32 35 36 37 39

a. Construct a scatterplot of the data.


b. Draw a line of best fit and determine its equation.
c. As can be seen from the table, Ari did not measure his rabbit on weeks 5, 7, 9, 11, 12, 15, 16, 18 and 19.
Use the equation of the line of best fit to predict the length of the rabbit for those weeks.
d. Explain whether the predictions made in part c were an example of interpolation or extrapolation.
e. Predict the length of the rabbit in the next three weeks (that is, weeks 21–23), using the line of best fit
from part c.
f. Explain whether the predictions that have been made in part e are reliable.



12. Laurie is training for the long jump, hoping to make the Australian Olympic team. His best jump each year
is shown in the table below.
Age (a) Best jump (B) (metres)
8 4.31
9 4.85
10 5.29
11 5.74
12 6.05
13 6.21
14 —
15 6.88
16 7.24
17 7.35
18 7.57

a. Plot the points generated by the table on a scatterplot.
b. Join the points generated with straight line segments.
c. Draw a line of best fit and determine its equation.
d. The next Olympic Games will occur when Laurie is 20 years old. Use the equation of the line of best fit to estimate Laurie’s best jump that year and whether it will pass the qualifying mark of 8.1 metres.
e. Explain whether a line of best fit is a good way to predict future improvement in this situation. State the possible problems with using a line of best fit.
f. The Olympic Games will also be held when Laurie is 24 years old and 28 years old. Using extrapolation, what length would you predict Laurie could jump at these two ages? Discuss whether this is realistic.
g. When Laurie was 14, he twisted a knee in training and did not compete for the whole season. In that year,
a national junior championship was held. The winner of that championship jumped 6.5 metres. Use your
line of best fit to predict whether Laurie would have won that championship.
13. Sam has a mean score of 88 per cent for his first nine tests of the semester. In order to receive an A+ his score must be 90 per cent or higher. There is one test remaining for the semester. Explain whether it is possible for him to receive an A+.



13.4 Linear regression using technology (10A)
LEARNING INTENTION
At the end of this subtopic you should be able to:
• display a scatter plot using technology
• determine the equation of the regression line using technology
• display a scatter plot with its regression line using technology
• use a regression line to make predictions.

13.4.1 Scatter plots using technology


eles-4970
• Scatter plots can be displayed using graphics calculators and spreadsheets.
• To display a scatter plot using technology:
• first input the data with the independent variable in the first column and the dependent variable in the
second column
• then use the technology to draw the plot.

WORKED EXAMPLE 7 Displaying a scatter plot using technology

The following data shows the amount of time (hours) and the amount of distance walked (km) on a
bushwalk. Display the data on a scatter plot using technology.

Time, hours (x) 1 2 3 4
Distance, km (y) 3.11 4.73 6.08 7.54

THINK WRITE
1. Determine which data will go in which column by identifying the independent and dependent variable.
Time is the independent variable – it will go in the first column.
Distance is the dependent variable – it will go in the second column.
2. Input the data into the spreadsheet or calculator.
3. Use the spreadsheet or calculator to create a scatter plot. Type your data into the spreadsheet and highlight all your data (including headings).
For Google Sheets: Go to Insert and select Chart.
For Excel: Go to Insert, select the scatter plot icon and choose the Scatter option.



TI | THINK
1. In a new problem, on a Lists & Spreadsheet page, label column A as ‘time’ and B as ‘distance’. Enter the data from the question.
2. Add a page by pressing CTRL then DOC and select:
• 2: Add Graphs
In the graphs page press:
• MENU
• 3: Graph Entry/Edit
• 6: Scatter Plot
On the first line (next to x ←) press VAR and select ‘time’ and on the second line (next to y ←) press VAR and select ‘distance’ then press ENTER.
To adjust the screen press:
• MENU
• 4: Window / Zoom
• 9: Zoom – Data

CASIO | THINK
1. From the menu, select Spreadsheet. Enter the data from the question in the columns A and B.
2. Highlight both columns A and B and tap:
• Graph
• Scatter

13.4.2 Regression lines using technology


eles-4971
• Regression lines are another name for lines of best fit.
• Using technology it is possible to calculate and sketch a regression line with more accuracy compared to
using a line of best fit by eye.
• For regression lines using technology it is possible to:
• draw the regression line
• determine the equation of the line.
• Note that regression lines are only valid if the independent and dependent variables have a connection.

WORKED EXAMPLE 8 Displaying a regression line using technology

The following data shows the amount of time (hours) and the amount of distance walked (km) on a
bushwalk. Determine the equation of the regression line and display the regression line using
a spreadsheet.

Time, hours (x) 1 2 3 4
Distance, km (y) 3.11 4.73 6.08 7.54



THINK WRITE
1. Use the scatter plot from Worked example 7. Start by sketching the regression line using the spreadsheet option called trendline.
2. On the spreadsheet display the equation by using the spreadsheet option to show equations on the graph.
The equation is: y = 1.46x + 1.71

TI | THINK
1. Start with the scatter plot from Worked example 7. To determine the equation of the regression line, return to the spreadsheet page and click into the third column. Press:
• MENU
• 4: Statistics
• 1: Stat Calculations
• 3: Linear Regression (mx + b)
In X List select ‘time’ and in Y List select ‘distance’, then click OK.
2. To plot the regression line, return to the graph page 1.2 and press:
• MENU
• 3: Graph Entry/Edit
• 1: Function
Press the up arrow to see ‘f1(x) = 1.464x + 1.705’. Then press ENTER.

CASIO | THINK
Start with the scatter plot from Worked example 7. To determine the equation of the regression line, tap:
• Calc
• Regression
• Linear Reg
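
For readers curious about what the technology is doing, the regression equation in Worked example 8 can be reproduced with a few lines of Python. This is a minimal sketch that assumes the spreadsheet trendline and the calculators' Linear Regression option both use the standard least-squares method; the text does not state this. The function name least_squares_line is illustrative only.

```python
def least_squares_line(xs, ys):
    """Return (gradient, y_intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    # The regression line always passes through (mean_x, mean_y).
    return gradient, mean_y - gradient * mean_x


# Bushwalk data from Worked example 8.
time_hours = [1, 2, 3, 4]
distance_km = [3.11, 4.73, 6.08, 7.54]

m, b = least_squares_line(time_hours, distance_km)
print(f"y = {m:.3f}x + {b:.3f}")  # y = 1.464x + 1.705
```

Rounded to two decimal places, this gives the y = 1.46x + 1.71 shown by the spreadsheet.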

13.4.3 Using regression lines to make predictions


eles-4972
• The regression line can be used to make predictions by using technology.
• Predictions will be reliable if they are made using interpolation, from data with a strong correlation and from a large number of data points (see section 13.3.2).



Digital technology

1. Predictions from regression lines using spreadsheets
To predict the y-value:
Use the function FORECAST.
To use FORECAST type the following into any cell:
= FORECAST(x-value, y-data, x-data)
For example, for the x-value 3.5 type:
= FORECAST(3.5, B2:B6, A2:A6)
and find the y-value is 2.59.

2. Predictions from regression lines using CAS
Start with your scatter plot with a regression line from section 13.4.2.
To predict the y-value:
Press menu then 3: Graph Entry/Edit and choose 6: Scatterplot and press up to see s1. In s1 untick the blue box. Press menu then 5: Trace then select 1: Graph Trace. Then type the x-value and press enter. The coordinate will appear.
For the example, when x is 3.5, y is 2.59.

WORKED EXAMPLE 9 Using regression line to make predictions

For the following data:

Time, hours (x) 1 2 3 4
Distance, km (y) 3.11 4.73 6.08 7.54

a. use technology to predict the distance after 2.2 hours
b. use technology to predict the distance after 6.4 hours
c. explain whether these predictions are reliable.

THINK WRITE
a. 1. Use the spreadsheet or CAS pages set up in Worked example 8.
See Worked example 8.
2. We need to calculate the distance value, which is the y-value.
For a spreadsheet use the function FORECAST.
Into a spreadsheet type:
= FORECAST(2.2, B2:B5, A2:A5)
Note the order B2:B5 first and A2:A5 second.
Distance = 4.93 km
OR
For a CAS start on the graph page and use the Trace tool.
Using a CAS select Trace, type the value 2.2 and press enter.
Note: the scatter plot must be turned off.
Distance = 4.93 km
b. 1. We need to calculate the distance value, which is the y-value.
For a spreadsheet use the function FORECAST.
Into a spreadsheet type:
= FORECAST(6.4, B2:B5, A2:A5)
Note the order B2:B5 first and A2:A5 second.
Distance = 11.07 km
OR
For a CAS start on the graph page and use the Trace tool.
Using a CAS select Trace, type the value 6.4 and press enter.
Note: the scatter plot must be turned off.
Distance = 11.07 km
c. Predictions that use interpolation are reliable and predictions that use extrapolation are not reliable.
Answer a is reliable because it uses interpolation.
Answer b is not reliable because it uses extrapolation.
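
The reasoning in Worked example 9 can also be sketched in Python. The example below takes the gradient and y-intercept found in Worked example 8 as given, and labels a prediction as interpolation when the x-value lies within the range of the recorded data. The function name predict_with_reliability is hypothetical and is used here for illustration only.

```python
def predict_with_reliability(x_new, xs, gradient, intercept):
    """Predict y from a regression line and label the type of prediction."""
    y_pred = gradient * x_new + intercept
    # Interpolation: x_new lies within the range of the recorded x-values.
    # Anything outside that range is extrapolation.
    kind = "interpolation" if min(xs) <= x_new <= max(xs) else "extrapolation"
    return round(y_pred, 2), kind


time_hours = [1, 2, 3, 4]

# Gradient and y-intercept taken from Worked example 8: f1(x) = 1.464x + 1.705.
for hours in (2.2, 6.4):
    print(hours, predict_with_reliability(hours, time_hours, 1.464, 1.705))
# 2.2 (4.93, 'interpolation')
# 6.4 (11.07, 'extrapolation')
```

The check only looks at the range of x-values. As noted in section 13.4.3, reliability also depends on the strength of the correlation and the amount of data.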

Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)

Exercise 13.4 Linear regression using technology (10A)


Individual pathways
PRACTISE CONSOLIDATE MASTER
1, 2, 3, 6, 11, 14 4, 7, 9, 12, 15 5, 8, 10, 13, 16

To answer questions online and to receive immediate corrective feedback and fully worked solutions for all
questions, go to your learnON title at www.jacplus.com.au.

Fluency
1. WE 7 The following data shows the amount of time athletes spent training in preparation for a marathon and
their finishing position in the race. Display the data on a scatter plot using technology.

Time, hours (x) 25 30 35
Finishing position (y) 15 11 8

2. WE 8 The following data shows the number of visitors to a store in a day and the profit of the store that day.
Determine the equation of the regression line and display the regression line using either a spreadsheet or a
CAS calculator.
Number of visitors (x) 80 85 94 101
Profit, dollars (y) 152 164 180 200

3. WE 9 Use the data from question 2 to answer the following questions.


a. Use technology to predict the profit if there were 90 visitors.
b. Use technology to predict the profit if there were 200 visitors. Round your answer to the nearest dollar.
c. Explain if these predictions are reliable.

4. The following data shows how far away students live from school in kilometres and the hours those students
spend in a car per week.

Distance from school, kilometres (x) 2 2.5 3 5
Hours spent in a car each week (y) 2.2 2.8 2.9 3.4

a. Use technology to display the data on a scatter plot.


b. Use technology to determine the equation of the regression line and display the regression line.
c. Use technology to predict the hours spent in a car each week for a student that lives 6 km from school.
Give your answer to one decimal place.



d. Use technology to predict the hours spent in a car each week for a student that lives 2.2 km from school.
Give your answer to one decimal place.
e. Explain if these predictions are reliable.

5. The following data shows how many thousands of bees are in a hive and the amount of honey produced in
that hive per year:

Number of bees (× 1000) (x) 11 16 24 31
Honey produced, kilograms per year (y) 15 18 22 24

a. Use technology to display the data on a scatter plot.


b. Use technology to determine the equation of the regression line and
display the regression line.
c. Use technology to predict the honey produced per year if a hive had
35 000 bees. Round the prediction to the nearest whole number.
d. Use technology to predict the honey produced per year if a hive had
18 000 bees. Round the prediction to the nearest whole number.
e. Explain if these predictions are reliable.

Understanding
6. The following data shows the temperature on certain days of the year. The days have been numbered like
this: 1 January is 1, 2 January is 2 and so on for 365 days. Assume it is a non-leap year.

Day of the year (x) 1 5 12 20
Maximum temperature, °C (y) 32 33 38 42

a. Use technology to determine the equation of the regression line.


b. Use technology to predict the temperature on 15 March (day 75 of the year). Give your answer to one
decimal place.
c. Explain if your answer to part b makes sense. Within your answer use the word extrapolation.

7. Chantal is a big fan of the Dugongs baseball team. The following data shows the number of games Chantal
watched per year and the games won by the Dugongs per year.

Number of games watched per year (x) 8 12 16 20
Number of games won by the Dugongs (y) 10 11 15 16

a. Use technology to determine the equation of the regression line.


b. Use technology to predict the number of games won if Chantal watches 15 games. Give your answer to
the nearest whole number.
c. Explain if your answer to part b makes sense. Explain your answer in terms of whether a fan watching a
sports game has a connection to the outcome.



8. Below is data from four lawn mowing companies showing yard size and the cost of their most recent lawn mowing jobs:
Fred’s mowing: 200 m² yard for $80
Dial-a-gardner: 150 m² yard for $75
Chopper chops limited: 50 m² yard for $60
Landscapes-r-us: 400 m² yard for $120
a. Organise the data into a table and assign the x and y values.
b. Explain how you have assigned the x and y values.
c. Use technology to determine the equation of the regression line for this data.
d. Based on your equation for the regression line from part c, estimate the
call-out fee for lawn mowing.

9. MC Sally Miles is a world-famous pop star. By analysing Sally’s tour data, the equation for a regression line is found that relates the number of fans at a concert (x) to Sally’s earnings (y). The regression line equation is y = 22x + 25144. In the equation of the regression line, the number 22 represents:

A. The amount Sally earns per fan at the concert.


B. How much Sally earns per year.
C. How much Sally would earn if she had 10 fans at her concert.
D. The number of songs in Sally’s playlist.
E. How much Sally earns per month.
10. MC Regression lines are only valid for data where the independent variable has a connection to the dependent variable. Select which of the following would NOT have a valid regression line.
A. Number of growing days and height of a sunflower.
B. Average top speed of cars and years since cars were invented.
C. Number of ice creams purchased and the price of ice cream.
D. Amount of cheese eaten per capita and the number of injuries in an AFL season.
E. Amount of cheese eaten per capita and the price of cheese.

Reasoning
11. Two friends, Yousef and Gavin, were having an eating competition. In the competition they both ate one,
then two, then three apples and recorded their time. These were the results:
Yousef
Number of apples eaten (x) 1 2 3
Time taken, seconds (y) 46 67 124

Gavin
Number of apples eaten (x) 1 2 3
Time taken, seconds (y) 38 75 112

a. Use technology to determine the equations of the regression lines for each set of data.
b. Identify the gradients for each set of data.
c. Compare the gradients. Explain what this information shows.
d. Explain who won the apple-eating contest using the data.
12. The following data shows the time it took for five people to complete one lap of a BMX track.

Age, years (x) 13 14 15 16 25
Time, minutes (y) 2.5 2.2 1.9 1.8 1.2



a. Determine the equation of the regression line for this data.
b. The outlier in this data is the point (25, 1.2). Remove this piece of data and find the equation of the new
regression line.
c. Explain the impact on the equation of the regression line after removing an outlier. In your answer refer
to the gradient and y-intercept.
13. The following data shows the time it took for packages of different weights to arrive in the post:

Weight, kilograms (x) 0.5 1.1 1.7 2.5 18
Time, days (y) 4 3 3 2 4

a. Determine the equation of the regression line for this data.


b. Use technology to draw the scatter plot for this data. Explain whether there is a correlation. Determine
the type, direction and strength of the correlation.
c. There is an outlier in this data. Remove the outlier and determine the new regression line.
d. Use technology to draw the scatter plot for the data with the outlier removed. Explain whether there is a
correlation; if so, describe the type, direction and strength.
e. Explain which regression line more accurately represents the data.

Problem solving
14. At the school athletics carnival Mr. Wall was in charge of recording the student year levels and jump heights
for the winning jumps.

Year level (x) 7 8 9 10 11 12


Winning jump height, cm (y) 110 122 126 130 139

Mr Wall knows the regression line for this data is y = 4.91x + 79.3. Calculate the missing jump height.

15. Use the following data to answer these questions:

x 1 2 3 4
y 3 5 9 11

a. Determine the equation of the regression line for this data.
b. Determine what happens to the equation of the regression line if you double all the y values.
c. Determine the number that could be added to each y-value in the original data so that the regression line becomes y = 2.8x + 5. Show your working.

16. Answer the following questions with full working.
a. Determine four data points that give a regression line equation of y = 8x + 3.
b. Now determine four different data points that still give a regression line equation of y = 8x + 3.



13.5 Time series
LEARNING INTENTION
At the end of this subtopic you should be able to:
• describe time series data using trends and patterns
• draw lines of best fit by eye and use them to make predictions for a time series.

13.5.1 Describing time series


eles-4973
• Time series are a type of bivariate data with time as the independent variable. In other words, time series
show time on the x-axis.
• To describe time series data use trends and patterns.
• Time series trends can be:
• increasing or decreasing
• linear or non-linear.
Increasing linear time series: An upwards slope from left to right with data that is approximately a straight line.
[Graph: Temp. (°C) against Time (hours)]
Decreasing non-linear time series: A downwards slope from left to right with data that is not a straight line.
[Graph: Pages unread against Time (days)]

• Time series patterns can be:


• seasonal
• cyclical
• random.
Seasonal: The pattern repeats over a period of time such as a day, week, month or year.
[Graph: Houses sold against Time (quarters, 2003 to 2005); cycle peaks every 12 months]
Cyclical: Rises and falls happen over different periods of time.
[Graph: Software products sold against Time (quarters, 2003 to 2005); no regular periods between peaks]
Random: No regular pattern and caused by unpredictable events.
[Graph: Profits against Time]

WORKED EXAMPLE 10 Classifying the time series trend

Classify the trend suggested by the time series graph shown as being linear or non-linear, and
upward, downward or no trend.
[Graph: time series of Data against Time]

THINK WRITE
Carefully analyse the given graph and The time series graph does not resemble a straight
comment on whether the graph resembles line and overall the level of the variable, y, decreases
a straight line or not and whether the over time. The time series graph suggests a non-linear
values of y increase or decrease over time. downward trend.

WORKED EXAMPLE 11 Commenting on the time series trend

The data below show the average daily mass of a person (to the nearest 100 g), recorded over a
28-day period.
63.6, 63.8, 63.5, 63.7, 63.2, 63.0, 62.8, 63.3, 63.1, 62.7, 62.6, 62.5, 62.9, 63.0,
63.1, 62.9, 62.6, 62.8, 63.0, 62.6, 62.5, 62.1, 61.8, 62.2, 62.0, 61.7, 61.5, 61.2
a. Plot these masses as a time series graph.
b. Comment on the trend.



THINK WRITE/DRAW
a. 1. Draw the points on a scatterplot with time on the horizontal axis and mass on the vertical axis.
2. Join the points with straight line segments to create a time series plot.
[Time series plot: Mass (kg) from 61.0 to 64.0 against Time (days) from 0 to 28]
b. Carefully analyse the given graph and comment on whether the graph resembles a straight line or not and whether the values of y (in this case, mass) increase or decrease over time.
The graph resembles a straight line that slopes downwards from left to right (that is, mass decreases with increase in time). Although a person’s mass fluctuates daily, the time series graph suggests a downward trend. That is, overall, the person’s mass has decreased over the 28-day period.
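
The worked example judges the trend by eye. As a rough numerical cross-check (a different technique from the one used above), the sign of the least-squares gradient can be computed for the same 28 values. The sketch assumes one reading per day, numbered 1 to 28, and the function name trend_gradient is illustrative.

```python
# Daily masses (kg) from Worked example 11.
masses_kg = [
    63.6, 63.8, 63.5, 63.7, 63.2, 63.0, 62.8, 63.3, 63.1, 62.7, 62.6, 62.5, 62.9, 63.0,
    63.1, 62.9, 62.6, 62.8, 63.0, 62.6, 62.5, 62.1, 61.8, 62.2, 62.0, 61.7, 61.5, 61.2,
]
# Assumption: one reading per day, with days numbered 1 to 28.
days = list(range(1, len(masses_kg) + 1))


def trend_gradient(xs, ys):
    """Return the gradient of the least-squares line through the points."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


gradient = trend_gradient(days, masses_kg)
# A negative gradient means the values fall over time; positive means they rise.
trend = "downward" if gradient < 0 else "upward" if gradient > 0 else "no trend"
print(trend)  # downward
```

The gradient only gives the overall direction. Whether the trend is linear or non-linear still has to be judged from the shape of the plot.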

13.5.2 Time series lines of best fit by eye


eles-4974
• It is possible to draw lines of best fit by eye for time series (see subtopic 13.3).
• Lines of best fit can be used to make predictions.
• For interpolations, the predictions are reliable.
• For extrapolations, the predictions are not very reliable since there is an assumption that the trend
will continue.

WORKED EXAMPLE 12 Making predictions using a line of best fit

The graph below shows the average cost of renting a one-bedroom flat, as recorded over a 10-year period.
[Graph: Cost of rent ($) from 140 to 300 against Time (years) from 0 to 15]
a. If appropriate, draw in a line of best fit and comment on the type of the trend.
b. Assuming that the current trend will continue, use the line of best fit to predict the cost of rent in 5 years’ time.



THINK WRITE/DRAW
a. 1. Analyse the graph and observe what occurs over a period of time. Draw a line of best fit.
[Graph: Cost of rent ($) against Time (years) with a line of best fit drawn]
2. Comment on the type of trend observed.
The graph illustrates that the cost of rent increases steadily over the years. The time series graph indicates an upward linear trend.
b. 1. Extend the line of best fit drawn in part a. The last entry corresponds to the 10th year and we need to predict the cost of rent in 5 years’ time; that is, in the 15th year.
2. Locate the 15th year on the time axis and draw a vertical line until it meets with the line of best fit. From the trend line (line of best fit) draw a horizontal line to the cost axis.
[Graph: Cost of rent ($) against Time (years) with the line of best fit extended to the 15th year]
3. Read the cost from the vertical axis.
Cost of rent = $260
4. Write the answer.
Assuming that the cost of rent will continue to increase at the present rate, in 5 years we can expect the cost of rent to reach $260 per week.

DISCUSSION
Why are predictions in the future appropriate for time series even though they involve extrapolation?

Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Video eLesson Fluctuations and cycles (eles-0181)
Interactivity Individual pathway interactivity: Time series (int-4628)



Exercise 13.5 Time series
Individual pathways
PRACTISE CONSOLIDATE MASTER
1, 4, 8, 11 2, 5, 7, 9, 12 3, 6, 10, 13

To answer questions online and to receive immediate corrective feedback and fully worked solutions for all
questions, go to your learnON title at www.jacplus.com.au.

Fluency
1. WE10 For questions 1 and 2, classify the trend suggested by each time series graph as being linear or non-linear, and upward, downward or stationary in the mean (no trend).
[Graphs a to d: Data against Time]

2. [Graphs a to d: Data against Time]

3. WE11 The data below show the average daily temperatures recorded in June.
17.6, 17.4, 18.0, 17.2, 17.5, 16.9, 16.3, 17.1, 16.9, 16.2, 16.0, 16.6, 16.1, 15.4, 15.1,
15.5, 16.0, 16.0, 15.4, 15.2, 15.0, 15.5, 15.1, 14.8, 15.3, 14.9, 14.6, 14.4, 15.0, 14.2
a. Plot these temperatures as a time series graph.
b. Comment on the trend.



Understanding
4. The data below show the quarterly sales (in thousands of dollars) recorded by the
owner of a sheepskin product store over a period of 4 years.

Quarter 2006 2007 2008 2009


1 57 59 50 52
2 100 102 98 100
3 125 127 120 124
4 74 70 72 73

a. Plot the time series.


b. The time series plot displays seasonal fluctuations of period 4 (since there are four quarters). Explain in
your own words what this means. Also write one or two possible reasons for the occurrence of
these fluctuations.
c. Determine whether the time series plot indicates an upward trend, a downward trend or no trend.
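The seasonal pattern in question 4 can also be checked numerically. The sketch below is one possible approach in Python and is not part of the original exercise: it averages the sales for each quarter across the four years (to show the pattern that repeats within each year) and totals the sales for each year (to look for a long-term trend). The names `sales`, `quarter_means` and `year_totals` are illustrative only.

```python
# Quarterly sales (in thousands of dollars) from the table in question 4.
# Each list holds the values for quarters 1 to 4 of that year.
sales = {
    2006: [57, 100, 125, 74],
    2007: [59, 102, 127, 70],
    2008: [50, 98, 120, 72],
    2009: [52, 100, 124, 73],
}

# Mean sales for each quarter across all years: large differences between
# these means point to a pattern that repeats every 4 quarters.
quarter_means = [
    sum(year[q] for year in sales.values()) / len(sales)
    for q in range(4)
]

# Total sales for each year: a steady rise or fall would suggest a trend.
year_totals = {year: sum(values) for year, values in sales.items()}

for q, mean in enumerate(quarter_means, start=1):
    print(f"Quarter {q}: mean sales = {mean}")
for year, total in year_totals.items():
    print(f"{year}: total sales = {total}")
```

If the quarter means differ widely while the yearly totals stay close to each other, the plot shows seasonal fluctuations without a clear long-term trend.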

5. The table below shows the total monthly revenue (in thousands of dollars) obtained by the owners of a large
reception hall. The revenue comes from rent and catering for various functions over a period of 3 years.

Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.
2007 60 65 40 45 40 50 45 50 55 50 55 70
2008 70 65 60 65 55 60 60 65 70 75 80 85
2009 80 70 65 70 60 65 70 75 80 85 90 100

a. Construct a time series plot for this data.


b. Describe the graph (peaks and troughs, long-term trend, any other patterns).
c. Suggest possible reasons for monthly fluctuations.
d. Explain if the graph shows seasonal fluctuations over 12 months. Discuss any patterns that repeat from
year to year.
6. The owner of a motel and caravan park in a small town keeps records of the total number of rooms and total
number of camp sites occupied per month. The time series plots based on his records are shown below.
[Graph: number of rooms/sites occupied (0 to 90) against month (Jan. to Dec.), with one time series for camp sites and one for motel rooms]

a. Describe each graph, discussing general trend, peaks and troughs and so on. Explain particular features
of the graphs and give possible reasons.
b. Compare the two graphs and write a short paragraph commenting on any similarities and differences
between them.



7. WE12 The graph shows enrolments in the Health and Nutrition course at a local college over a 10-year period.
[Graph: enrolment (0 to 120) against time (years), 0 to 10]
a. If appropriate, draw in a line of best fit and comment on the type of the trend.
b. Assuming that the trend will continue, use the line of best fit to predict the enrolment for the course in 5 years' time; that is, in the 15th year.

Reasoning
8. In June a new childcare centre was opened. The number of children attending full time (according to the
enrolment at the beginning of each month) during the first year of operation is shown in the table.

June July Aug. Sept. Oct. Nov. Dec. Jan. Feb. Mar. Apr. May
6 8 7 9 10 9 12 10 11 13 12 14

a. Plot this time series (Hint: Let June = 1, July = 2 etc.)


b. Justify if the childcare business is going well.
c. Draw a line of best fit.
d. Use your line of best fit to predict the enrolment in the centre during the second year of operation at the
beginning of:
i. August
ii. January.
e. State any assumptions that you have made.

9. The graph shows the monthly sales of a certain book since its publication. Explain in your own words why
linear trend forecasting of the future sales of this book is not appropriate.

[Graph: Sales against Time]

10. In the world of investing this phrase is commonly used when talking about investments:
“Past performance is not an indicator of future returns.”
a. Explain what this phrase means.
b. Explain why this phrase is true using the term extrapolation.



Problem solving
11. In Science class Melita boiled some water and then recorded the temperature of the water over ten minutes.
These are her results:
Time (minutes) 0 1.5 3 4.5 6 7.5 10
Temperature (°C) 100 95 88 74 65 60 52

a. Melita wants to convert the time from minutes into seconds. She starts by converting 1.5 minutes to
150 seconds. Explain what she did wrong and find the correct number in seconds.
b. Copy and complete the table, changing the time in minutes to time in seconds.
c. Draw a scatter plot using seconds as the time scale.
d. Draw a line of best fit and use it to predict the time, in seconds, when the water will reach 20°C.
e. Convert your answer from part d back to minutes.
12. The table below gives the quarterly sales figures for a second-hand car dealer over a three-year period.

Year Q1 Q2 Q3 Q4
2012 75 65 92 99
2013 91 79 115 114
2014 93 85 136 118

a. Represent this data on a time series plot.


b. Briefly describe how the car sales have altered over the time period.
c. Discuss if it appears that the car dealer can sell more cars in a particular period each year.

13. Jasper owns an ice-cream truck.


• In Summer 2020/21 he sold 1536 ice-creams.
• In Autumn 2021 he sold one-quarter of that number.
• In Winter 2021 he sold one-eighth of that number.
• In Spring 2021 he sold one-third of that number.
• From Summer 2021/22 until Spring 2022 his sales in each season were double those of the same season in the previous year.

a. Represent his sales from Summer 2020/21 to Spring 2022 on a scatter plot.
b. Describe the trend and the patterns in the data.



13.6 Review
13.6.1 Topic summary

BIVARIATE DATA

Properties of bivariate data
• Bivariate data can be displayed, analysed and used to make predictions.
• Types of variables:
  • Independent (experimental or explanatory variable): not impacted by the other variable.
  • Dependent (response variable): impacted by the other variable.

Representing the data
• Scatter plots can be created by hand or using technology (CAS or Excel).
• The independent variable is placed on the x-axis and the dependent variable on the y-axis.
[Figure: scatter plot of average student marks (%) against time spent on phone per day (hours), with the dependent data on the y-axis and the independent data on the x-axis]

Correlations from scatter plots
• Correlation is a way of describing a connection between variables in a bivariate data set.
• Correlation between the two variables will have:
  • a type (linear or non-linear)
  • a direction (positive or negative)
  • a strength (strong, moderate or weak).
• Correlations can be used to make conclusions.
• There is no correlation if the data are spread out across the plot with no clear pattern.

Time series
• In a time series, time is the independent variable.
• Describe time series by:
  • trends
  • patterns.
• Time series patterns can be:
  • seasonal
  • cyclical
  • random.
[Figure: Temp. (°C) against Hours]
[Figure: Houses sold per quarter, 2003 to 2005: cycle peaks every 12 months]
[Figure: Software products sold per quarter, 2003 to 2005: no regular periods between peaks]
[Figure: Profits against time]

Lines of best fit
• A line that follows the trend of the data in a scatter plot.
• It is most appropriate for data with strong or moderate linear correlation.
• Can be sketched as a line of best fit by eye or determined using technology.
• The equation for the line can be found by using the gradient and equation of the straight line.
• The line can be used to make predictions.
• Regression lines are only valid if the independent and dependent variables have a connection.

Interpolation and extrapolation
• Interpolation and extrapolation can be used to make predictions.
• Interpolation:
  • is more reliable from a large number of data
  • used if the prediction sits within the given data.
• Extrapolation:
  • assumes the trend will continue
  • used if the prediction sits outside the given data.
[Figure: two scatter plots of average student marks (%) against time spent on phone per day (hours). Predictions made within the data use interpolation. Predictions made outside the data use extrapolation.]
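The two ideas in the summary, finding the equation of a line from its gradient and deciding whether a prediction is interpolation or extrapolation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `line_through` and `prediction_type` are made up for this example, the coordinates are assumed to be whole numbers, the two points are those given in review question 2c, and the training hours are those in the table for review question 6.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (gradient, y_intercept) of the straight line through two points.

    Assumes integer coordinates so the result can be kept as exact fractions.
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no gradient")
    gradient = Fraction(y2 - y1, x2 - x1)   # rise divided by run
    y_intercept = y1 - gradient * x1        # rearranged from y = mx + c
    return gradient, y_intercept


def prediction_type(x, x_values):
    """Classify a prediction at x relative to the range of the given data."""
    if min(x_values) <= x <= max(x_values):
        return "interpolation"
    return "extrapolation"


# Line of best fit through (5, 5) and (20, 27), as in review question 2c.
m, c = line_through((5, 5), (20, 27))
print(f"gradient = {m}, y-intercept = {c}")

# Training hours from review question 6: is a prediction inside the data?
training_hours = [11, 11, 2, 8, 4, 16, 11, 16, 5, 3]
for hours in (14, 30):
    print(hours, "hours:", prediction_type(hours, training_hours))
```

Keeping the gradient and intercept as fractions avoids rounding until the final prediction, which mirrors how the equation would be written by hand.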



13.6.2 Success criteria
Tick the column to indicate that you have completed the subtopic and how well you have understood it using the
traffic light system.
(Green: I understand; Yellow: I can do it with help; Red: I do not understand)

Subtopic Success criteria

13.2 I can recognise the independent and dependent variables in bivariate data.

I can represent bivariate data using a scatter plot.

I can describe the correlation between two variables in a bivariate data set.

I can draw conclusions about the correlation between two variables in a bivariate data set.

13.3 I can draw a line of best fit by eye.

I can determine the equation of the line of best fit.

I can use the line of best fit to make predictions.

I can identify if a prediction is interpolation or extrapolation.

13.4 I can display a scatter plot using technology.

I can determine the equation of the regression line using technology.

I can display a scatter plot with its regression line using technology.

I can use a regression line to make predictions.

13.5 I can describe time series data using trends and patterns.

I can draw a line of best fit by eye and use it to make predictions for a
time series.

13.6.3 Project
Collecting, recording and analysing data over time

A time series is a sequence of measurements taken at regular intervals (daily, weekly, monthly and so on) over a certain period of time. Time series are best represented using time-series plots, which are line graphs with the time plotted on the horizontal axis.
Examples of time series include daily temperature, monthly unemployment
rates and daily share prices.



When data are recorded on a regular basis, the value of the variable may go up and down in what seems
to be an erratic pattern. These are called fluctuations. However, over a long period of time, the time series
usually suggests a certain trend. These trends can be classified as being linear or non-linear, and upward,
downward or stationary (no trend).
Time series are often used for forecasting, that is, making predictions about the future. The predictions
made with the help of time series are always based on the assumption that the observed trend will continue
in the future.
1. Choose a subject that is of interest to you and that can be observed and measured during one day or over
the period of a week or more. (Suitable subjects are shown in the list below.)
2. Prepare a table for recording your results. Select appropriate regular time intervals. An example is
shown below.

Time 8 am 9 am 10 am 11 am 12 pm 1 pm 2 pm 3 pm 4 pm 5 pm
Pulse rate

3. Take your measurements at the selected time intervals and record them in the table.
4. Use your data to plot the time series. You can use software such as Excel or draw the scatterplot by hand.
5. Describe the graph and comment on its trend.
6. If appropriate, draw a line of best fit and predict the next few data values.
7. Take the actual measurements during the hours you have made predictions for. Compare the predictions
with the actual measurements. Were your predictions good? Give reasons.
Here are some suitable subjects for data observation and recording:
• minimum and maximum temperatures each day for 2 weeks (use the TV news or online data as
resources)
• the value of a stock on the share market (e.g. Telstra, Wesfarmers and Rio Tinto)
• your pulse over 12 hours (ask your teacher how to do this or check on the internet)
• the value of sales each day at the school canteen
• the number of students absent each day
• the position of a song in the Top 40 over a number of weeks
• petrol prices each day for 2 weeks
• other measurements (check with your teacher)
• world population statistics over time.
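Steps 4 to 6 of the project can also be carried out with technology. The sketch below is one possible approach, not the method prescribed by the project: instead of a line of best fit drawn by eye it uses a least-squares line, so its predictions may differ slightly from those read off a hand-drawn line. The function names `least_squares_line` and `forecast` are illustrative, and the sample data are the childcare enrolments from Exercise 13.5, question 8 (June = 1, July = 2 and so on).

```python
def least_squares_line(times, values):
    """Return (gradient, intercept) of the least-squares line for the data."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Sum of products of deviations, and sum of squared deviations in time.
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    s_tt = sum((t - mean_t) ** 2 for t in times)
    gradient = s_tv / s_tt
    intercept = mean_v - gradient * mean_t
    return gradient, intercept


def forecast(gradient, intercept, t):
    """Predict the value at time t, assuming the linear trend continues."""
    return gradient * t + intercept


# Enrolments at the start of each month, June (t = 1) to May (t = 12).
months = list(range(1, 13))
enrolments = [6, 8, 7, 9, 10, 9, 12, 10, 11, 13, 12, 14]

gradient, intercept = least_squares_line(months, enrolments)
print(f"trend line: y = {gradient:.2f}t + {intercept:.2f}")

# t = 15 and t = 20 lie beyond the recorded data, so these are extrapolations.
for t in (15, 20):
    print(f"t = {t}: predicted value = {forecast(gradient, intercept, t):.1f}")
```

Because both forecasts lie outside the recorded months, they rely on the assumption that the observed trend continues, which is exactly what step 7 asks you to test against real measurements.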

Resources
eWorkbook Topic 13 Workbook (worksheets, code puzzle and project) (ewbk-2039)
Interactivities Crossword (int-2887)
Sudoku puzzle (int-3600)



Exercise 13.6 Review questions
To answer questions online and to receive immediate corrective feedback and fully worked solutions for
all questions, go to your learnON title at www.jacplus.com.au.

Fluency
1. As preparation for a Mathematics test, a group of 20 students was given a revision sheet containing 60
questions. The table below shows the number of questions from the revision sheet successfully
completed by each student and the mark, out of 100, of that student on the test.
Number of questions 9 12 37 60 55 40 10 25 50 48 60
Test result 18 21 52 95 100 67 15 50 97 85 89
Number of questions 50 48 35 29 19 44 49 20 16 58 52
Test result 97 85 62 54 30 70 82 37 28 99 80

a. State which of the variables is dependent and which is independent.


b. Construct a scatterplot of the data.
c. State the type of correlation between the two variables suggested by the scatterplot and draw a
corresponding conclusion.
d. Suggest why the relationship is not perfectly linear.

2. Use the line of best fit shown on the graph to answer the following questions.
[Graph: line of best fit, with x from 0 to 40 and y from 0 to 50]
a. Predict the value of y, when the value of x is:
i. 10 ii. 35.
b. Predict the value of x, when the value of y is:
i. 15 ii. 30.
c. Determine the equation of a line of best fit if it is known that it passes through the points (5, 5) and (20, 27).
d. Use the equation of the line to algebraically verify the values obtained from the graph in parts a and b.

3. The graph shows the number of occupants of a large nursing home over the last 14 years.
[Graph: number of occupants (40 to 130) against time (year), 1996 to 2009]
a. Comment on the type of trend displayed.
b. Explain why it is appropriate to draw in a line of best fit.
c. Draw a line of best fit and use it to predict the number of occupants in the nursing home in 3 years' time.
d. State the assumptions that have been made when predicting figures for part c.



4. The table below shows the advertised sale price ($'000) and the land size (m²) for ten vacant blocks of land.

Land size (m²) Sale price ($'000)
632 36
1560 58
800 40
1190 44
770 41
1250 52
1090 43
1780 75
1740 72
920 43

a. Construct a scatterplot and determine the equation of the line of best fit.
b. State what the gradient represents.
c. Using the line of best fit, predict the approximate sale price, to the nearest thousand dollars, for a block of land with an area of 1600 m².
d. Using the line of best fit, predict the approximate land size, to the nearest 10 square metres, you could purchase with $50 000.

5. The table below shows, for fifteen students, the amount of pocket money they receive and spend at the
school canteen in an average week.

Pocket money ($) Canteen spending ($)


30 16
40 17
15 12
25 14
40 16
15 14
30 16
30 17
25 15
15 13
50 19
20 14
35 17
20 15
10 13

a. Construct a scatterplot and determine the equation of the line of best fit.
b. State what the gradient represents.
c. Using your line of best fit, predict the amount of money spent at the canteen for a student receiving $45 pocket money a week.
d. Using your line of best fit, predict the amount of money spent at the canteen by a student who receives $100 pocket money each week. Explain if this seems reasonable.



6. The table below shows, for 10 ballet students, the number of hours a week spent training and the
number of pirouettes in a row they can complete.

Training (hours) 11 11 2 8 4 16 11 16 5 3
Number of pirouettes 15 13 3 12 7 17 13 16 8 5

a. Construct a scatterplot and determine the equation of the line of best fit.
b. State what the gradient represents.
c. Using your line of best fit, predict the number of pirouettes that could be completed if a student undertakes 14 hours of training.
d. Professional ballet dancers may undertake up to 30 hours of training a week. Using your line of best fit, predict the number of pirouettes they should be able to do in a row. Comment on your findings.

7. Use the information in the data table to answer the following questions.

Age in years (x) 7 11 8 16 9 8 14 19 17 10 20 15
Hours of television watched in a week (y) 20 19 25 55 46 50 53 67 59 25 70 58

a. Use technology to determine the equation of the line of best fit for the following data.
b. Use technology to predict the value of the number of hours of television watched by a person
aged 15.

Problem solving
8. Describe the trends present in the following time series data, which show the mean daily hours of sunshine in Melbourne for each month from January to December.

Month 1 2 3 4 5 6 7 8 9 10 11 12
Daily hours of sunshine 8.7 8.0 7.5 6.4 4.8 4.0 4.5 5.5 6.3 7.3 7.5 8.3

9. The existence of the following situations is often considered an obstacle to making estimates from data.
a. Outlier.
b. Extrapolation.
c. Small range of data.
d. Small number of data points.
Explain why each of these situations is considered an obstacle to making estimates of data and how
each might be overcome.



10. The table shows the heights of 10 students and the distances along the ground between their feet as they
attempt to do the splits.

Height (cm) Distance stretched (cm)


134.5 150
156 160
133.5 147
145 160
160 162
135 149
163 163
138 149
152 158
159 160

Using the data, estimate the distance a person 1.8 m tall can achieve when attempting the splits. Write a
detailed analysis of your result. Include:
• an explanation of the method(s) used
• any plots or formula generated
• comments on validity of the estimate
• any ways the validity of the estimate could be improved.

To test your understanding and knowledge of this topic, go to your learnON title at
www.jacplus.com.au and complete the post-test.



Online Resources

Below is a full list of rich resources available online for this topic. These resources are designed to bring ideas to life, to promote deep and lasting learning and to support the different learning needs of each individual.

eWorkbook
Download the workbook for this topic, which includes worksheets, a code puzzle and a project (ewbk-2039) ⃞

Teacher resources
There are many resources available exclusively for teachers online.

Solutions
Download a copy of the fully worked solutions to every
question in this topic (sol-0747) ⃞

Digital documents
13.2 SkillSHEET Substitution into a linear rule (doc-5405) ⃞
SkillSHEET Solving linear equations that arise when
finding x- and y-intercepts (doc-5406) ⃞
SkillSHEET Transposing linear equations to standard
form (doc-5407) ⃞
SkillSHEET Measuring the rise and the run (doc-5408) ⃞
SkillSHEET Determining the gradient given two points
(doc-5409) ⃞
SkillSHEET Graphing linear equations using the x- and
y-intercept method (doc-5410) ⃞
SkillSHEET Determining independent and dependent
variables (doc-5411) ⃞
SkillSHEET Determining the type of relationship
(doc-5413) ⃞

Video eLessons
13.3 Bivariate data (eles-4965) ⃞
Correlation (eles-4966) ⃞
Drawing conclusions from correlation (eles-4967) ⃞
13.4 Lines of best fit by eye (eles-4968) ⃞
Predictions using lines of best fit (eles-4969) ⃞
13.5 Scatter plots using technology (eles-4970) ⃞
Regression lines using technology (eles-4971) ⃞
Using regression lines to make predictions (eles-4972) ⃞
13.6 Describing time series (eles-4973) ⃞
Time series lines of best fit by eye (eles-4974) ⃞
Fluctuations and cycles (eles-0181) ⃞

Interactivities
13.2 Individual pathway interactivity: Bivariate data
(int-4626) ⃞
13.3 Individual pathway interactivity: Lines of best
fit (int-4627) ⃞
Lines of best fit (int-6180) ⃞
Interpolation and extrapolation (int-6181) ⃞
13.5 Individual pathway interactivity: Time series (int-4628) ⃞
13.6 Crossword (int-2887) ⃞
Sudoku puzzle (int-3600) ⃞

To access these online resources, log on to www.jacplus.com.au.



Answers

Topic 13 Bivariate data

Exercise 13.1 Pre-test
1. B
2. C
3. Independent variable
4. a. B b. C c. A
5. D
6. B
7. Interpolation
8. The gradient of the line is 16/11.
9. Independent variable
10. Explanatory variable
11. B
12. C
13. x = 6
14. a. See figure at the bottom of the page.*
b. The number of COVID-19 cases started rising in March and peaked in April, then started to decline until June. There was an increase in cases in July and the cases reached peak again in August. Cases then started to decline again until December.
15. a. y = 14 b. x = 12.5

Exercise 13.2 Bivariate data
1.    Independent          Dependent
a.    Number of hours      Test results
b.    Rainfall             Attendance
c.    Hours in gym         Visits to the doctor
d.    Lengths of essay     Memory taken

2.    Independent            Dependent
a.    Cost of care           Attendance
b.    Age of property        Cost of property
c.    Number of applicants   Cut-off OP score
d.    Running speed          Heart rate

3. [Scatter plot: Cost ($1000), 1.4 to 4.6, against Number of guests, 30 to 120]
4. a. Perfectly linear, positive
b. No correlation
c. Non-linear, negative, moderate
d. Strong, positive, linear
e. No correlation
5. a. Non-linear, positive, strong
b. Strong, negative, negative
c. Non-linear, moderate, negative
d. Weak, negative, linear
e. Non-linear, moderate, positive
6. a. Positive, moderate, linear
b. Non-linear, strong, negative
c. Strong, negative, linear
d. Weak, positive, linear
e. Non-linear, moderate, positive

*14. a. [Time series plot: New Covid-19 cases (0 to 400) against Month, March to December]



7. a. [Scatter plot: Number of bags sold (0 to 12) against Cost ($), 30 to 80]
b. Negative, linear, moderate. The price of the bag appeared to affect the numbers sold; that is, the more expensive the bag, the fewer sold.
8. a. [Scatter plot: Price ($1000), 140 to 420, against Number of bedrooms, 1 to 7]
b. Moderate positive linear correlation. There is evidence to show that the larger the number of bedrooms, the higher the price of the house.
c. Various answers; location, age, number of people interested in the house, and so on.
9. a. [Scatter plot: Total score (%), 10 to 100, against Number of questions completed, 0 to 10]
b. Strong, positive, linear correlation
c. Various answers; some students are of different ability levels and they may have attempted the questions but had incorrect answers.
10. a. [Scatter plot: Number of accidents (0 to 6) against Number of lessons, 5 to 40]
b. Weak, negative, linear relation
c. Various answers; some drivers are better than others, live in lower traffic areas, traffic conditions etc.
11. B
12. C
13. D
14. a. See figure at the bottom of the page.*
b. This scatterplot does not support the claim.
15. a. T b. F c. T d. F e. T
16. a. Mandy (iii) b. William (iv) c. Charlotte (viii)
d. Dario (vii) e. Edward (vi) f. Cindy (v)
g. Georgina (i) h. Harrison (ii)

*14. a. [Scatter plot: Number of handballs (0 to 12) against Number of kicks (2 to 26), with points labelled A to H]



Exercise 13.3 Lines of best fit by eye
Note: Answers may vary slightly depending on the line of best fit drawn.
1. a. [Scatter plot with line of best fit: Petrol used (L), P, against Distance travelled (km), d]
b. Using (23, 3) and (56, 8), the equation is P = (5/33)d − 16/33.
2. a. 38 b. 18
3. a. i. 510 ii. 315 iii. 125
b. i. 36.5 ii. 26 iii. 8
c. y = −13x + 595
d. y-values (a):
i. 594
ii. 309
iii. 127
x-values (b):
i. 36.54
ii. 25.55
iii. 5.86
4. a. [Scatter plot with line of best fit: Earnings ($), E, against Hours worked, h]
b. Using (8, 47) and (12, 74), the equation is E = 6.75h − 7.
c. On average, students were paid $6.75 per hour.
5. a. [Scatter plot with line of best fit: Cost of food ($), C, against Number of people, n]
b. Using (1, 75) and (5, 150), the equation is C = 18.75n + 56.25.
c. On average, weekly cost of food increases by $18.75 for every extra person.
d. i. $206.25 ii. $225.00 iii. $243.75
6. a. 35 to 90
b. 30 to 105
7. a. [Scatter plot with line of best fit: Mass (kg), M, against Time (weeks), t]
Positive, strong, linear correlation
b. Using (32, 1.470) and (35, 2.230), M = 0.25t − 6.5.
c. With every week of gestation the mass of the baby increases by approximately 250 g.
d. 3.75 kg; 4 kg
e. Approximately 1 kg
f. Between 35 and 36 weeks
8. a. D b. C
9. E



10. a. [Scatter plot with line of best fit: Mass (g), M, against Day, d]
b. Using (2, 3.7) and (10, 11.2), M = 0.88d + 1.94.
c. Each day Rachel's crystal gains 0.88 g in mass. Line of best fit appears appropriate.
d. 7.22 g; 8.10 g; 13.38 g and 14.26 g; interpolation (within the given range of 1–16)
e. 16.9 g and 17.78 g; predictions are not reliable, since they were obtained using extrapolation.
11. a. See figure at the bottom of the page.*
b. L = 1.07n + 18.9
c. 24.25 cm; 26.39 cm; 28.53 cm; 30.67 cm; 31.74 cm; 34.95 cm; 36.02 cm; 38.16 cm; 39.23 cm
d. Interpolation (within the given range of 1–20)
e. 41.37 cm; 42.44 cm; 43.51 cm
f. Not reliable, because extrapolation has been used.
12. a. [Scatter plot: Best jump (metres) against Age]
b. [Scatter plot: Best jump (metres) against Age]
c. [Scatter plot: Best jump (metres), B, against Age, a]
d. Yes. Using points (9, 4.85) and (16, 7.24), B = 0.34a + 1.8; estimated best jump = 8.6 m.
e. No, trends work well over the short term but in the long term are affected by other variables.

*11. a. [Scatter plot: Length (cm), L, 19 to 40, against Week, n, 0 to 20]



f. 24 years old: 9.97 m; 28 years old: 11.33 m. It is unrealistic to expect his jumping distance to increase indefinitely.
g. Equal first.
13. No. He would have to get 108% which would be impossible on a test.

Exercise 13.4 Linear regression using technology (10A)
1. Sample responses can be found in the worked solutions in the online resources.
2. Sample responses can be found in the worked solutions in the online resources.
3. a. $174 b. $418
c. a. Reliable, but b. not reliable.
4. a. Sample responses can be found in the worked solutions in the online resources.
b. Sample responses can be found in the worked solutions in the online resources.
c. 3.8 hours d. 2.5 hours
e. c. not reliable, d. reliable
5. a. Sample responses can be found in the worked solutions in the online resources.
b. Sample responses can be found in the worked solutions in the online resources.
c. 26 kg d. 19 kg
e. c. not reliable, d. reliable
6. a. y = 0.553x + 31 b. 72.4°C
c. Not possible. Extrapolation not reliable.
7. a. y = 0.55x + 5.3 b. 14
c. No because no connection.
8. a. Sample responses can be found in the worked solutions in the online resources.
b. Yard size independent, price dependent
c. y = 0.173x + 49.1 d. $50
9. A
10. D
11. a. Yousef: y = 39x + 1, Gavin: y = 37x + 1
b. Yousef 39, Gavin 37
c. Time to eat each apple.
d. Gavin
12. a. y = −0.094x + 3.35 b. y = −0.24x + 5.58
c. Sample responses can be found in the worked solutions in the online resources.
13. a. y = 0.0508x + 2.96
b. No correlation
c. y = −0.913x + 4.32
d. Correlation is linear, negative and moderate/strong.
e. Regression line from part c.
14. 128 cm
15. a. y = 2.8x
b. Gradient doubles.
c. Add 5
16. a. Sample responses can be found in the worked solutions in the online resources.
b. Sample responses can be found in the worked solutions in the online resources.

Exercise 13.5 Time series
1. a. Linear, downward
b. Non-linear, upward
c. Non-linear, stationary in the mean
d. Linear, upward
2. a. Non-linear, downward
b. Non-linear, downward
c. Non-linear, downward
d. Linear, upward
3. a. See figure at the bottom of the page.*
b. Linear downward trend

*3. a. [Time series plot: Temperature (°C), 14.0 to 18.0, against Day, 1 to 30]



4. a. [Time series plot: Sales (× $1000), 50 to 130, against quarter, 2006 to 2009]
b. Sheepskin products more popular in the third quarter (presumably winter): discount sales, increase in sales, and so on.
c. No trend.
5. a. See figure at the bottom of the page.*
b. General upward trend with peaks around December and troughs around April.
c. Peaks around Christmas where people have lots of parties, troughs around April where weather gets colder and people less inclined to go out.
d. Yes. Peaks in December, troughs in April.
6. a. Peaks around Christmas holidays and a minor peak at Easter. No camping in colder months.
b. Sample responses can be found in the worked solutions in the online resources.
7. a. [Time series plot with line of best fit: Enrolment (10 to 120) against time, 0 to 15]
Upward linear.
b. In the 15th year the expected amount = 122.
8. a. [Time series plot: Number of children (5 to 14) against Time (month), June to May, with the points (1, 7) and (8, 11) marked]
b. Yes, the graph shows an upward trend.
c. y = (4/7)x + 45/7
d. i. 15 ii. 18
e. The assumption made was that business will continue on a linear upward trend.
9. The trend is non-linear, therefore unable to forecast future sales.

*5. a. [Time series plot: Revenue ($1000), 35 to 100, against month, 2007 to 2009]



10. a. Sample responses can be found in the worked solutions in the online resources.
b. Extrapolation is not reliable.
11. a. 90 seconds
b. Sample responses can be found in the worked solutions in the online resources.
c. Sample responses can be found in the worked solutions in the online resources.
d. Approximately 920 seconds.
e. Approximately 15.3 minutes.
12. a. See bottom of the page*
b. Secondhand car sales per quarter have shown a general upward trend but with some major fluctuations.
c. More cars are sold in the third and fourth quarters compared to the first and second quarters.
13. a. Sample responses can be found in the worked solutions in the online resources.
b. Trend: non-linear, increasing; Pattern: seasonal

Project
1. Sample responses can be found in the worked solutions in the online resources. Students could choose any subject given in the list that can be observed and measured for one day or over the period of a week or more.
2. Sample responses can be found in the worked solutions in the online resources. Students need to create a data table for their recording. Students should use appropriate regular time intervals.
3. Sample responses can be found in the worked solutions in the online resources. For a selected subject, students need to take their measurements at the selected time intervals and record them in the table.
4. Sample responses can be found in the worked solutions in the online resources. Students could use Excel or CAS to plot the time series.
5. Sample responses can be found in the worked solutions in the online resources. Students should describe their graph and comment on its trend.
6. Sample responses can be found in the worked solutions in the online resources. Students should draw a line of best fit and predict the next few data values.
7. Sample responses can be found in the worked solutions in the online resources. Students should take the actual measurements during the hours they have made predictions for and then compare the predictions with the actual measurements. Students should also comment on the accuracy of their predictions.

Exercise 13.6 Review questions
1. a. Number of questions: independent; test result: dependent
b. [Scatter plot: Test result (y-axis) against Number of questions (x-axis)]
c. Strong, positive, linear correlation; the larger the number of completed revision questions, the higher the mark on the test.
d. Different abilities of the students
2. a. i. 12.5 ii. 49
b. i. 12 ii. 22.5

*12. a. [Graph: Cars sold (y-axis) against quarter and year (x-axis), Q1–Q4 for each year from 2012]



2. c. y = (22/15)x − 7/3
d. i. 12.33 ii. 49
e. i. 11.82 ii. 22.05
3. a. Linear downwards
b. The trend is linear.
c. About 60–65 occupants
d. Assumes that the current trend will continue.
4. a. P = 31.82a + 13070.4, where P is the sale price and a is the land area.
b. The price of land is approximately $31.82 per square metre.
c. $64 000
d. 1160 m²
5. a. C = 0.15p + 11.09, where C is the money spent at the canteen and p represents the pocket money received.
b. Students spend 15 cents at the canteen per dollar received for pocket money.
c. $18
d. $26. This involves extrapolation, which is considered unreliable. It does not seem reasonable that, if a student receives more money, they will eat more or have to purchase more than any other student.
6. a. P = 0.91t + 2.95, where P is the number of pirouettes and t is the number of hours of training.
b. Ballet students can do approximately 0.91 pirouettes for each hour of training.
c. Approximately 15 pirouettes.
d. Approximately 30 pirouettes. This estimate is based on extrapolation, which is considered unreliable. To model this data linearly as the number of hours of training becomes large is unrealistic.
7. a. y = 3.31x + 3.05
b. Approximately 53 hours.
8. Overall the data appears to be following a seasonal trend, with peaks at either end of the year and a trough in the middle.
9. a. Outliers can unfairly skew data and as such dramatically alter the line of best fit. Identify and remove any outliers from the data before determining the line of best fit.
b. Extrapolation involves making estimates outside the data range and this is considered unreliable. When extrapolation is required, consider the data and the likelihood that the data would remain linear if extended. When giving results, comment on the validity of the estimation.
c. A small range may not give a fair indication of whether a data set shows a strong linear correlation. Try to increase the range of the data set by taking more measurements or undertaking more research.
d. A small number of data points may not be able to establish with confidence the existence of a strong linear correlation. Try to increase the number of data points by taking more measurements or undertaking more research.
10. About 170 cm; data was first plotted as a scatter plot. (145, 160) was identified as an outlier and removed from the data set. A line of best fit was then fitted to the remaining data and its equation determined as d = 0.5h + 80, where d is the distance stretched and h is the height. Substitution was used to obtain the estimate. The estimation requires extrapolation and cannot be considered reliable. The presence of the outlier may indicate variation in flexibility rather than a strong linear correlation between the data. The estimate is based on a small set of data.
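Several of the review answers come from substituting a value into a line of best fit or rearranging it. The following is a minimal Python sketch of that substitution, using the equations printed for questions 2 and 10. The helper names (predict, solve_for_x, is_extrapolation) are illustrative only. The input values (x = 10 and 35, y = 15 and 30, h = 180) are assumptions back-solved from the printed answers, and the observed height range of 140 to 175 is hypothetical, since the questions themselves are not reproduced here.

```python
from fractions import Fraction


def predict(gradient, intercept, x):
    """Substitute x into y = gradient * x + intercept."""
    return gradient * x + intercept


def solve_for_x(gradient, intercept, y):
    """Rearrange y = gradient * x + intercept to find x."""
    return (y - intercept) / gradient


def is_extrapolation(x, x_min, x_max):
    """Return True when x lies outside the range of the observed data."""
    return x < x_min or x > x_max


# Question 2 line of best fit: y = (22/15)x - 7/3, kept as exact fractions.
m, c = Fraction(22, 15), Fraction(-7, 3)

# Assumed inputs, back-solved from the printed answers.
print(round(float(predict(m, c, 10)), 2))      # y when x = 10
print(round(float(predict(m, c, 35)), 2))      # y when x = 35
print(round(float(solve_for_x(m, c, 15)), 2))  # x when y = 15
print(round(float(solve_for_x(m, c, 30)), 2))  # x when y = 30

# Question 10 line: d = 0.5h + 80, with an assumed height of h = 180.
print(predict(0.5, 80, 180))

# Hypothetical observed height range, used only to illustrate the check.
print(is_extrapolation(180, 140, 175))
```

Under these assumed inputs the sketch gives 12.33, 49.0, 11.82 and 22.05 for question 2 and 170.0 for question 10, which agree with the printed answers. The range check is a reminder that an estimate outside the observed data is an extrapolation and should be reported with a comment on its reliability.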
