Regression MS
Let L1 be the regression line of x on y. The equation of the line L1 can be written in
the form x = ay + b.
Markscheme
[2 marks]
Let L2 be the regression line of y on x. The lines L1 and L2 pass through the same
point with coordinates (p , q).
Markscheme
p = 28.7, q = 30.3 A2
[3 marks]
(c) Jennifer was absent for the first test but scored 29 marks on the
second test. Use an appropriate regression equation to estimate
Jennifer’s mark on the first test. [2]
Markscheme
x = 27.1 A1
[2 marks]
(a) State the name for this type of sampling technique. [1]
Markscheme
Stratified sampling A1
[1 mark]
(b.i) Show that 3 students will be selected from grade 12. [3]
Markscheme
There are 260 students in total A1
84/260 × 9 = 2.91 M1A1
[3 marks]
(b.ii) Calculate the number of students in each grade in the sample. [2]
Markscheme
grade 9 = 60/260 × 9 ≈ 2, grade 10 = 83/260 × 9 ≈ 3, grade 11 = 33/260 × 9 ≈ 1 A2
[2 marks]
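The proportional allocation in parts (b.i) and (b.ii) can be checked with a short script. This is a sketch only: the grade sizes (60, 83, 33, 84) and the sample size of 9 are read off the working above, and the function name `stratified_allocation` is an illustrative choice, not part of the markscheme.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Return each stratum's share of the sample, rounded to the nearest integer."""
    total = sum(strata_sizes.values())
    return {name: round(size / total * sample_size)
            for name, size in strata_sizes.items()}

# Grade sizes taken from the markscheme working (total 260).
grades = {"grade 9": 60, "grade 10": 83, "grade 11": 33, "grade 12": 84}
allocation = stratified_allocation(grades, 9)
print(allocation)
```

Simple rounding happens to give a total of 9 here; with other strata sizes the rounded shares may not add up to the sample size and would need adjusting by hand.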
In order to select the 3 students from grade 12, the principal lists their names in
alphabetical order and selects the 28th, 56th and 84th student on the list.
(c) State the name for this type of sampling technique. [1]
Markscheme
Systematic sampling A1
[1 mark]
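The selection of the 28th, 56th and 84th names follows a fixed interval, which can be sketched as below. The population size of 84 is taken from the grade 12 working in part (b.i); the function name `systematic_positions` is hypothetical, and the start is fixed at the k-th name to match the question (in practice a random start between 1 and k is often used).

```python
def systematic_positions(population_size, sample_size):
    """Return 1-based list positions chosen at a fixed interval k = N // n."""
    k = population_size // sample_size
    return [k * i for i in range(1, sample_size + 1)]

positions = systematic_positions(84, 3)
print(positions)  # [28, 56, 84]
```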
Once the principal has obtained the names of the 9 students in the random sample,
she surveys each student to find out how long they used social media the previous
day and measures their self-esteem using the Rosenberg scale. The Rosenberg scale
is a number between 10 and 40, where a high number represents high self-esteem.
(d.i) Calculate Pearson’s product moment correlation coefficient, r. [2]
Markscheme
r = −0.901 A2
[2 marks]
Markscheme
The negative value of r indicates that more time spent on social media leads
to lower self-esteem, supporting the principal’s concerns. R1
[1 mark]
Markscheme
[1 mark]
Markscheme
t = −0.281s + 9.74 A1
[4 marks]
(a) Use this model to estimate the number of children in the park on a
day when the highest temperature is 25 °C. [2]
Markscheme
y(25) = −0.6 × 25² + 23 × 25 + 110
= 310 (children) A1
[2 marks]
An ice cream vendor investigates the relationship between the total number of
children visiting the park and the number of ice creams sold, x. The following table
shows the data collected on five different days.
Total number of children (y)   81   175   202   346   360
Ice creams sold (x)            15    27    23    35    46
(b) Find an appropriate regression equation that will allow the
vendor to predict the number of ice creams sold on a day when
there are y children in the park. [3]
Markscheme
x = 0.0935y + 7.43 A1
[3 marks]
(c) Hence, use your regression equation to predict the number of ice
creams that the vendor sells on a day when the highest
temperature is 25°C. [2]
Markscheme
attempt to substitute their answer to part (a) into their regression equation for
either x or y (M1)
36 (accept 37 or 36.4) A1
Award (M1)A0FT for a correct FT answer that lies outside [15, 46].
[2 marks]
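Parts (a) to (c) chain together: the temperature model gives the number of children, and the regression of x on y converts that into ice creams sold. The sketch below reproduces the GDC values by ordinary least squares. The helper name `least_squares` is an illustrative assumption, and the quadratic is read as y = −0.6T² + 23T + 110, which is the reading consistent with the answer of 310.

```python
def least_squares(xs, ys):
    """Fit ys = slope * xs + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

children = [81, 175, 202, 346, 360]   # y in the question
ice_creams = [15, 27, 23, 35, 46]     # x in the question

# Regression of x on y: the predictor (children) is passed first.
slope, intercept = least_squares(children, ice_creams)

# Part (a): children expected at 25 degrees C from the quadratic model.
children_at_25 = -0.6 * 25**2 + 23 * 25 + 110

# Part (c): substitute the part (a) answer into the x-on-y line.
predicted_sales = slope * children_at_25 + intercept
print(round(slope, 4), round(intercept, 2), round(predicted_sales, 1))
```

The predictor is passed first because the vendor wants x for a given y; swapping the two lists would give the y-on-x line instead, which is not the appropriate equation here.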
The regression line of y on x for this data can be written in the form y = ax + b.
Markscheme
1.01206…, 2.45230…
[2 marks]
Markscheme
0.981464…
r = 0.981 A1
Note: A common error is to enter the data incorrectly into the GDC, and obtain
the answers a = 1.01700…, b = 2.09814… and
r = 0.980888… Some candidates may write the 3 sf answers, ie.
and A0 for part (b). Even though some values round to an accepted answer,
they come from incorrect working.
[1 mark]
(c) Use the equation of your regression line to predict the Science test
score for a student who has a score of 78 on the Mathematics test.
Express your answer to the nearest integer. [2]
Markscheme
81 A1
[2 marks]
The ages of the eldest child are summarized in the following box and whisker
diagram.
(a) Find the largest value of c that would not be considered an outlier. [3]
Markscheme
10 + 6
16 A1
[3 marks]
… (…/4)c + 20. The regression line of c on a is c = (1/2)a − 9.
(b.i) One of the adults surveyed is 42 years old. Estimate the age of
their eldest child. [2]
Markscheme
choosing c = (1/2)a − 9 (M1)
c = (1/2) × 42 − 9
= 12 (years old) A1
[2 marks]
(b.ii) Find the mean age of all the adults surveyed. [2]
Markscheme
34 (years old) A1
[2 marks]
Markscheme
r = 0.883529…
r = 0.884 A1
Note: Award the (M1) for any correct value of r, a, b or r² = 0.780624…
[2 marks]
Markscheme
a = 1.37, b = 64.5 A1
[1 mark]
(c) One of these eight students was disappointed with her result and
wished she had practised more. Based on the given data,
determine how her score could have been expected to alter had
she practised an extra five hours per week. [2]
Markscheme
5 × 1.36609… OR
1.36609…(h + 5) + 64.5171… − (1.36609…h + 64.5171…)
6.83045…
[2 marks]
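Both methods in part (c) give the same value because the intercept cancels, so the change depends only on the gradient. The sketch below illustrates this using the truncated coefficients shown in the working; the function name `predicted_score` and the sample values of h are hypothetical.

```python
# Coefficients as shown (truncated) in the markscheme working.
a = 1.36609   # gradient: change in score per extra hour of practice per week
b = 64.5171   # intercept

def predicted_score(hours):
    """Score predicted by the regression line for a given number of hours."""
    return a * hours + b

# The difference does not depend on the starting number of hours h.
for h in (0, 4, 10):
    change = predicted_score(h + 5) - predicted_score(h)
    print(h, round(change, 2))
```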
(d) Lucy asserts that the number of hours a student practises has a
direct effect on their final diploma result. Comment on the validity
of Lucy’s assertion. [1]
Markscheme
This might be true, but the data can only indicate a correlation. R1
[1 mark]
(e) Lucy suspected that each student had not been practising as much
as they reported. In order to compensate for this, Lucy deducted a
fixed number of hours per week from each of the students’
recorded hours.
Markscheme
no effect A1
[1 mark]
8. [Maximum mark: 7] 21M.2.SL.TZ1.2
The following table shows the data collected from an experiment.
Markscheme
a = 0.433156…, b = 4.50265…
a = 0.433, b = 4.50 A1A1
[2 marks]
(b) Use this model to predict the value of y when x = 18. [2]
Markscheme
y = 0.433 × 18 + 4.50
= 12.2994…
= 12.3 A1
[2 marks]
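The intermediate value 12.2994… comes from the unrounded coefficients rather than the 3 sf values written in the substitution. A quick comparison, using the coefficients as printed in part (a) (themselves truncated, so this is only a sketch), shows that both routes round to the same final answer:

```python
# Coefficients from part (a) as printed in the markscheme (truncated).
a_full, b_full = 0.433156, 4.50265
# The same coefficients rounded to 3 significant figures.
a_3sf, b_3sf = 0.433, 4.50

x = 18
y_full = a_full * x + b_full
y_3sf = a_3sf * x + b_3sf

print(round(y_full, 4), round(y_3sf, 4))
```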
(c) Write down the value of x̄ and the value of ȳ. [1]
Markscheme
x̄ = 15, ȳ = 11 A1
[1 mark]
(d) Draw the line of best fit on the scatter diagram. [2]
Markscheme
A1A1
If the candidate does not use a ruler, award A0A1 where appropriate.
[2 marks]
Sarah, a regular customer, visited the café on five consecutive days. The following
table shows the number of customers, x, ahead of Sarah who have already ordered
and are waiting to receive their coffee and Sarah’s waiting time, y minutes.
Markscheme
[2 marks]
Markscheme
r = 0.97777…
r = 0.978 A1
[1 mark]
Markscheme
a represents the (average) increase in waiting time (0.805 mins) per
additional customer (waiting to receive their coffee) R1
[1 mark]
(c) On another day, Sarah visits the café to order a coffee. Seven
customers have already ordered their coffee and are waiting to
receive it.
Use the result from part (a)(i) to estimate Sarah’s waiting time to
receive her coffee. [2]
Markscheme
8.51693…
8.52 (mins) A1
[2 marks]
Markscheme
[3 marks]
Markscheme
−0.981244
r = −0.981 A1 N1
[1 mark]
Markscheme
eg −9.85 × 12 + 222
[2 marks]
Markscheme
4.30161, 163.330
[3 marks]
Markscheme
valid approach (M1)
[3 marks]
It can be assumed that (X, Y ) follow a bivariate normal distribution with product
moment correlation coefficient ρ.
Markscheme
H0 : ρ = 0 H1 : ρ ≠ 0 A1
Note: It must be ρ.
[1 mark]
p = 0.649 A2
Note: The A mark depends on the R mark and the answer must be given in
context. Follow through the p-value in part (b).
[4 marks]
Markscheme
a statement along the lines of ‘(we have accepted that) the two
variables are independent’ or ‘the two variables are weakly correlated’ R1
a statement along the lines of ‘the use of the regression line is invalid’ or ‘it
would give an inaccurate result’ R1
[2 marks]
13. [Maximum mark: 6] 19M.1.SL.TZ2.T_2
Colorado beetles are a pest, which can cause major damage to potato crops. For a
certain Colorado beetle the amount of oxygen, in millilitres (ml), consumed each
day increases with temperature as shown in the following table.
Markscheme
* This question is from an exam for a previous syllabus, and may contain minor
differences in marking or structure.
Note: Award (A1) for 15.5x; (A1) for −80. Award at most (A1)(A0) if answer is not
an equation. Award (A0)(A1)(ft) for y = −80x + 15.5.
[2 marks]
The mean point has coordinates (20, 230).
Markscheme
(A1)(A1) (C2)
Note: Award (A1) for a straight line using a ruler passing through (20, 230); (A1)
for correct y-intercept. If a ruler has not been used, award at most (A0)(A1).
[2 marks]
Markscheme
a = 10 AND b = 30 (A1)(A1) (C2)
[2 marks]
Markscheme
[2 marks]
Jill is doing a 1000-piece jigsaw puzzle. She started by sorting the edge pieces from
the interior pieces. Six times she stopped and counted how many of each type she
had found. The following table indicates this information.
Jill models the relationship between these variables using the regression equation
y = ax + b.
Markscheme
* This question is from an exam for a previous syllabus, and may contain minor
differences in marking or structure.
a = 6.92986, b = 8.80769
[3 marks]
(b) Use the model to predict how many edge pieces she had found
when she had sorted a total of 750 pieces. [3]
Markscheme
[3 marks]
16. [Maximum mark: 16] 19M.2.SL.TZ1.T_1
A healthy human body temperature is 37.0 °C. Eight people were medically
examined and the difference in their body temperature (°C), from 37.0 °C, was
recorded. Their heartbeat (beats per minute) was also recorded.
Markscheme
* This question is from an exam for a previous syllabus, and may contain minor
differences in marking or structure.
(A4)
Note: Award (A1) for correct scales, axis labels, minimum x = −0.3, and
minimum y = 60. Award (A0) if axes are reversed and follow through for their
points.
[4 marks]
(b.i) Write down, for this set of data the mean temperature difference
from 37 °C, x̄. [1]
Markscheme
0.025 (1/40) (A1)
[1 mark]
(b.ii) Write down, for this set of data the mean number of heartbeats per
minute, ȳ. [1]
Markscheme
74 (A1)
[1 mark]
(c) Plot and label the point M(x̄, ȳ) on the scatter diagram. [2]
Markscheme
Note: Award (A1) for labelled M. Do not accept any other label. Award (A1)(ft)
for their point M correctly plotted. Follow through from part (b).
[2 marks]
(d.i) Use your graphic display calculator to find the Pearson’s product–
moment correlation coefficient, r. [2]
Markscheme
[2 marks]
Markscheme
Note: Award (A1) for (moderately) strong, (A1) for positive. Follow through
from part (d)(i). If there is no answer to part (d)(i), award at most (A0)(A1).
[2 marks]
(e) Use your graphic display calculator to find the equation of the
regression line y on x. [2]
Markscheme
[2 marks]
Markscheme
[2 marks]
Markscheme
0.141120, 11.1424
[3 marks]
0.977563
r = 0.978 A1 N1
[1 mark]
(b) Use the regression equation to estimate the BMI of an adult man
whose waist size is 95 cm. [2]
Markscheme
eg 0.141(95) + 11.1
24.5488
24.5 A1 N2
[2 marks]
(a.i) Use your graphic display calculator to write down x̄, the mean
project mark. [1]
Markscheme
14 (G1)
[1 mark]
(a.ii) Use your graphic display calculator to write down ȳ, the mean
examination score. [1]
Markscheme
54 (G1)
[1 mark]
Markscheme
0.5 (G2)
[2 marks]
(b.i) Find the exact value of m and of c for these data. [2]
Markscheme
m = 0.875, c = 41.75 (m = 7/8, c = 167/4) (A1)(A1)
Note: Award (A1) for 0.875 seen. Award (A1) for 41.75 seen. If 41.75 is rounded to
41.8 do not award (A1).
[2 marks]
(b.ii) Show that the point M (x̄, ȳ) lies on the regression line y on x. [2]
Markscheme
Note: Award (M1) for their correct substitution into their regression line. Follow
through from parts (a)(i) and (b)(i).
= 54
Note: Do not award (A1) unless the conclusion is explicitly stated and the 54
seen. The (A1) can be awarded only if their conclusion is consistent with their
equation and it lies on the line.
OR
54 = 54
Note: Award (M1) for their correct substitution into their regression line. Follow
through from parts (a)(i) and (b)(i).
Note: Do not award (A1) unless the conclusion is explicitly stated. Follow
through from part (a).
[2 marks]
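The substitution in part (b.ii) can be verified exactly with rational arithmetic, which avoids any rounding of c. This sketch uses the exact values m = 7/8 and c = 167/4 from part (b.i) and the means from part (a); the variable names are illustrative.

```python
from fractions import Fraction

m = Fraction(7, 8)      # gradient, 0.875
c = Fraction(167, 4)    # intercept, 41.75
x_bar, y_bar = 14, 54   # means from parts (a)(i) and (a)(ii)

# Substitute the mean project mark into the regression line y = mx + c.
y_on_line = m * x_bar + c
print(y_on_line, y_on_line == y_bar)
```

Because the result equals ȳ exactly, M lies on the line; using c = 41.8 instead would give 54.05, which is one reason the rounded intercept is not accepted.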
Markscheme
Note: Award (M1) for correct substitution into their regression line.
[2 marks]
since this is interpolation and the correlation coefficient is large enough (R1)
OR
Note: Do not award (A1)(R0). The (R1) may be awarded for reasoning based on
strength of correlation, but do not accept “correlation coefficient is not strong
enough” or “correlation is not large enough”.
Award (A0)(R0) for this method if no numerical answer to part (a)(iii) is seen.
[2 marks]
Markscheme
(56.6 − 65)/65 × 100 (M1)
Note: Award (M1) for correct substitution into percentage error formula.
Follow through from part (c)(i).
Note: Follow through from part (c)(i). Condone use of percentage symbol.
Award (G0) for an answer of −12.9 with no working.
[2 marks]
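The percentage error substitution can be sketched as follows. It assumes that 56.6 is the estimated score and 65 the actual score, as the working suggests, and that the absolute value is intended, as in the usual percentage error formula; the function name `percentage_error` is hypothetical.

```python
def percentage_error(approximate, exact):
    """Percentage error of an approximate value relative to the exact value."""
    return abs((approximate - exact) / exact) * 100

error = percentage_error(56.6, 65)
print(round(error, 1))
```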
The relationship between x and y can be modelled by the regression line with
equation y = ax + b.
Markscheme
9.91044, −31.3194
[3 marks]
Markscheme
0.986417
r = 0.986 A1 N1
[1 mark]
(b) Another athlete on this sports team has a hand length of 21.5 cm.
Use the regression equation to estimate the height of this athlete. [2]
Markscheme
eg 9.91(21.5) − 31.3
181.755
182 (cm) A1 N2
[2 marks]
It is believed that the concentration of dissolved oxygen in the river varies linearly
with the temperature.
(a.i) For these data, find Pearson’s product-moment correlation
coefficient, r. [2]
Markscheme
Note: Award (A1) for an answer of 0.974 (minus sign omitted). Award (A1) for an
answer of −0.973 (incorrect rounding).
[2 marks]
(a.ii) For these data, find the equation of the regression line y on x. [2]
Markscheme
Note: Award (A1) for −0.365x, (A1) for 17.9. Award at most (A1)(A0) if not an
equation or if the values are reversed (eg y = 17.9x −0.365).
[2 marks]
Markscheme
Note: Award (M1) for correctly substituting 18 into their part (a)(ii).
(a) Plot and label the point M(m̄, p̄) on the scatter diagram. [2]
Markscheme
* This question is from an exam for a previous syllabus, and may contain minor
differences in marking or structure.
(A1)(A1) (C2)
Note: Award (A1) for mean point plotted and (A1) for labelled M.
[2 marks]
(b) Draw the line of best fit, by eye, on the scatter diagram. [2]
Markscheme
straight line through their mean point crossing the p-axis at 5±2 (A1)(ft)(A1)(ft)
(C2)
Note: Award (A1)(ft) for a straight line through their mean point. Award (A1)(ft)
for a correct p-intercept if line is extended.
[2 marks]
(c) Using your line of best fit, estimate the physics test score for a
student with a score of 20 in their mathematics test. [2]
Markscheme
point on line where m = 20 identified and an attempt to identify y-coordinate
(M1)
[2 marks]