Unit 6
1. We take a random sample of 18 STAT 2000 students and record the amount of time X
they spend studying for the final exam and their score on the exam Y. We would like to
conduct a hypothesis test to determine whether there exists a positive linear relationship
between study time and final exam score. The test statistic is calculated to be t = 2.90.
The P-value for the appropriate test of significance is:
(A) between 0.0025 and 0.005.
(B) between 0.005 and 0.01.
(C) between 0.01 and 0.02.
(D) between 0.025 and 0.05.
(E) between 0.05 and 0.10.
The next four questions (2 to 5) refer to the following:
To most Canadians, earthquakes are viewed as rare occurrences, but they are actually
quite common. In just one month in 2001, Natural Resources Canada recorded 215
earthquakes that affected Canada from B.C. to Nunavut to Newfoundland. We would
like to determine whether there is a linear relationship between the location of an earthquake
X (measured in degrees latitude north of the equator) and the magnitude of the
earthquake Y (measured on the Richter scale). The explanatory and response variables
are measured for a sample of 13 earthquakes. The equation of the least squares regression
line is ŷ = −3.05 + 0.10x. The ANOVA table is shown below:
Source of Variation    df    Sum of Squares    Mean Square    F
Regression
Error                                          1.40
Total                        25.10
2. One earthquake occurred at 58° North latitude and had a magnitude of 1.4. What is
the value of the residual for this earthquake?
(A) −2.75 (B) −1.35 (C) 0.65 (D) 1.35 (E) 2.75
3. What is the value of the sample correlation r?
(A) 0.62 (B) 0.39 (C) 0.78 (D) 0.80 (E) 0.59
4. What is the estimate of the parameter σ in the regression model?
(A) 1.18 (B) 1.40 (C) 2.06 (D) 2.54 (E) 3.12
5. What is the P-value for the appropriate test of significance?
(A) between 0.001 and 0.01
(B) between 0.01 and 0.025
(C) between 0.025 and 0.05
(D) between 0.05 and 0.10
(E) greater than 0.10
6. We take a random sample of individuals and measure the values of some explanatory
variable X and some response variable Y. Which of the following is the appropriate
linear regression model?
(A) ŷi = b0 + b1 xi + εi, where εi ∼ N(0, σ)
(B) yi = β0 + β1 xi + εi, where εi ∼ N(µ, σ)
(C) ŷi = β0 + β1 xi + εi, where εi ∼ N(0, 1)
(D) yi = β0 + β1 xi + εi, where εi ∼ N(0, σ)
(E) ŷi = b0 + b1 xi + εi, where εi ∼ N(µ, σ)
7. We take a random sample of 14 textbooks in the university bookstore and we record
the number of pages X and the price Y of each book. We conduct a test of H0 : β1 = 0
vs. Ha : β1 > 0 at the 2% level of significance to determine if there is a positive linear
relationship between the two variables. The test statistic is calculated to be t = 2.20.
Using the critical value method, the correct conclusion is to:
(A) fail to reject H0, since t < 2.282.
(B) reject H0, since t < 2.282.
(C) fail to reject H0, since t < 2.303.
(D) reject H0, since t < 2.303.
(E) fail to reject H0, since t < 2.681.
The next three questions (8 to 10) refer to the following:
Can the age of a cow be used to predict its milk production? The ages of eight cows (in
years) and their milk production (in gallons per week) are shown below:
Age 4 4 6 7 7 8 10 11
Milk Production 37.0 35.4 33.3 35.6 32.3 33.7 32.1 29.6
A regression analysis is conducted and the equation of the least squares regression line
is found to be ŷ = 39.297 − 0.796x. It is also determined that 73.2% of the variation in
milk production can be accounted for by age. We also calculate Σ(yi − ŷi)² = 7.90 and
Σ(xi − x̄)² = 44.9.
8. What is the sample correlation between age and milk production?
(A) −0.536 (B) −0.732 (C) −0.796 (D) −0.856 (E) −0.892
9. A 90% confidence interval for the parameter β1 in the linear regression model is:
(A) (−1.018, −0.574)
(B) (−1.129, −0.463)
(C) (−1.240, −0.352)
(D) (−1.351, −0.241)
(E) (−1.462, −0.130)
10. We conduct a hypothesis test of H0 : β1 = 0 vs. Ha : β1 < 0 to determine whether there
exists a negative linear relationship between the age of a cow and its milk production.
The P-value for the appropriate test of significance is:
(A) between 0.001 and 0.0025.
(B) between 0.0025 and 0.005.
(C) between 0.005 and 0.01.
(D) between 0.01 and 0.02.
(E) between 0.02 and 0.025.
The next four questions (11 to 14) refer to the following:
Can a student’s final exam score be predicted by his or her midterm score? The midterm
score and the final exam score (both out of 100) for a sample of nine STAT 2000 students
are recorded. A least squares regression analysis is conducted. The least squares
regression line is calculated to be ŷ = −0.97 + 0.98x. The ANOVA table (with some
values missing) is shown below:
Source of Variation    df    Sum of Squares    Mean Square    F
Regression
Error                  7     754
Total                        2673
11. One student in the sample had a final exam score of 90, and the value of the residual
for this student was 7.67. What was this student’s midterm score?
(A) 81 (B) 82 (C) 83 (D) 84 (E) 85
12. What is the value of the sample correlation between midterm score and final exam score?
(A) 0.282 (B) 0.531 (C) 0.656 (D) 0.718 (E) 0.847
13. What is the estimate of the parameter σ in the simple linear regression model?
(A) 6.3 (B) 7.5 (C) 8.1 (D) 9.2 (E) 10.4
14. We would like to conduct a hypothesis test to determine whether there exists a linear
relationship between midterm score and final exam score. The P-value for the appropriate
test of significance is:
(A) less than 0.001.
(B) between 0.001 and 0.01.
(C) between 0.01 and 0.025.
(D) between 0.025 and 0.05.
(E) between 0.05 and 0.10.
15. Which of the following statements about residuals is false?
(A) The least squares regression line minimizes the sum of squared residuals.
(B) Residuals are used to estimate the parameter σ in the linear regression model.
(C) The sum of residuals in least squares regression is always equal to zero.
(D) A negative residual indicates that a point falls below the least squares regression
line.
(E) A residual plot that displays a random scatter of points is a good indication that
the assumptions in the linear regression model are invalid.
The next two questions (16 and 17) refer to the following:
We record the salary X (in $millions) and the number of points scored last season for a
sample of 15 NHL players. A regression analysis is conducted and the equation of the
least squares regression line is calculated to be ŷ = 26.94 + 4.87x. We also calculate
x̄ = 5.23, ȳ = 52.40, Σ(yi − ŷi)² = 4417.17 and Σ(xi − x̄)² = 96.93.
16. We conduct a hypothesis test of H0 : β1 = 0 vs. Ha : β1 > 0 to determine whether there
exists a positive linear relationship between a player’s salary and the number of points
he scores. The P-value for the appropriate test of significance is:
(A) between 0.01 and 0.02.
(B) between 0.02 and 0.025.
(C) between 0.025 and 0.05.
(D) between 0.05 and 0.10.
(E) between 0.10 and 0.15.
17. Which of the following is a 90% confidence interval for the mean number of points scored by
players who earn $3 million per year?
(A) (33.12, 49.98)
(B) (32.19, 50.91)
(C) (31.43, 51.67)
(D) (30.34, 52.76)
(E) (29.18, 53.92)
The next three questions (18 to 20) refer to the following:
A statistician wanted to determine if the demographic variables of age, education, and
income influence the number of hours of television watched per week. A random sample
of 25 adults was selected to estimate the multiple regression model:
Y = β0 + β1 X1 + β2 X2 + β3 X3 + ε,
where Y is the number of hours of television watched last week, X1 is the age (in years),
X2 is the number of years of education, and X3 is income (in $1000s). The ANOVA
table (with some values missing) is shown below:
Source of Variation    df    Sum of Squares    Mean Square    F
Regression
Error                                          20.286
Total                        653
18. In order to test whether any of the explanatory variables are important in predicting the
response variable Y, the statement of the null hypothesis is:
(A) H0 : β0 = 0
(B) H0 : β0 = β1 = β2 = β3 = 0
(C) H0 : β1 = β2 = β3 = 0
(D) H0 : β0 = β1 = β2 = β3 ≠ 0
(E) H0 : at least one of β1 , β2 , β3 is not 0.
19. In order to test whether any of the explanatory variables are important in predicting the
response variable Y, at the 5% level of significance, the critical region of the test is:
(A) F > 2.36 (B) F < 3.07 (C) F > 3.48 (D) F < 3.73 (E) F > 3.07
20. In order to test whether any of the explanatory variables are important in predicting the
response variable Y, the observed value of the test statistic is:
(A) 3.07 (B) 3.73 (C) 20.28 (D) 75.67 (E) 3.48
Answers
1. B 11. E
2. B 12. E
3. A 13. E
4. A 14. B
5. B 15. E
6. D 16. A
7. C 17. D
8. D 18. C
9. B 19. E
10. A 20. B
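
Several of the numerical answers above can be checked with a few lines of Python. The sketch below is one possible way to do so, not the course's official solution. It uses only the summary statistics given in the questions; the helper names (slope_se, mean_response_ci, complete_anova) are hypothetical, and the t critical values are assumed to be read from a standard t table rather than computed.

```python
import math

def slope_se(sse, sxx, n):
    """Standard error of the slope b1: s / sqrt(Sxx), where s^2 = SSE / (n - 2)."""
    s = math.sqrt(sse / (n - 2))
    return s / math.sqrt(sxx)

def mean_response_ci(b0, b1, x_star, x_bar, sse, sxx, n, t_star):
    """Confidence interval for the mean response at x = x_star."""
    s = math.sqrt(sse / (n - 2))
    y_hat = b0 + b1 * x_star
    se = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)
    return (y_hat - t_star * se, y_hat + t_star * se)

def complete_anova(sst, mse, n, k):
    """Fill in ANOVA quantities from SST and MSE (n observations, k predictors)."""
    df_error = n - k - 1
    sse = mse * df_error
    ssr = sst - sse
    return {
        "SSE": sse,
        "SSR": ssr,
        "F": (ssr / k) / mse,
        "r2": ssr / sst,
        "s": math.sqrt(mse),
    }

# Upper 0.05 critical values copied from a standard t table (assumed inputs).
T_05_DF6 = 1.943    # df = 8 - 2
T_05_DF13 = 1.771   # df = 15 - 2

# Questions 9 and 10 (cows): b1 = -0.796, SSE = 7.90, Sxx = 44.9, n = 8.
se_cow = slope_se(7.90, 44.9, 8)
t_cow = -0.796 / se_cow
ci_cow = (-0.796 - T_05_DF6 * se_cow, -0.796 + T_05_DF6 * se_cow)

# Questions 16 and 17 (NHL): b0 = 26.94, b1 = 4.87, SSE = 4417.17, Sxx = 96.93, n = 15.
t_nhl = 4.87 / slope_se(4417.17, 96.93, 15)
ci_nhl = mean_response_ci(26.94, 4.87, 3, 5.23, 4417.17, 96.93, 15, T_05_DF13)

# Questions 3 and 4 (earthquakes) and question 20 (television).
quake = complete_anova(25.10, 1.40, 13, 1)
tv = complete_anova(653, 20.286, 25, 3)

print(f"cows: t = {t_cow:.2f}, 90% CI for slope = ({ci_cow[0]:.3f}, {ci_cow[1]:.3f})")
print(f"NHL: t = {t_nhl:.2f}, 90% CI for mean = ({ci_nhl[0]:.2f}, {ci_nhl[1]:.2f})")
print(f"earthquakes: r = {math.sqrt(quake['r2']):.2f}, s = {quake['s']:.2f}")
print(f"television: F = {tv['F']:.2f}")
```

The intervals round to (−1.129, −0.463) and (30.34, 52.76), in line with answers 9 and 17. The P-value ranges in the key still have to be read from t and F tables using the printed test statistics.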