THE HONG KONG POLYTECHNIC UNIVERSITY
Department of Applied Mathematics
Subject Code: AMA3602
Subject Title: Applied Linear Models for Finance Analytics
Date: June, 2023
Time:
Time Allowed: 2 hours
This question paper has 18 pages and is closed-book.
Please write your answers on the answer sheets provided.
Do not write your answers on this paper.
Instructions: This paper has SIX questions. Make sure to complete all sub-questions.
Subject Examiner: Dr. Catherine LIU
Subject Moderator: Dr. HAN Ruijian
Student ID:
Student Name:
DO NOT TURN OVER THE PAGE UNTIL YOU ARE TOLD TO DO SO.
Question 1: True or False Questions
Judge whether each statement below is true or false by writing T (true) or F (false) on
your answer sheet.
Sub-questions 1.1-1.3 concern the multiple linear regression (MLR) model that we have learned in
Chapter 2, which relates the response y to the regressors x1 , · · · , xk based on the data set
(yi , xi1 , · · · , xik ), i = 1, · · · , n, in the form yi = β0 + xi1 β1 + · · · + xik βk + ϵi , for i = 1, · · · , n.
In matrix notation, the MLR model above is written as Y = Xβ + ϵ, where X is called the design
matrix and Y and ϵ are n-dimensional vectors.
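For reference, the following is an editorial sketch (not part of the original paper) of the least-squares quantities behind sub-questions 1.1-1.3, using a simulated design matrix in R:
# Editorial sketch (illustrative only): least-squares estimator and hat matrix
# for a generic design matrix X with an intercept column.
set.seed(1)
n <- 20; k <- 3
X <- cbind(1, matrix(rnorm(n*k), n, k))       # design matrix (intercept plus k regressors)
beta <- c(2, 1, -1, 0.5)
y <- drop(X %*% beta) + rnorm(n)              # simulated responses
beta.hat <- solve(t(X) %*% X) %*% t(X) %*% y  # LSE: (X'X)^{-1} X'y
H <- X %*% solve(t(X) %*% X) %*% t(X)         # hat matrix H = X(X'X)^{-1}X'
all.equal(drop(beta.hat), unname(coef(lm(y ~ X[, -1]))))  # agrees with lm()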
1.1) The hat matrix H = X(XT X)−1 XT maps the vector of fitted values into a vector of observed
values. [1 mark]
1.2) The least squares estimator β̂ is the best unbiased estimator of the coefficient vector β. [1 mark]
1.3) For the regression coefficient vector β, the maximum likelihood estimators are identical to the
least squares estimators provided that the distribution of the model errors is given and the errors
are independent and identically distributed. [1 mark]
1.4) Under the setting of a generalized linear model, the inverse function of a link function is also
called a link function since a link function is monotonic. [1 mark]
1.5) For the Laird-Ware two-level linear mixed-effects model
yij = β1 x1ij + · · · + βp xpij + b1i z1ij + · · · + bqi zqij + ϵij , i = 1, · · · , N, j = 1, · · · , ni
one knows that the regression coefficients β1 , · · · , βp are the fixed-effect coefficients, and b1i , · · · , bqi
are the random-effect coefficients for cluster i; x1ij , · · · , xpij are fixed constant regressors, and
z1ij , · · · , zqij are the random-effect regressors. Typically {x1ij , · · · , xpij } is a subset of {z1ij , · · · , zqij }.
[1 mark]
1.6) For a qualitative variable with m levels involved in a multiple linear regression, one can represent
it by m − 1 indicator variables. [1 mark]
1.7) To deal with the multiple linear regression model with qualitative predictor variables, we will use
indicator variables or dummy variables. One of the advantages of using indicator variables is that
we can use the extra-sum-of-squares method to conduct hypothesis tests directly. For example,
we can directly use the extra-sum-of-squares method to test whether or not the regression lines
have a common slope but possibly different intercepts. [1 mark]
1.8) The Akaike information criterion places a greater penalty on adding regressors as the sample
size increases than the Bayesian information criterion does. [1 mark]
1.9) Collecting additional data may be the best method of combating multicollinearity if such data
are collected in a manner designed to break up the multicollinearity in the existing data.
[1 mark]
1.10) The ridge estimator produces the vector of regression coefficients with the largest norm consistent
with a specified decrease in the residual sum of squares. [1 mark]
Questions 2 to 6 are based on the data set below:
A data analyst from a car company is interested in examining the relationship between design
style and fuel consumption performance for 32 automobiles. The data set contains 32 observations
on 11 variables.
Notation Variable Description
Y mpg Miles/(US) gallon
X1 cyl Number of cylinders
X2 hp Gross horsepower
X3 wt Weight (1000 lbs)
X4 am Transmission (0 = automatic, 1 = manual)
X5 disp Displacement (cu. in.)
X6 drat Rear axle ratio
X7 qsec 1/4 mile time
X8 vs V/S
X9 gear Number of forward gears
X10 carb Number of carburetors
First, we focus on the association between the response variable mpg (Y ) and the explanatory
variables cyl (X1 ), hp (X2 ), and wt (X3 ).
To answer Question 2, refer to Appendix: Code and Output, Part I.
Question 2: Multiple Linear Regression (MLR) [Total: 20 marks]
2.1 Fit an MLR model relating Y to X1 , X2 , and X3 , based on the result of least-squares
estimation (LSE).
[3 marks]
2.2 Construct the analysis-of-variance table for the model above. [5 marks]
2.3 Test for significance of regression by the p-value method.
Hint: give the full procedure, including the null and alternative hypotheses, the test statistic,
the decision rule, etc.
[5 marks]
2.4 Use the general linear hypothesis approach to test H0 : β1 − β2 = 0, β2 + β3 = 0 by
transforming it into H0 : Tβ = 0 (α = 0.05). [7 marks]
To answer Question 3, refer to Appendix: Code and Output, Part II.
Question 3: Indicator Variables [Total: 15 marks]
We are now interested in including the indicator variable am (X4 ) and studying its effects on the
response. We still assume an MLR model:
Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + ε. (1)
3.1 Fit an MLR model relating Y to the original three explanatory variables and X4 , based on
the result of least-squares estimation (LSE). [3 marks]
3.2 Based on the corresponding code and output, what can you say about the difference between
the two regression lines when X4 = 0 and X4 = 1, and how do you interpret the meaning of
the new β4 term? [(3+3) marks]
3.3 Add a new interaction term between X1 and X4 to model (1), that is
Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + β14 X1 X4 + ε. (2)
Fit the model based on the result of least-squares estimation (LSE). What can you conclude
about the significance of this interaction term (α = 0.05)?
[(2+4) marks]
To answer Question 4, refer to Appendix: Code and Output, Part III.
Question 4: Multicollinearity and Ridge Regression [Total: 15 marks]
We wish to inspect the entire data set for possible multicollinearity issues. Therefore, we now
consider relating Y to X1 -X10 , all 10 explanatory variables in the data set, and examine whether
there is multicollinearity among them.
4.1 Based on the matrix of correlations between all possible regressors (10 in total), which
variables are highly correlated? What are the major characteristics of such correlations?
[(3+2) marks]
4.2 Find the variance inflation factors (VIFs). Based on the VIFs, is there evidence of
multicollinearity in these data, and why?
[(2+3) marks]
4.3 To combat the problem of multicollinearity, we fit ridge regression models for different
values of k. Based on the ridge trace plot, which value of k is appropriate and why?
[5 marks]
To answer Question 5, refer to Appendix: Code and Output, Part IV.
Question 5: Variable Selection [Total: 20 marks]
We have detected the existence of multicollinearity in the data set. Now we employ variable
selection methods to obtain reduced models to fit the data.
5.1 This sub-question is based on code block 1 under Part IV (refer to pages 14-16). Based
on the output, which selection method is used to select the appropriate model (forward,
backward, or both)? Which criterion is the selection based on (AIC or BIC)?
[(3+3) marks]
5.2 This sub-question is based on code block 2 under Part IV (refer to pages 16-17). Based
on the output, which selection method is used to select the appropriate model (forward,
backward, or both)? Which criterion is the selection based on (AIC or BIC)?
[(3+3) marks]
5.3 Determine the models selected by the above two procedures. [(2+2) marks]
5.4 What is one key similarity among the Mallows' Cp, AIC, and BIC methods? What is one key
difference among these methods? [(2+2) marks]
To answer Question 6, refer to Appendix: Code and Output, Part V.
Question 6: Random Effects Model [Total: 20 marks]
Suppose the observations we obtained are grouped by the variable am (X4 ). Now we want to
study the fixed effects of cyl (X1 ) and hp (X2 ) on the response mpg (Y ). The random effects of
am (X4 ) should be included in the model.
6.1 Based on the result of residual maximum likelihood (REML), fit a random effects model
relating Y to the fixed effects of X1 and X2 and the random effects of X4 . [4 marks]
6.2 Examine the fitted model. Based on the output, which fixed effects should we keep in the
model at the significance level α = 0.05?
Hint: there is no need to give details of the test procedure. [4 marks]
6.3 Based on the result of maximum likelihood (MLE), fit a random effects model relating Y
to the fixed effects of X1 and X2 and the random effects of X4 . [4 marks]
6.4 Compare the two models fitted by different methods. What is one disadvantage of the MLE
method when compared to the REML method? [(4+4) marks]
END
Appendix: Code and Output
Part I
data(mtcars)
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
head(mtcars,5)
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
attach(mtcars)
model1 <- lm(mpg~cyl+hp+wt)
summary(model1)
##
## Call:
## lm(formula = mpg ~ cyl + hp + wt)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9290 -1.5598 -0.5311 1.1850 5.8986
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 38.75179 1.78686 21.687 < 2e-16 ***
## cyl -0.94162 0.55092 -1.709 0.098480 .
## hp -0.01804 0.01188 -1.519 0.140015
## wt -3.16697 0.74058 -4.276 0.000199 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.512 on 28 degrees of freedom
## Multiple R-squared: 0.8431, Adjusted R-squared: 0.8263
## F-statistic: 50.17 on 3 and 28 DF, p-value: 2.184e-11
model2 <- lm(mpg~cbind(cyl,hp,wt))
anova(model2)
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## cbind(cyl, hp, wt) 3 949.43 316.48 50.172 2.184e-11 ***
## Residuals 28 176.62 6.31
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
fmodel <- lm(mpg~cyl+hp+wt)
anova(fmodel)
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## cyl 1 817.71 817.71 129.6336 5.093e-12 ***
## hp 1 16.36 16.36 2.5935 0.1185183
## wt 1 115.35 115.35 18.2873 0.0001995 ***
## Residuals 28 176.62 6.31
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
z1 <- cyl+hp
z2 <- hp-wt
rmodel <- lm(mpg~z1+z2)
anova(rmodel)
## Analysis of Variance Table
##
## Response: mpg
## Df Sum Sq Mean Sq F value Pr(>F)
## z1 1 687.45 687.45 99.336 7.117e-11 ***
## z2 1 237.91 237.91 34.377 2.320e-06 ***
## Residuals 29 200.69 6.92
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
qf(0.95,2,29)
## [1] 3.327654
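The following editorial sketch (not part of the original output) shows one way the F statistic for the general linear hypothesis H0 : Tβ = 0 in Question 2.4 could be computed directly from the full model fitted above; the matrix T.mat below encodes the two constraints under the assumed coefficient ordering (β0, β1, β2, β3) of coef(fmodel).
# Editorial sketch: F statistic for H0: T beta = 0, computed from the full fit fmodel.
T.mat <- rbind(c(0, 1, -1, 0),    # beta1 - beta2 = 0
               c(0, 0,  1, 1))    # beta2 + beta3 = 0
b      <- coef(fmodel)                      # (beta0, beta1, beta2, beta3)
XtXinv <- summary(fmodel)$cov.unscaled      # (X'X)^{-1}
MSres  <- summary(fmodel)$sigma^2           # MS_Res of the full model
r      <- nrow(T.mat)                       # number of constraints
F.stat <- t(T.mat %*% b) %*% solve(T.mat %*% XtXinv %*% t(T.mat)) %*% (T.mat %*% b) / (r * MSres)
F.stat                                      # compare with the appropriate F critical value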
Part II
model3 <- lm(mpg~cyl+hp+wt+am)
summary(model3)
##
## Call:
## lm(formula = mpg ~ cyl + hp + wt + am)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4765 -1.8471 -0.5544 1.2758 5.6608
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.14654 3.10478 11.642 4.94e-12 ***
## cyl -0.74516 0.58279 -1.279 0.2119
## hp -0.02495 0.01365 -1.828 0.0786 .
## wt -2.60648 0.91984 -2.834 0.0086 **
## am 1.47805 1.44115 1.026 0.3142
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.509 on 27 degrees of freedom
## Multiple R-squared: 0.849, Adjusted R-squared: 0.8267
## F-statistic: 37.96 on 4 and 27 DF, p-value: 1.025e-10
model4 <- lm(mpg~cyl+hp+wt+am+cyl*am)
summary(model4)
##
## Call:
## lm(formula = mpg ~ cyl + hp + wt + am + cyl * am)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0803 -1.4815 -0.7691 1.3676 5.2401
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 34.65302 3.24278 10.686 5.23e-11 ***
## cyl -0.66455 0.57644 -1.153 0.25946
## hp -0.01641 0.01480 -1.109 0.27760
## wt -2.72198 0.90900 -2.994 0.00596 **
## am 6.32643 3.80414 1.663 0.10831
## cyl:am -0.89996 0.65523 -1.373 0.18133
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.469 on 26 degrees of freedom
## Multiple R-squared: 0.8592, Adjusted R-squared: 0.8322
## F-statistic: 31.74 on 5 and 26 DF, p-value: 2.791e-10
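As an editorial note (not part of the original output), the significance of the interaction term could equivalently be checked with an extra-sum-of-squares (partial F) test comparing the fits with and without the cyl:am term:
# Editorial sketch: partial F test for the cyl:am interaction (model3 vs model4).
anova(model3, model4)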
Part III
cor(mtcars[,-1])
## cyl disp hp drat wt qsec
## cyl 1.0000000 0.9020329 0.8324475 -0.69993811 0.7824958 -0.59124207
## disp 0.9020329 1.0000000 0.7909486 -0.71021393 0.8879799 -0.43369788
## hp 0.8324475 0.7909486 1.0000000 -0.44875912 0.6587479 -0.70822339
## drat -0.6999381 -0.7102139 -0.4487591 1.00000000 -0.7124406 0.09120476
## wt 0.7824958 0.8879799 0.6587479 -0.71244065 1.0000000 -0.17471588
## qsec -0.5912421 -0.4336979 -0.7082234 0.09120476 -0.1747159 1.00000000
## vs -0.8108118 -0.7104159 -0.7230967 0.44027846 -0.5549157 0.74453544
## am -0.5226070 -0.5912270 -0.2432043 0.71271113 -0.6924953 -0.22986086
## gear -0.4926866 -0.5555692 -0.1257043 0.69961013 -0.5832870 -0.21268223
## carb 0.5269883 0.3949769 0.7498125 -0.09078980 0.4276059 -0.65624923
## vs am gear carb
## cyl -0.8108118 -0.52260705 -0.4926866 0.52698829
## disp -0.7104159 -0.59122704 -0.5555692 0.39497686
## hp -0.7230967 -0.24320426 -0.1257043 0.74981247
## drat 0.4402785 0.71271113 0.6996101 -0.09078980
## wt -0.5549157 -0.69249526 -0.5832870 0.42760594
## qsec 0.7445354 -0.22986086 -0.2126822 -0.65624923
## vs 1.0000000 0.16834512 0.2060233 -0.56960714
## am 0.1683451 1.00000000 0.7940588 0.05753435
## gear 0.2060233 0.79405876 1.0000000 0.27407284
## carb -0.5696071 0.05753435 0.2740728 1.00000000
library(car)
## Warning: package 'car' was built under R version 4.2.2
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.2.2
vif(lm(mpg~.,data=mtcars))
## cyl disp hp drat wt qsec vs am
## 15.373833 21.620241 9.832037 3.374620 15.164887 7.527958 4.965873 4.648487
## gear carb
## 5.357452 7.908747
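As an editorial illustration (not part of the original output), each VIF equals 1/(1 - R_j^2), where R_j^2 is obtained by regressing the j-th regressor on the remaining regressors; for example, for wt:
# Editorial sketch: reproduce the VIF of wt by hand.
r2.wt <- summary(lm(wt ~ ., data = mtcars[, -1]))$r.squared  # regress wt on the other nine regressors
1/(1 - r2.wt)                                                # about 15.16, matching vif() above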
X <- as.matrix(mtcars[,-1])
Y <- scale(as.matrix(mtcars[,1]))
XtX <- cor(X)
XtY <- cor(X,Y)
X <- X/sqrt(colSums(X^2))
Y <- Y/sqrt(sum(Y^2))
p <- ncol(X)
n <- nrow(X)
k.cand <- c(2^c(0:6)*1e-3)
beta.ridge.hist <- msres.hist <- NULL
for(k in k.cand){
beta.tmp <- solve(XtX+k*diag(p))%*%XtY
msres.tmp <- t((Y-X%*%beta.tmp))%*%(Y-X%*%beta.tmp)/(n-p-1)
beta.ridge.hist <- cbind(beta.ridge.hist,beta.tmp)
msres.hist <- c(msres.hist,msres.tmp)
}
beta.ridge <- as.data.frame(beta.ridge.hist)
names(beta.ridge) <- k.cand
tab <- rbind(beta.ridge,msres.hist)
row.names(tab)[11] <- "MS_Res"
round(tab,3)
## 0.001 0.002 0.004 0.008 0.016 0.032 0.064
## cyl -0.033 -0.032 -0.032 -0.033 -0.038 -0.049 -0.066
## disp 0.258 0.243 0.217 0.174 0.114 0.046 -0.017
## hp -0.239 -0.234 -0.224 -0.210 -0.190 -0.169 -0.153
## drat 0.071 0.072 0.073 0.076 0.079 0.082 0.085
## wt -0.589 -0.577 -0.554 -0.516 -0.462 -0.397 -0.333
## qsec 0.238 0.232 0.222 0.205 0.179 0.145 0.108
## vs 0.027 0.027 0.027 0.027 0.029 0.031 0.037
## am 0.208 0.207 0.206 0.203 0.198 0.191 0.180
## gear 0.081 0.082 0.083 0.084 0.085 0.084 0.080
## carb -0.061 -0.068 -0.081 -0.101 -0.128 -0.155 -0.174
## MS_Res 186.646 167.061 135.041 91.068 46.239 18.981 16.878
matplot(k.cand,t(beta.ridge),type="l",xlab="k",ylab=expression(hat(beta)[R]),main="Ridge trace plot")
[Figure: Ridge trace plot. Ridge coefficient estimates β̂R plotted against k, for k between 0 and 0.064.]
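An equivalent ridge trace could also be produced with MASS::lm.ridge (editorial sketch, not part of the original output; note that lm.ridge standardizes the predictors internally, so its lambda scale need not match the manual scaling above exactly):
# Editorial sketch: ridge trace via MASS::lm.ridge over a comparable grid of k values.
library(MASS)
ridge.fit <- lm.ridge(mpg ~ ., data = mtcars, lambda = seq(0, 0.064, by = 0.001))
plot(ridge.fit)   # coefficient paths against lambda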
Part IV
# Code block 1
full <- lm(mpg~.,data=mtcars)
step(full,direction="backward",k=2)
## Start: AIC=70.9
## mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb
##
## Df Sum of Sq RSS AIC
## - cyl 1 0.0799 147.57 68.915
## - vs 1 0.1601 147.66 68.932
## - carb 1 0.4067 147.90 68.986
## - gear 1 1.3531 148.85 69.190
## - drat 1 1.6270 149.12 69.249
## - disp 1 3.9167 151.41 69.736
## - hp 1 6.8399 154.33 70.348
## - qsec 1 8.8641 156.36 70.765
## <none> 147.49 70.898
## - am 1 10.5467 158.04 71.108
## - wt 1 27.0144 174.51 74.280
##
## Step: AIC=68.92
## mpg ~ disp + hp + drat + wt + qsec + vs + am + gear + carb
##
## Df Sum of Sq RSS AIC
## - vs 1 0.2685 147.84 66.973
## - carb 1 0.5201 148.09 67.028
## - gear 1 1.8211 149.40 67.308
## - drat 1 1.9826 149.56 67.342
## - disp 1 3.9009 151.47 67.750
## - hp 1 7.3632 154.94 68.473
## <none> 147.57 68.915
## - qsec 1 10.0933 157.67 69.032
## - am 1 11.8359 159.41 69.384
## - wt 1 27.0280 174.60 72.297
##
## Step: AIC=66.97
## mpg ~ disp + hp + drat + wt + qsec + am + gear + carb
##
## Df Sum of Sq RSS AIC
## - carb 1 0.6855 148.53 65.121
## - gear 1 2.1437 149.99 65.434
## - drat 1 2.2139 150.06 65.449
## - disp 1 3.6467 151.49 65.753
## - hp 1 7.1060 154.95 66.475
## <none> 147.84 66.973
## - am 1 11.5694 159.41 67.384
## - qsec 1 15.6830 163.53 68.200
## - wt 1 27.3799 175.22 70.410
##
## Step: AIC=65.12
## mpg ~ disp + hp + drat + wt + qsec + am + gear
##
## Df Sum of Sq RSS AIC
## - gear 1 1.565 150.09 63.457
## - drat 1 1.932 150.46 63.535
## <none> 148.53 65.121
## - disp 1 10.110 158.64 65.229
## - am 1 12.323 160.85 65.672
## - hp 1 14.826 163.35 66.166
## - qsec 1 26.408 174.94 68.358
## - wt 1 69.127 217.66 75.350
##
## Step: AIC=63.46
## mpg ~ disp + hp + drat + wt + qsec + am
##
## Df Sum of Sq RSS AIC
## - drat 1 3.345 153.44 62.162
## - disp 1 8.545 158.64 63.229
## <none> 150.09 63.457
## - hp 1 13.285 163.38 64.171
## - am 1 20.036 170.13 65.466
## - qsec 1 25.574 175.67 66.491
## - wt 1 67.572 217.66 73.351
##
## Step: AIC=62.16
## mpg ~ disp + hp + wt + qsec + am
##
## Df Sum of Sq RSS AIC
## - disp 1 6.629 160.07 61.515
## <none> 153.44 62.162
## - hp 1 12.572 166.01 62.682
## - qsec 1 26.470 179.91 65.255
## - am 1 32.198 185.63 66.258
## - wt 1 69.043 222.48 72.051
##
## Step: AIC=61.52
## mpg ~ hp + wt + qsec + am
##
## Df Sum of Sq RSS AIC
## - hp 1 9.219 169.29 61.307
## <none> 160.07 61.515
## - qsec 1 20.225 180.29 63.323
## - am 1 25.993 186.06 64.331
## - wt 1 78.494 238.56 72.284
##
## Step: AIC=61.31
## mpg ~ wt + qsec + am
##
## Df Sum of Sq RSS AIC
## <none> 169.29 61.307
## - am 1 26.178 195.46 63.908
## - qsec 1 109.034 278.32 75.217
## - wt 1 183.347 352.63 82.790
##
## Call:
## lm(formula = mpg ~ wt + qsec + am, data = mtcars)
##
## Coefficients:
## (Intercept) wt qsec am
## 9.618 -3.917 1.226 2.936
# Code block 2
null <- lm(mpg~1)
step(null,scope =list(upper=full),direction="forward",k=log(32))
## Start: AIC=117.41
## mpg ~ 1
##
## Df Sum of Sq RSS AIC
## + wt 1 847.73 278.32 76.149
## + cyl 1 817.71 308.33 79.426
## + disp 1 808.89 317.16 80.329
## + hp 1 678.37 447.67 91.358
## + drat 1 522.48 603.57 100.919
## + vs 1 496.53 629.52 102.267
## + am 1 405.15 720.90 106.604
## + carb 1 341.78 784.27 109.300
## + gear 1 259.75 866.30 112.483
## + qsec 1 197.39 928.66 114.708
## <none> 1126.05 117.409
##
## Step: AIC=76.15
## mpg ~ wt
##
## Df Sum of Sq RSS AIC
## + cyl 1 87.150 191.17 67.595
## + hp 1 83.274 195.05 68.237
## + qsec 1 82.858 195.46 68.306
## + vs 1 54.228 224.09 72.680
## + carb 1 44.602 233.72 74.026
## + disp 1 31.639 246.68 75.753
## <none> 278.32 76.149
## + drat 1 9.081 269.24 78.553
## + gear 1 1.137 277.19 79.484
## + am 1 0.002 278.32 79.614
##
## Step: AIC=67.6
## mpg ~ wt + cyl
##
## Df Sum of Sq RSS AIC
## <none> 191.17 67.595
## + hp 1 14.5514 176.62 68.528
## + carb 1 13.7724 177.40 68.668
## + qsec 1 10.5674 180.60 69.241
## + gear 1 3.0281 188.14 70.550
## + disp 1 2.6796 188.49 70.609
## + vs 1 0.7059 190.47 70.943
## + am 1 0.1249 191.05 71.040
## + drat 1 0.0010 191.17 71.061
##
## Call:
## lm(formula = mpg ~ wt + cyl)
##
## Coefficients:
## (Intercept) wt cyl
## 39.686 -3.191 -1.508
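Relevant to Question 5.4, an all-subsets search scored by Mallows' Cp could be run as follows (editorial sketch assuming the leaps package; not part of the original output):
# Editorial sketch: best-subsets selection scored by Mallows' Cp.
library(leaps)
subsets <- regsubsets(mpg ~ ., data = mtcars, nvmax = 10)
summary(subsets)$cp   # Cp of the best model of each size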
Part V
library(lme4)
## Warning: package 'lme4' was built under R version 4.2.2
## Loading required package: Matrix
## Warning: package 'Matrix' was built under R version 4.2.2
model5 <- lmer(mpg~cyl+hp+(1|am))
summary(model5)
## Linear mixed model fit by REML ['lmerMod']
## Formula: mpg ~ cyl + hp + (1 | am)
##
## REML criterion at convergence: 163.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.71357 -0.62706 -0.08133 0.50456 2.19282
##
## Random effects:
## Groups Name Variance Std.Dev.
## am (Intercept) 6.781 2.604
## Residual 7.877 2.807
## Number of obs: 32, groups: am, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 33.28911 2.96130 11.241
## cyl -1.25266 0.62163 -2.015
## hp -0.03492 0.01439 -2.427
##
## Correlation of Fixed Effects:
## (Intr) cyl
## cyl -0.670
## hp 0.377 -0.851
ranef(model5)
## $am
## (Intercept)
## 0 -1.736842
## 1 1.736842
##
## with conditional variances for "am"
drop1(model5,test="Chisq")
## Single term deletions
##
## Model:
## mpg ~ cyl + hp + (1 | am)
## npar AIC LRT Pr(Chi)
## <none> 168.67
## cyl 1 171.25 4.5812 0.03232 *
## hp 1 171.06 4.3945 0.03605 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(lme4)
model6 <- lmer(mpg~cyl+hp+(1|am),REML=FALSE)
summary(model6)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: mpg ~ cyl + hp + (1 | am)
##
## AIC BIC logLik deviance df.resid
## 168.7 176.0 -79.3 158.7 27
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.74628 -0.75289 -0.09947 0.49059 2.31757
##
## Random effects:
## Groups Name Variance Std.Dev.
## am (Intercept) 2.677 1.636
## Residual 7.401 2.720
## Number of obs: 32, groups: am, 2
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 33.76715 2.48810 13.571
## cyl -1.38633 0.58933 -2.352
## hp -0.03283 0.01381 -2.378
##
## Correlation of Fixed Effects:
## (Intr) cyl
## cyl -0.752
## hp 0.414 -0.848
ranef(model6)
## $am
## (Intercept)
## 0 -1.507437
## 1 1.507437
##
## with conditional variances for "am"
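As an editorial sketch (not part of the original output), the REML fit (model5) and the ML fit (model6) could be compared side by side through their fixed effects and variance components:
# Editorial sketch: compare the REML and ML fits.
cbind(REML = fixef(model5), MLE = fixef(model6))  # fixed-effect estimates
VarCorr(model5)                                   # REML variance components
VarCorr(model6)                                   # ML variance components (smaller, since ML
                                                  # ignores the df used in estimating the fixed effects)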