Section 3
Craig Anderson
The general linear mixed model
2/158
Example
yij = αi + bj + eij
3/158
Example
Model components
y = (y11, y12, y21, y22, y31, y32)| ;   β = (α1, α2, α3)| ;   u = (b1, b2)| .
4/158
Example
Model components
X =
  1 0 0
  1 0 0
  0 1 0
  0 1 0
  0 0 1
  0 0 1
;   Z =
  1 0
  0 1
  1 0
  0 1
  1 0
  0 1
.
5/158
Normal model
The mixed model y = Xβ + Zu + e can equivalently be written as
y = Xβ + e∗
where e∗ = Zu + e.
6/158
Variance terms in the example
G = σB² I2 =
  σB²   0
  0    σB²
;   R = σE² I6 =
  σE²   0   ...   0
  0    σE²  ...   0
  ...
  0     0   ...  σE²
7/158
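The marginal variance V = Var(y) = ZGZ| + R for this example is easy to check numerically. A minimal sketch (the deck's examples use R/lme4; this numeric check uses Python/numpy, and the values of σB² and σE² are illustrative assumptions, not from the slides):

```python
import numpy as np

# Design matrix Z from the example: observations y11, y12, ..., y32,
# with random effect b_j for j = 1, 2 (columns of Z)
Z = np.array([[1, 0], [0, 1],
              [1, 0], [0, 1],
              [1, 0], [0, 1]], dtype=float)

sigma2_B, sigma2_E = 2.0, 1.0      # illustrative values (assumed)
G = sigma2_B * np.eye(2)           # G = sigma_B^2 I_2
R = sigma2_E * np.eye(6)           # R = sigma_E^2 I_6

V = Z @ G @ Z.T + R                # marginal variance of y

# Observations sharing a random effect have covariance sigma_B^2;
# each diagonal entry is sigma_B^2 + sigma_E^2.
print(np.diag(V))
```

Reading off V confirms the block structure: variance σB² + σE² on the diagonal, covariance σB² between observations in the same block, zero otherwise.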
Maximum likelihood for fixed effects
8/158
Variance component estimation
9/158
Maximum likelihood for variance matrices
The maximum likelihood estimates are obtained under the marginal model
y ∼ N(Xβ, V).
10/158
Maximum likelihood for variance matrices
For any fixed V, l(β, V) is maximised over β by
β̃ = (X| V−1 X)−1 X| V−1 y.
Substituting β̃ back gives the profile log-likelihood
l(β̃, V) = −(1/2) log |V| − (1/2) y| V−1 [I − X(X| V−1 X)−1 X| V−1 ]y + const.
12/158
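To see the GLS formula β̃ = (X| V−1 X)−1 X| V−1 y in action, here is a small numerical sketch (Python/numpy rather than the deck's R; the design, true coefficients, and variances are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])                 # illustrative true coefficients

# Heteroscedastic errors give a known diagonal V for this sketch
v = rng.uniform(0.5, 3.0, size=n)
y = X @ beta + rng.normal(scale=np.sqrt(v))

V_inv = np.diag(1.0 / v)
# GLS estimator: beta_tilde = (X' V^-1 X)^-1 X' V^-1 y
beta_tilde = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
print(beta_tilde)
```

With V known, this is exactly the estimator on the slide; in the mixed model V must itself be estimated, which is what the following slides address.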
Variance component estimation
Simple example
Consider X1 , . . . , Xn i.i.d. ∼ N(µ, σ²).
13/158
Restricted Maximum Likelihood (REML)
14/158
Restricted Maximum Likelihood (REML)
REML
Starting with
y = Xβ + Zu + e,
premultiply by any vector k chosen so that k| X = 0. Then
k| y = k| Xβ + k| Zu + k| e = k| Zu + k| e.
Stacking such vectors as the rows of a matrix K (so that KX = 0) gives
Ky = (KZ)u + (Ke).
15/158
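One way to construct such a K is to take an orthonormal basis for the orthogonal complement of the column space of X, e.g. from the full SVD. A sketch (numpy; X here is an arbitrary illustrative design, not one from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Full SVD: the last n - p left singular vectors span the orthogonal
# complement of the columns of X
U, s, Vt = np.linalg.svd(X, full_matrices=True)
K = U[:, p:].T                     # (n - p) x n matrix of error contrasts

print(np.max(np.abs(K @ X)))       # ~0: KX = 0, fixed effects eliminated
```

The n − p rows of K are the "error contrasts" whose likelihood REML maximises, so the fixed effects β never enter.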
Restricted Maximum Likelihood (REML)
REML
Thus
16/158
Residual Maximum Likelihood
17/158
Examples
18/158
Examples
19/158
Examples
20/158
Prediction
21/158
Example
22/158
Example
23/158
Prediction
24/158
Simple prediction example
density of y
density of u
0.5
0.4
density
0.3
0.2
0.1
0.0
−10 −5 0 5 10
25/158
Simple example of prediction
E[(Ũ − U)2 ]
26/158
Best predictor
E ‖ũ − u‖² .
The solution is
ũ = BP(u) = E(u|y).
27/158
Best linear prediction
ũ = Ay + b
where A = Cov(u, y) Var(y)−1 and b = E(u) − A E(y).
28/158
Best linear prediction
If (u, y) is jointly multivariate normal, then best prediction and
best linear prediction coincide.
In particular,
BP(u) = E(u|y) = E(u) + C V−1 (y − E(y)) = BLP(u),
where C = Cov(u, y) and V = Var(y).
29/158
BLP and the mixed model
In the mixed model
y = Xβ + Zu + e
we have
E(y) = Xβ
V = Var (y) = ZGZ| + R
C = E {[u − E(u)][y − E(y)]| }
= E[u(Zu + e)| ]
= E(uu| Z| ) + E(ue| )
= Var (u) Z| + 0 = GZ|
Therefore
ũ = BLP(u) = GZ| V−1 (y − Xβ).
30/158
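The formula ũ = GZ| V−1 (y − Xβ) is straightforward to evaluate numerically. A sketch (Python/numpy rather than the deck's R; the design and variance components are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
Z = np.array([[1, 0], [0, 1],
              [1, 0], [0, 1],
              [1, 0], [0, 1]], dtype=float)
X = np.kron(np.eye(3), np.ones((2, 1)))       # three fixed group means
beta = np.array([1.0, 2.0, 3.0])              # illustrative values

sigma2_B, sigma2_E = 2.0, 1.0                 # illustrative values
G = sigma2_B * np.eye(2)
R = sigma2_E * np.eye(6)
V = Z @ G @ Z.T + R

u = rng.multivariate_normal(np.zeros(2), G)   # true random effects
y = X @ beta + Z @ u + rng.normal(scale=np.sqrt(sigma2_E), size=6)

# BLP of u, treating beta (and G, R) as known
u_tilde = G @ Z.T @ np.linalg.solve(V, y - X @ beta)
print(u_tilde)
```

In practice β is unknown and is replaced by its GLS estimate, which leads to the BLUP of the next slides.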
BLP and the mixed model
The expression
31/158
Best linear unbiased prediction
32/158
Best linear unbiased prediction
33/158
Henderson’s justification
These assume
34/158
Henderson’s justification
35/158
Best linear unbiased prediction
where
D = [ X  Z ]   and   B =
  0    0
  0   G−1
.
36/158
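Henderson's mixed model equations can be checked numerically against the marginal-model formulas based on V = ZGZ| + R: solving them reproduces the GLS estimate of β and the BLUP of u. A sketch (numpy; data and variance components are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
Z = np.array([[1, 0], [0, 1],
              [1, 0], [0, 1],
              [1, 0], [0, 1]], dtype=float)
X = np.column_stack([np.ones(6), rng.normal(size=6)])
y = rng.normal(size=6)

sigma2_B, sigma2_E = 2.0, 1.0                 # illustrative values
G = sigma2_B * np.eye(2)
R = sigma2_E * np.eye(6)
Ri, Gi = np.linalg.inv(R), np.linalg.inv(G)

# Henderson's equations: C [beta; u] = rhs, with
# C = [X'R^-1X, X'R^-1Z; Z'R^-1X, Z'R^-1Z + G^-1]
C = np.block([[X.T @ Ri @ X, X.T @ Ri @ Z],
              [Z.T @ Ri @ X, Z.T @ Ri @ Z + Gi]])
rhs = np.concatenate([X.T @ Ri @ y, Z.T @ Ri @ y])
beta_hat, u_hat = np.split(np.linalg.solve(C, rhs), [2])

# Same answers from the marginal-model formulas
V = Z @ G @ Z.T + R
beta_gls = np.linalg.solve(X.T @ np.linalg.solve(V, X),
                           X.T @ np.linalg.solve(V, y))
u_blup = G @ Z.T @ np.linalg.solve(V, y - X @ beta_gls)
print(np.allclose(beta_hat, beta_gls), np.allclose(u_hat, u_blup))
```

The computational appeal is that the Henderson system never requires inverting the n × n matrix V, only R and G.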
Estimated or empirical BLUP
37/158
Estimated or empirical BLUP
38/158
Estimated or empirical BLUP
39/158
Standard error estimation
The variance of β̃ = (X| V−1 X)−1 X| V−1 y is
Var(β̃) = (X| V−1 X)−1 .
40/158
Precision of BLUPs involving u
where, as before
D = [ X  Z ]   and   B =
  0    0
  0   G−1
.
41/158
Precision of BLUPs involving u
42/158
Summary
The solutions for the fixed effect yield best linear unbiased
estimators (BLUEs).
43/158
Summary
Properties of a BLUP
44/158
Summary
Properties of a BLUP
45/158
Toy Example
yij = µ + αi + bj + eij
where
yij is the breaking strength for the ith adhesive and jth toy,
i = 1, . . . , I (I = 3) and j = 1, . . . , J (J = 7).
µ is the overall mean.
αi is the fixed effect associated with the ith adhesive.
bj is the random effect associated with the jth toy (block).
eij is the experimental error associated with samples within
blocks.
46/158
BLUP for toy effect
b̃j = [ σB² / (σB² + σE²/3) ] (ȳ·j − ȳ)
where ȳ·j is the average breaking strength for the jth toy and ȳ
is the grand mean.
Because the factor σB² / (σB² + σE²/3) is never greater than 1,
the BLUP can be thought of as a shrinkage estimator.
47/158
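The behaviour of the shrinkage factor σB² / (σB² + σE²/3) is easy to explore numerically. A small Python sketch (the variance values are made up for illustration):

```python
def shrinkage(sigma2_B, sigma2_E, n_per_block=3):
    """Shrinkage factor multiplying (ybar_j - ybar) in the BLUP."""
    return sigma2_B / (sigma2_B + sigma2_E / n_per_block)

# Large between-toy variance relative to error -> little shrinkage
print(shrinkage(4.0, 3.0))    # 4 / (4 + 1) = 0.8
# Small between-toy variance -> BLUPs pulled strongly towards the mean
print(shrinkage(0.1, 3.0))
```

As σB² → 0 the factor goes to 0 (all toy effects shrunk to the grand mean), and as σB² grows it approaches 1 (no shrinkage, the fixed-effect estimate).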
Hypothesis tests
(β̂i − βi) / ese(β̂i)   approx. ∼   N(0, 1)
48/158
Hypothesis tests
Wald test
49/158
Hypothesis tests
50/158
Likelihood ratio test for fixed effects
LR(y) = L(θ̂0 ; y) / L(θ̂; y)
51/158
Likelihood ratio test for fixed effects
Here
52/158
Hypothesis tests
53/158
Hypothesis tests for random effects
54/158
Likelihood ratio tests for variance
55/158
Special case
56/158
Special case
57/158
Tests using bootstrap
58/158
Tests using bootstrap
Parametric Bootstrap
59/158
Example
Paper brightness
60/158
Pulp example
Model
yij = µ + ai + eij
where
yij is the paper brightness measured by the ith operator,
i = 1, . . . , 4 with j = 1, . . . , 5 replicates per operator.
µ is the overall mean
ai is the random effect associated with the ith operator
eij is the experimental error.
61/158
Data
pulp
bright operator
1 59.8 a
2 60.0 a
3 60.8 a
4 60.8 a
5 59.8 a
6 59.8 b
...
18 60.6 d
19 60.5 d
20 60.5 d
62/158
Inference using ANOVA decomposition
Fixed effects:
Estimate Std. Error t value
(Intercept) 60.4000 0.1294 466.7
64/158
Likelihood ratio test
as.numeric(2*(logLik(smod)-logLik(nullmod)))
[1] 2.568371
pchisq(2.5684,1, lower=FALSE)
[1] 0.1090179
65/158
Fitting a mixed model using REML
library(lme4)
mmod <- lmer(bright ~ 1+(1|operator), data=pulp)
summary(mmod)
Fixed effects:
Estimate Std. Error t value
(Intercept) 60.4000 0.1494 404.2
66/158
Parametric bootstrap
# p-value:
mean(lrstat >2.5684)
[1] 0.02
67/158
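The bootstrap p-value above is simply the proportion of simulated null statistics that exceed the observed one. The slides do this by refitting lmer to data simulated from the fitted null model; as a structural sketch only, the Python fragment below replaces the refitting step with draws from the 50:50 mixture of 0 and χ²₁ that arises when testing a variance component on the boundary (an assumption for illustration, not the deck's procedure):

```python
import numpy as np

rng = np.random.default_rng(4)
observed = 2.5684                      # LR statistic from the slides

# Stand-in for "simulate under the null and refit": draw LR statistics
# from the boundary null, a 50:50 mixture of 0 and chi-square_1
nboot = 10000
null_stats = rng.chisquare(1, size=nboot) * (rng.random(nboot) < 0.5)

p_value = np.mean(null_stats > observed)
print(p_value)
```

The resulting p-value is roughly half the naive χ²₁ p-value of 0.109, illustrating why the plain likelihood ratio test is conservative for variance components.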
Pulp example
68/158
Prediction
69/158
Prediction of the random effects
#EBLUPs:
fixef(mmod)+ranef(mmod)$operator
(Intercept)
a 60.27806
b 60.14087
c 60.56767
d 60.61340
70/158
Residuals
Because a mixed model yields more than one kind of fitted value
(depending on whether the random effects are included), there is
more than one type of residual. In this example resid(mmod)
gives the residuals as follows:
round(resid(mmod),5)
[1] -0.47806 -0.27806 0.52194 0.52194 -0.47806
[6] 0.34088 0.05912 0.25912 -0.24088 -0.14088
[11] 0.13233 0.13233 -0.06767 0.33233 -0.26767
[16] 0.38660 0.18660 -0.01340 -0.11340 -0.11340
pulp$bright-resid(mmod)
[1] 60.27806 60.27806 60.27806 60.27806 60.27806
[6] 60.14087 60.14087 60.14087 60.14087 60.14087
[11] 60.56767 60.56767 60.56767 60.56767 60.56767
[16] 60.61340 60.61340 60.61340 60.61340 60.61340
[Figure: residual plot and normal Q-Q plot of the residuals (axes: Residuals; Sample Quantiles)]
72/158
Diagnostic plots for pulp data
73/158
Mixed models for split-plot designs
Split-plot design
74/158
Mixed models for split-plot designs
75/158
Advantages of split-plot designs
76/158
Disadvantages of split-plot designs
77/158
Example
Water resistance
78/158
Example
Water resistance data
79/158
Example
A split-plot design
80/158
Example
Quiz
Which of the following factors in the model are fixed and which
random?
wood: the identification number of each wood panel in the
study;
pretrt: pretreatment (A or B) applied to the wood panel;
stain: types of stains (1, 2 ,3, or 4) applied to the smaller
piece of wood.
81/158
Model
library(lme4)
m2 <- lmer(resistance ~ pretrt+stain
+ (1|wood), data=woodres)
summary(m2)
83/158
R output
Random effects:
Groups Name Variance Std.Dev.
wood (Intercept) 0.81245 0.90136
Residual 0.81566 0.90314
Number of obs: 56, groups: wood, 14
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.9646 0.4346 13.724
pretrtB 1.3050 0.5389 2.422
stain2 -0.3807 0.3414 -1.115
stain3 -0.9064 0.3414 -2.655
stain4 -1.9714 0.3414 -5.775
84/158
Fixed effects estimates
85/158
Types of explanatory variables
86/158
Analysis of covariance
87/158
Example
Clinical trial for blood pressure drugs
88/158
Possible scenarios
the slopes and intercepts for the treatments are the same
the slopes are different, but the intercepts are the same
the slopes and intercepts are different
the intercepts are different, but the slopes are the same
the intercepts are different, but all slopes are zero (a
special case of the previous scenario)
89/158
Possible scenarios
[Figure: five panels of BP Change against Baseline BP for Drug 1 and Drug 2, illustrating the scenarios above]
90/158
Example
Silicon wafers
91/158
Silicon Wafer Example
Data
92/158
Silicon Wafer Example
Variables in wafer4
93/158
Model
where
yijk is the deposition rate for the kth site from the jth wafer
assigned to the ith temperature, i = 1, 2, 3, j = 1, . . . , 8
and k = 1, 2, 3;
β0 is the overall intercept;
αi is the coefficient for the ith temperature effect on the
intercept;
β1 is the overall slope;
δi is the coefficient for the ith temperature effect on the
slope;
94/158
Model
Also
xij is the thickness measured on the jth wafer assigned to
the ith temperature.
wj(i) is the random effect for wafer, assumed i.i.d. N(0, σW²)
(wafer effect nested within temperature);
eijk is the site effect, assumed i.i.d. N(0, σE2 ).
95/158
Data
96/158
Exploratory Plot
97/158
Initial Impressions
98/158
Fitting the mixed model in R
Random effects:
Groups Name Variance Std.Dev.
wafer (Intercept) 132.536 11.512
Residual 4.194 2.048
99/158
Fitting the mixed model in R
Fixed effects:
Estimate Std. Error t value
(Intercept) 114.40145 63.83150 1.792
temp1000 -141.12757 89.05137 -1.585
temp1100 84.67028 114.31343 0.741
thick 0.09970 0.03196 3.120
temp1000:thick 0.06371 0.04529 1.407
temp1100:thick -0.05879 0.05774 -1.018
100/158
Comments on output
101/158
Output interpretation
102/158
Output interpretation
103/158
Fitted regression lines
104/158
Test of the interaction term
Fixed effects:
Estimate Std. Error t value
(Intercept) 114.40145 63.83150 1.792
temp1000 -141.12757 89.05137 -1.585
temp1100 84.67028 114.31343 0.741
thick 0.09970 0.03196 3.120
temp1000:thick 0.06371 0.04529 1.407
temp1100:thick -0.05879 0.05774 -1.018
105/158
Model with different intercepts, same slope
Random effects:
Groups Name Variance Std.Dev.
wafer (Intercept) 151.817 12.321
Residual 4.194 2.048
106/158
Fixed effects parameter estimates
Fixed effects:
Estimate Std. Error t value
(Intercept) 83.89769 43.89139 1.911
temp1000 -17.14673 6.33695 -2.706
temp1100 -30.79875 6.20972 -4.960
thick 0.11501 0.02191 5.249
107/158
Fitted regression lines
108/158
Adjusting for a covariate
109/158
Adjusted treatment means in R
110/158
Pairwise differences in means
111/158
Comments on the output
112/158
Random coefficient models
113/158
ANCOVA
[Figure: ANCOVA — a separate fixed regression line µ(y|x)i = αi + βi x for each group, e.g. µ(y|x)1 = α1 + β1 x, µ(y|x)2 = α2 + β2 x, µ(y|x)4 = α4 + β4 x]
114/158
Random coefficient model
[Figure: random coefficient model — a different line ai + bi x, with random intercept and slope, for each group (a1 + b1 x, …, a4 + b4 x)]
115/158
Fixed vs random regression coefficients
ANCOVA graph
116/158
Fixed vs random regression coefficients
117/158
Example
Wheat
118/158
Wheat Example
Data
The wheat dataset contains the following variables:
id: the identification number for the plots;
variety: ten randomly selected varieties of winter
wheat;
moist: the amount of moisture measured before planting
the varieties on the plots;
yield: the yield of the plot in bushels per acre.
119/158
Yield vs moisture
[Figure: scatter plot of yield against moisture, coloured by variety (1–10)]
120/158
Wheat Example
121/158
Model
where
yij is the yield for the ith variety in the jth plot,
i = 1, . . . , 10 and j = 1, . . . , 6;
xij is the moisture of the ith variety in the jth plot;
ai is the intercept for the ith variety. This is a random effect
because variety is a random effect.
bi is the slope for the ith variety. This is also a random
effect because variety is a random effect.
eij is the random error, assumed i.i.d. N(0, σE2 ).
122/158
Model
The fixed effects of the model are the intercept α and the
slope β.
123/158
Mixed model parameterisation
We have shown that
ai = α + a∗i
bi = β + b∗i
where
a∗i i.i.d. ∼ N(0, σA²),
b∗i i.i.d. ∼ N(0, σB²) and
Cov(a∗i , b∗i ) = σAB .
124/158
Mixed model parameterisation
The population-average (marginal) regression line is E(yij) = α + β xij .
125/158
Random Slope, Random Intercept
Variance
Var (yij ) = [1  xij ] [ σA²  σAB ; σAB  σB² ] [1  xij ]| + σE²
           = σA² + 2 σAB xij + σB² xij² + σE²
Covariance
Cov (yij , yik ) = Cov (a∗i + b∗i xij + eij , a∗i + b∗i xik + eik )
                = σA² + σAB (xij + xik ) + σB² xij xik .
126/158
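The variance and covariance expressions for the random coefficient model can be verified by Monte Carlo simulation. A sketch (numpy; all parameter values and covariate values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
sA2, sB2, sAB, sE2 = 2.0, 0.5, -0.3, 1.0       # illustrative values
Sigma = np.array([[sA2, sAB], [sAB, sB2]])     # Var of (a*_i, b*_i)
xj, xk = 1.5, 3.0                              # two covariate values
n = 200_000

# Simulate y_ij and y_ik sharing the same (a*_i, b*_i) pair
ab = rng.multivariate_normal([0.0, 0.0], Sigma, size=n)
yj = ab[:, 0] + ab[:, 1] * xj + rng.normal(scale=np.sqrt(sE2), size=n)
yk = ab[:, 0] + ab[:, 1] * xk + rng.normal(scale=np.sqrt(sE2), size=n)

var_formula = sA2 + 2 * sAB * xj + sB2 * xj**2 + sE2
cov_formula = sA2 + sAB * (xj + xk) + sB2 * xj * xk
print(np.var(yj), var_formula)                 # should be close
print(np.cov(yj, yk)[0, 1], cov_formula)       # should be close
```

The simulated moments match the closed-form expressions to Monte Carlo accuracy, confirming that observations on the same variety are correlated through the shared random intercept and slope.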
Wheat Data
id variety yield moist
1 1 41 10
2 1 69 57
3 1 53 32
4 1 66 52
5 1 64 47
6 1 64 48
7 2 49 30
8 2 44 21
...
59 10 67 48
60 10 74 59
127/158
Fitting the random coefficient model in R
Random effects:
Groups Name Variance Std.Dev. Corr
variety (Intercept) 18.8947 4.3468
moist 0.2394 0.4893 -0.34
Residual 0.3521 0.5933
Fixed effects:
Estimate Std. Error t value
(Intercept) 33.4339 1.3985 23.91
moist 6.6166 0.1678 39.42
128/158
Fixed effects in lmer
129/158
Random effects in lmer
130/158
Random effects in lmer
Until now, we have only used random effects of the form
(1|variety).
132/158
Output: fixed effects
133/158
Output: random effects
134/158
Output: random effects
The ranef() function provides estimates for each of our
random effect terms.
(Intercept) moist
1 0.9577955 -0.4921125
2 -2.2842770 -0.6669726
3 -0.4081197 0.6722278
4 0.6960210 -0.2330618
5 1.1159079 -0.1990372
6 4.6391469 0.2388880
7 -10.7300464 0.5642359
8 2.4011660 0.2243375
9 -0.1762124 0.2335679
10 3.7886182 -0.3420729
135/158
Output: random effects
We can use our estimates α̂, β̂, â∗i and b̂∗i to construct a
unique fitted line for each variety i.
136/158
Fitted lines for each variety
[Figure: fitted regression lines of yield against moisture for each variety (1–10)]
137/158
Likelihood ratio test
138/158
Likelihood ratio test
Models:
m2: yield ~ moist + (1 | variety) + (0 + moist | variety)
m1: yield ~ moist + (moist | variety)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
m2 5 193.10 203.57 -91.548 183.10
m1 6 194.06 206.62 -91.028 182.06 1.0411 1 0.3076
139/158
Likelihood ratio test
We can carry out a similar test to see whether we can
remove the random slope from our model.
anova(m3,m2)
Models:
m3: yield ~ moist + (1 | variety)
m2: yield ~ moist + (1 | variety) + (0 + moist | variety)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
m3 4 208.26 216.64 -100.129 200.26
m2 5 193.10 203.57 -91.548 183.10 17.162 1 3.432e-05 ***
140/158
Hierarchical linear models
141/158
Hierarchical linear models
142/158
Example
143/158
Test Score Example
Data
The mathscore dataset contains the following variables:
Gain: the test score gains on a mathematics achievement
test for each student;
PreTotal: the sum of some pretest core items for each
student;
Class: the class each student belongs to;
Tmastry: the percent of class mastering previous
curricula.
144/158
Test Score Example
Nested structure
145/158
Model at the student level
146/158
Model at the student level
where
yij is the gain for the ith student in the jth class,
i = 1, . . . , nj and j = 1, . . . , 159;
xij is the sum of pretest scores of the ith student in the jth
class;
aj is the intercept for the jth class. This is a random effect
because class is a random effect.
bj is the slope for the jth class. This is also a random effect.
eij is the random error, assumed i.i.d. N(0, σE2 ).
147/158
Model at the student level
The fixed effects of the model are the intercept α0 and the
slope β0 .
148/158
Model at the student level
We can rewrite this model as:
where
aj = α0 + a∗j and
bj = β0 + b∗j
with
a∗j i.i.d. ∼ N(0, σA²),
b∗j i.i.d. ∼ N(0, σB²) and
Cov(a∗j , b∗j ) = σAB .
149/158
Model at the class level
aj = α0 + α1 zj + a∗j
bj = β0 + β1 zj + b∗j
150/158
Model at the class level
a∗j i.i.d. ∼ N(0, σA²),
b∗j i.i.d. ∼ N(0, σB²) and
Cov(a∗j , b∗j ) = σAB .
151/158
Multilevel model
152/158
Mathscore data
153/158
Fitting the multilevel model in R
Random effects:
Groups Name Variance Std.Dev. Corr
Class (Intercept) 9.0284 3.005
PreTotal 0.7796 0.883 -0.82
Residual 21.6545 4.653
Fixed effects:
Estimate Std. Error t value
(Intercept) -1.494221 1.341034 -1.114
PreTotal -1.602810 0.652153 -2.458
Tmastry 1.131062 0.176961 6.392
PreTotal:Tmastry -0.006142 0.084758 -0.072
154/158
Output: fixed effects
Fixed effects:
Estimate Std. Error t value
(Intercept) -1.494221 1.341034 -1.114
PreTotal -1.602810 0.652153 -2.458
Tmastry 1.131062 0.176961 6.392
PreTotal:Tmastry -0.006142 0.084758 -0.072
155/158
Output: random effects
Random effects:
Groups Name Variance Std.Dev. Corr
Class (Intercept) 9.0284 3.005
PreTotal 0.7796 0.883 -0.82
Residual 21.6545 4.653
156/158
Likelihood ratio test
Models:
m3: Gain ~ PreTotal + Tmastry + (1 | Class) +
m3: (0 + PreTotal | Class)
m2: Gain ~ PreTotal + Tmastry + (PreTotal | Class)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
m3 6 18681 18717 -9334.6 18669
m2 7 18673 18716 -9329.6 18659 9.8694 1 0.001681 **
157/158
EBLUPs for (some) random effects
(Intercept) PreTotal
1 -1.92675077 0.0119876249
2 -5.39003631 0.0915662399
3 -1.00595155 0.0094430992
4 -0.38172198 0.0321817612
5 1.58481753 -0.0650445905
6 0.13879856 -0.0187367181
7 -0.87685526 0.0031362848
8 -2.18841662 0.0299758090
158/158