BDA Lecture 11a

Variable selection with projpred
Use of reference models in model selection
• Background
• First example
• Bayesian and decision theoretical justification
• More examples
Not a novel idea
Example: Simulated regression

f ∼ N(0, 1),
y | f ∼ N(f , 1)

(Figure: scatterplot of y against the latent f.)
Example: Simulated regression

f ∼ N(0, 1),       xj | f ∼ N(√𝜌 f , 1 − 𝜌),   j = 1, . . . , 150,
y | f ∼ N(f , 1),   xj | f ∼ N(0, 1),           j = 151, . . . , 500.

(Figures: scatterplots of y against individual covariates xj.)
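The data-generating process on this slide can be written out as a short simulation. This is an illustrative sketch, not code from the lecture; the function name `simulate` and all defaults are hypothetical:

```python
# Simulated-regression setup: a latent f drives y and the first 150 of 500
# covariates; the remaining covariates are pure noise.
import numpy as np

def simulate(n=50, p=500, p_rel=150, rho=0.5, seed=0):
    rng = np.random.default_rng(seed)
    f = rng.normal(0.0, 1.0, size=n)          # f ~ N(0, 1)
    y = rng.normal(f, 1.0)                    # y | f ~ N(f, 1)
    x = np.empty((n, p))
    # relevant covariates: x_j | f ~ N(sqrt(rho) * f, 1 - rho)
    # (variance 1 - rho, so each x_j has marginal variance 1)
    x[:, :p_rel] = (np.sqrt(rho) * f[:, None]
                    + rng.normal(0.0, np.sqrt(1.0 - rho), size=(n, p_rel)))
    # irrelevant covariates: x_j ~ N(0, 1)
    x[:, p_rel:] = rng.normal(0.0, 1.0, size=(n, p - p_rel))
    return f, x, y

f, x, y = simulate()
print(x.shape, y.shape)  # (50, 500) (50,)
```

Note that with n = 50 and p = 500 there are far more covariates than observations, which is what makes the selection problem hard.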
Example: Individual correlations

f ∼ N(0, 1),       xj | f ∼ N(√𝜌 f , 1 − 𝜌),   j = 1, . . . , 150,
y | f ∼ N(f , 1),   xj | f ∼ N(0, 1),           j = 151, . . . , 500.

(Figure: |R(xj, y)| and |R(xj, f)| for each covariate, irrelevant xj vs. relevant xj.
A) Sample correlation with y vs. sample correlation with f.)
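The point of the figure can be reproduced numerically: with n = 50, the sample correlations |R(xj, y)| of relevant and irrelevant covariates overlap heavily, while correlations with the noiseless latent f separate the two groups better. A self-contained sketch (all names hypothetical, not from the slides):

```python
# Compare |corr(x_j, y)| with |corr(x_j, f)| for relevant vs. irrelevant x_j.
import numpy as np

rng = np.random.default_rng(0)
n, p, p_rel, rho = 50, 500, 150, 0.5
f = rng.normal(size=n)                        # f ~ N(0, 1)
y = rng.normal(f, 1.0)                        # y | f ~ N(f, 1)
x = np.empty((n, p))
x[:, :p_rel] = np.sqrt(rho) * f[:, None] + rng.normal(0, np.sqrt(1 - rho), (n, p_rel))
x[:, p_rel:] = rng.normal(size=(n, p - p_rel))

def abs_corr(x, v):
    # |sample correlation| of each column of x with the vector v
    xc = x - x.mean(0)
    vc = v - v.mean()
    return np.abs(xc.T @ vc) / (np.linalg.norm(xc, axis=0) * np.linalg.norm(vc))

r_y = abs_corr(x, y)   # correlations with the noisy observations
r_f = abs_corr(x, f)   # correlations with the latent values
print("with y:", r_y[:p_rel].mean(), r_y[p_rel:].mean())
print("with f:", r_f[:p_rel].mean(), r_f[p_rel:].mean())
```

The gap between the relevant and irrelevant groups is wider for the correlations with f than for the correlations with y, because y carries extra observation noise.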
Estimating the latent values with a reference model helps

(Figure: A) Sample correlation with y vs. sample correlation with f;
B) Sample correlation with y vs. sample correlation with f∗,
f∗ = linear regression fit with 3 principal components.)
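The reference fit f∗ on this slide can be sketched directly: regress y on the first 3 principal components of X, then use correlations with the fitted values instead of with y. A self-contained illustration (names hypothetical):

```python
# Build f* = linear regression fit on 3 principal components, then rank
# covariates by |corr(x_j, f*)|.
import numpy as np

rng = np.random.default_rng(0)
n, p, p_rel, rho = 50, 500, 150, 0.5
f = rng.normal(size=n)
y = rng.normal(f, 1.0)
x = np.empty((n, p))
x[:, :p_rel] = np.sqrt(rho) * f[:, None] + rng.normal(0, np.sqrt(1 - rho), (n, p_rel))
x[:, p_rel:] = rng.normal(size=(n, p - p_rel))

# first 3 principal-component scores of the centered covariate matrix
xc = x - x.mean(0)
u, s, vt = np.linalg.svd(xc, full_matrices=False)
z = u[:, :3] * s[:3]                          # n x 3 PC scores

# least-squares regression of y on the PC scores -> reference fit f*
design = np.column_stack([np.ones(n), z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
f_star = design @ coef

def abs_corr(x, v):
    xc = x - x.mean(0)
    vc = v - v.mean()
    return np.abs(xc.T @ vc) / (np.linalg.norm(xc, axis=0) * np.linalg.norm(vc))

r_star = abs_corr(x, f_star)
# relevant covariates (first 150) correlate much more strongly with f*
print(r_star[:p_rel].mean(), r_star[p_rel:].mean())
```

Because the 150 relevant covariates share the common factor f, the leading principal component is close to f, so f∗ is a denoised stand-in for the latent values.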
Bayesian justification
Logistic regression with two covariates

(Figure: left, posterior of (β1, β2); right, predictions over (x1, x2).)
Predictive projection
Projective selection

• How to select a feature combination?
• For a given model size, choose the feature combination with
  minimal projective loss
• Search heuristics, e.g.
  • Monte Carlo search
  • Forward search
  • L1-penalization (as in Lasso)
• Use cross-validation to select the appropriate model size
  • need to cross-validate over the search paths
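The forward-search heuristic above can be sketched for the Gaussian case, where projecting the reference model onto a submodel reduces to least-squares regression of the reference model's fit (not the raw data) on the submodel's covariates. This is an illustrative toy, not projpred's implementation; `forward_search` and `f_star` are hypothetical names:

```python
# Greedy forward search: at each step, add the covariate that minimizes the
# projective loss (squared error between the reference fit and the submodel fit).
import numpy as np

def forward_search(x, f_star, n_steps=5):
    n, p = x.shape
    selected = []
    for _ in range(n_steps):
        best_j, best_loss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            cols = np.column_stack([np.ones(n)] + [x[:, k] for k in selected + [j]])
            coef, *_ = np.linalg.lstsq(cols, f_star, rcond=None)
            loss = np.sum((f_star - cols @ coef) ** 2)  # projective loss
            if loss < best_loss:
                best_j, best_loss = j, loss
        selected.append(best_j)
    return selected

# toy data: two useful covariates, three noise covariates
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 5))
f_star = 2.0 * x[:, 0] - 1.0 * x[:, 1]        # stand-in for a reference fit
print(forward_search(x, f_star, n_steps=2))   # [0, 1]
```

Note the search targets f∗ rather than y, which is what distinguishes projective selection from ordinary stepwise regression.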
Projective selection vs. Lasso

Same simulated regression data as before,
n = 50, p = 500, prel = 150, 𝜌 = 0.5

(Figures: mean squared error vs. number of covariates for Lasso, relaxed Lasso,
and projection, with the reference model as baseline.)
Bodyfat: small p example of projection predictive

(Figure: correlation matrix of siri, age, weight, height, neck, chest, abdomen,
hip, thigh, knee, ankle, biceps, forearm, wrist.)
Bodyfat

(Figure: marginal posteriors of the coefficients for age, weight, height, neck,
chest, abdomen, hip, thigh, knee, ankle, biceps, forearm, and wrist.)
Bodyfat

(Figure: joint marginal posterior of the weight and height coefficients.)
Bodyfat

(Figure: difference to the baseline in elpd and rmse as a function of the
number of variables in the submodel.)
Bodyfat

(Figure: marginal posteriors of the coefficients in the full model with 13
covariates, and in the projected submodel with abdomen and weight.)
Predictive performance vs. selected variables

• The initial aim: find the minimal set of variables providing predictive
  performance similar to that of the reference model
• Some keep asking whether it can find the true variables
  • What do you mean by true variables?

(Figures: correlation matrix of the bodyfat variables; marginal posteriors of
the projected submodel with abdomen and weight.)
Variability under data perturbation

(Figure: selection frequency (0%–75%) of abdomen, weight, wrist, height, age,
neck, chest, biceps, thigh, ankle, forearm, hip, and knee, for projpred and
steplm.)
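The kind of selection-frequency plot shown here can be produced by repeating a selection rule on perturbed (e.g. bootstrap-resampled) data and counting how often each covariate is chosen. A self-contained toy sketch, using a simple correlation-based rule in place of projpred or stepwise lm (all names hypothetical):

```python
# Selection variability under bootstrap resampling: count how often each
# covariate is among the two selected.
import numpy as np

def select_two(x, y):
    # toy rule: pick the two covariates with the largest |correlation| with y
    xc = x - x.mean(0)
    yc = y - y.mean()
    r = np.abs(xc.T @ yc) / (np.linalg.norm(xc, axis=0) * np.linalg.norm(yc))
    return set(np.argsort(r)[-2:])

rng = np.random.default_rng(0)
n, p = 200, 6
x = rng.normal(size=(n, p))
y = 1.5 * x[:, 0] + 1.0 * x[:, 1] + rng.normal(size=n)  # only x0, x1 matter

n_boot = 100
counts = np.zeros(p)
for _ in range(n_boot):
    idx = rng.integers(0, n, size=n)          # bootstrap resample
    for j in select_two(x[idx], y[idx]):
        counts[j] += 1
freq = counts / n_boot                        # selection frequency per covariate
print(np.round(freq, 2))
```

A stable selection method concentrates the frequency mass on a few covariates; an unstable one spreads it across many, which is the contrast the slide draws between projpred and steplm.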
Multilevel regression and GAMMs
Scaling
Intro paper and brms and rstanarm + projpred examples