Credit goes to variani.github.io

May, 2016
bit.ly/1UiTZvQ

About this talk

  • Not really a talk (too long)
  • Like an R package vignette (as long as necessary)

About lme4qtl

lme4qtl \(=\) an extension of the lme4 R package


lme4qtl \(=\) linear mixed effects models for 4 quantitative trait loci mapping

About lme4qtl user

  1. You use mixed models for QTL mapping
    • Basic models (efficiency)
    • Advanced models (flexibility)
  2. You code in R
    • If not, you just need to learn the formula interface
  3. You are a fan of the lme4 R package

Source: github.com/dmbates

Implementation


Results

QTL mapping examples using lme4qtl

Examples / Models

  • Polygenic (quantitative trait)
  • Polygenic (binary trait)
  • Polygenic (counts)


  • Association (quantitative trait)
  • Linkage (quantitative trait)


  • GxE (sex-specificity)
  • GxE (ageing)

Data preparation

  1. Write the formula of your model
  2. Write your data table into a data frame
  3. Compute the relation matrix across your grouping IDs
relmatLmer(
  myTrait ~ myCovariate + (1|myID),  # 1
  myData,                            # 2
  relmat = list(myID = myMatrix)     # 3
)
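
The three inputs can be sketched on a toy example. This is a minimal illustration, assuming a nuclear family of two unrelated parents and two children; the IDs, the trait values, and the object names `ids`, `myData`, and `myMatrix` are made up. The matrix holds double kinship coefficients, and its row and column names must match the grouping IDs in the data frame.

```r
# Toy pedigree: two unrelated parents (P1, P2) and two full sibs (C1, C2).
ids <- c("P1", "P2", "C1", "C2")

# Double kinship matrix: 1 on the diagonal (no inbreeding),
# 0.5 for parent-child and full-sib pairs, 0 for the unrelated parents.
myMatrix <- matrix(0.5, nrow = 4, ncol = 4, dimnames = list(ids, ids))
diag(myMatrix) <- 1
myMatrix["P1", "P2"] <- myMatrix["P2", "P1"] <- 0

# Data frame with one row per individual; all values are made up.
myData <- data.frame(
  myID = ids,
  myTrait = c(1.2, 0.8, 1.1, 0.9),
  myCovariate = c(50, 48, 20, 18),
  stringsAsFactors = FALSE
)

# Sanity checks worth running before fitting a model
ok_names <- all(myData$myID %in% rownames(myMatrix))  # every ID has a row
ok_sym   <- isSymmetric(myMatrix)                     # symmetric matrix
ok_pd    <- all(eigen(myMatrix, symmetric = TRUE)$values > 0)  # positive definite
```

Checking names, symmetry, and positive definiteness up front is a design choice for catching mismatched IDs early; it is not a requirement stated by the package.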

Description of the GAIT data

| R objects | Description |
|---|---|
| phen/phen2 | A data frame with phenotypes and IDs (GAIT1/GAIT2) |
| dkin/dkin2 | The double kinship matrix with IDs in row/column names (GAIT1/GAIT2) |

Description of the GAIT data (variables)

| Variables in phen | Type | Project | Description |
|---|---|---|---|
| ID | character | | The individual's ID (must be unique) |
| HHID | character | | The individual's household ID |
| SEXf | factor | | The gender (Male/Female) |
| AGE/AGEc/AGEsc | numeric | | The age (raw/centered/scaled) |
| AGEc2/AGEsc2 | numeric | | The age squared (centered/scaled) |
| APTT | numeric | GAIT1 | Activated Partial Thromboplastin Time (units, sec) |
| Throm | factor | GAIT2 | The thrombosis disease status (control/affected) |
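
The derived age variables can be illustrated as follows. This is a sketch under an assumption: "scaled" is taken to mean centered and divided by the standard deviation, which the slides do not state explicitly. The ages are made up.

```r
# Hypothetical raw ages in years
AGE <- c(12.5, 34.0, 47.2, 61.8, 80.3)

AGEc   <- AGE - mean(AGE)   # centered: mean 0
AGEsc  <- AGEc / sd(AGE)    # centered and scaled: mean 0, sd 1 (assumed definition)
AGEsc2 <- AGEsc^2           # squared term for a quadratic age effect

round(cbind(AGE, AGEc, AGEsc, AGEsc2), 3)
```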

Models (1/7): Polygenic (quantitative trait)

APTT phenotype

Activated partial thromboplastin time (APTT) is a clinical test used to screen for coagulation-factor deficiencies.

Associations with KNG1, HRG, F11, F12, and ABO were confirmed in a meta-analysis of 9,240 individuals of European ancestry (Tang et al. 2012).

Polygenic model for APTT

library(lme4qtl)  # provides relmatLmer() and relmatGlmer()
m <- relmatLmer(APTT ~ AGEsc + (1|ID), phen, relmat = list(ID = dkin))
m
## Linear mixed model fit by REML ['lmerMod']
## Formula: APTT ~ AGEsc + (1 | ID)
##    Data: phen
## REML criterion at convergence: -805.8705
## Random effects:
##  Groups   Name        Std.Dev.
##  ID       (Intercept) 0.0873  
##  Residual             0.0423  
## Number of obs: 392, groups:  ID, 392
## Fixed Effects:
## (Intercept)        AGEsc  
##     0.96260     -0.03125
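
The printed standard deviations can be turned into a proportion of variance. This is a rough sketch that copies the two numbers from the output above and assumes the ID variance can be read as the polygenic variance because the relationship matrix has ones on its diagonal; the variable names are illustrative.

```r
# Standard deviations copied from the printed model fit
sd_id  <- 0.0873   # ID (polygenic) random effect
sd_res <- 0.0423   # residual

var_id  <- sd_id^2
var_res <- sd_res^2

# Proportion of the covariate-adjusted variance attributed to the
# polygenic effect
h2 <- var_id / (var_id + var_res)
round(h2, 2)
```

With these inputs the proportion comes out at about 0.81.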

Residuals

r <- residuals(m)
qqnorm(r); qqline(r)

hist(r, breaks = 30)

Drop fixed effects

m <- relmatLmer(APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1|HHID) + (1|ID), 
  phen, relmat = list(ID = dkin), REML = FALSE)
drop1(m, test = "Chisq")
## Single term deletions
## 
## Model:
## APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1 | HHID) + (1 | ID)
##            Df     AIC    LRT Pr(Chi)    
## <none>        -815.93                   
## AGEsc       1 -748.67 69.254 < 2e-16 ***
## I(AGEsc^2)  1 -812.45  5.476 0.01928 *  
## SEXf        1 -816.87  1.059 0.30346    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
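
The p-value column can be checked by hand. A small sketch, copying the numbers for the `I(AGEsc^2)` row from the table above: the likelihood-ratio statistic is compared with a chi-squared distribution, and the AIC difference between the reduced and full models equals the statistic minus twice the degrees of freedom.

```r
# Values copied from the drop1() table for the I(AGEsc^2) term
lrt <- 5.476
df  <- 1

# Upper-tail chi-squared probability
p <- pchisq(lrt, df = df, lower.tail = FALSE)

# AIC of the reduced model minus AIC of the full model
aic_full    <- -815.93
aic_reduced <- -812.45
delta_aic   <- aic_reduced - aic_full

signif(p, 4)
c(delta_aic = delta_aic, lrt_minus_2df = lrt - 2 * df)
```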

Anova

m <- relmatLmer(APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1|HHID) + (1|ID), 
  phen, relmat = list(ID = dkin))
m0 <- update(m, . ~ . - (1|HHID))
anova(m, m0) 
## refitting model(s) with ML (instead of REML)
## Data: phen
## Models:
## m0: APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1 | ID)
## m: APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1 | HHID) + (1 | ID)
##    Df     AIC     BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m0  6 -817.93 -794.10 414.96  -829.93                        
## m   7 -815.93 -788.13 414.96  -829.93     0      1          1
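
A variance component tested against zero sits on the boundary of the parameter space, so the plain chi-squared p-value tends to be conservative. One common approximation, not something `anova()` applies here, is a 50:50 mixture of chi-squared distributions with 0 and 1 degrees of freedom. The helper `mixture_pvalue` below is illustrative and not part of lme4 or lme4qtl.

```r
# p-value under a 50:50 mixture of chi-squared(0) and chi-squared(1),
# a common approximation when a single variance component is tested at zero.
mixture_pvalue <- function(chisq) {
  ifelse(chisq <= 0, 1, 0.5 * pchisq(chisq, df = 1, lower.tail = FALSE))
}

mixture_pvalue(0)      # a statistic of 0, as for HHID above, gives p = 1
mixture_pvalue(3.84)   # about half of the naive p-value of roughly 0.05
```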

Exact Restricted LRT (single effect)

m <- relmatLmer(APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1|ID), 
  phen, relmat = list(ID = dkin))
library(RLRsim)
exactRLRT(m)
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 105.16, p-value < 2.2e-16

Exact Restricted LRT (many effects)

m <- relmatLmer(APTT ~ AGEsc + I(AGEsc^2) + SEXf + (1|ID), 
  phen, relmat = list(ID = dkin))
m0 <- update(m, . ~ . + (1|HHID))  
library(RLRsim)
exactRLRT(m0, mA = m, m0 = m0)
## 
##  simulated finite sample distribution of RLRT.
##  
##  (p-value based on 10000 simulated values)
## 
## data:  
## RLRT = 0, p-value = 1

Models (2/7): Polygenic (binary trait)

Thrombosis phenotype

Thrombosis is a common complex disease associated with substantial morbidity and mortality.

The major determinants of thrombosis include both environmental and genetic factors.

  • Environmental: age & gender
  • Genetic: ABO blood group system (wikipedia)
    • Group O is protective

The contribution of genetic risk factors is estimated at around 60% (Souto et al. 2000).

Polygenic model for Throm

m <- relmatGlmer(Throm ~ (1|ID), phen2, relmat = list(ID = dkin2), 
  family = binomial)
m
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: Throm ~ (1 | ID)
##    Data: phen2
##       AIC       BIC    logLik  deviance  df.resid 
##  955.1614  965.1910 -475.5807  951.1614      1111 
## Random effects:
##  Groups Name        Std.Dev.
##  ID     (Intercept) 1.264   
## Number of obs: 1113, groups:  ID, 1113
## Fixed Effects:
## (Intercept)  
##      -2.321

Covariates

m <- relmatGlmer(Throm ~ AGEsc + SEXf + ABOf3num + (1|ID), phen2, 
  relmat = list(ID = dkin2), family = binomial)
summaryCoef(m, signif.legend = TRUE)
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.7106     0.5375  -5.043 4.59e-07 ***
## AGEsc         1.3554     0.2117   6.401 1.54e-10 ***
## SEXf2         0.2792     0.2695   1.036  0.30023    
## ABOf3num     -0.5584     0.2160  -2.585  0.00973 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Disease prevalence 5%

library(dplyr)  # provides mutate()
K <- 0.05
dat <- mutate(phen2, offset = log(K/(1 - K)))

m <- relmatGlmer(Throm ~ -1 + AGEsc + SEXfnum + ABOf3num + (1|ID), dat, 
  offset = offset, relmat = list(ID = dkin2), family = binomial)
summaryCoef(m) # offset = log(K/(1 - K)) = -2.944439
##          Estimate Std. Error z value Pr(>|z|)    
## AGEsc      1.4220     0.1673   8.498   <2e-16 ***
## SEXfnum    0.3171     0.2643   1.200   0.2302    
## ABOf3num  -0.5662     0.2384  -2.374   0.0176 *
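
The offset replaces the estimated intercept with the log-odds of the assumed prevalence. A minimal base-R sketch of that arithmetic (the name `offset_logit` is illustrative):

```r
K <- 0.05

# Log-odds of the assumed population prevalence
offset_logit <- log(K / (1 - K))

# qlogis() is the logit function, so the two agree
all.equal(offset_logit, qlogis(K))

# Back-transforming the offset recovers the prevalence
plogis(offset_logit)

round(offset_logit, 6)
```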

Probit link function

Modeling prevalence?

GAIT2 (118 vs. 817)

GAIT1 (53 vs. 340)

Models (3/7): Polygenic (counts)

Polygenic model of PTES

m <- relmatGlmer(PTES ~ AGEsc + AGEsc2 + SEXf + (1|ID), phen2, 
  relmat = list(ID = dkin2), family = poisson)
m
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: PTES ~ AGEsc + AGEsc2 + SEXf + (1 | ID)
##    Data: phen2
##       AIC       BIC    logLik  deviance  df.resid 
## 10154.429 10178.627 -5072.215 10144.429       929 
## Random effects:
##  Groups Name        Std.Dev.
##  ID     (Intercept) 0.2697  
## Number of obs: 934, groups:  ID, 934
## Fixed Effects:
## (Intercept)        AGEsc       AGEsc2        SEXf2  
##     5.37879     -0.07799      0.01845      0.12344

Drop fixed effects

m <- relmatGlmer(PTES ~ AGEsc + AGEsc2 + SEXf + (1|ID), phen2, 
  relmat = list(ID = dkin2), family = poisson)
drop1(m, test = "Chisq")
## Single term deletions
## 
## Model:
## PTES ~ AGEsc + AGEsc2 + SEXf + (1 | ID)
##        Df   AIC     LRT   Pr(Chi)    
## <none>    10154                      
## AGEsc   1 10291 138.532 < 2.2e-16 ***
## AGEsc2  1 10166  13.798 0.0002035 ***
## SEXf    1 10220  67.163 2.499e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Models (4/7): Association (quantitative trait)

Association between APTT and AB0

dat <- subset(phen, !is.na(ab0))
m0 <- relmatLmer(APTT ~ AGEsc + (1|ID), dat, relmat = list(ID = dkin))
m <- update(m0, . ~. + ab0)
anova(m0, m)
## refitting model(s) with ML (instead of REML)
## Data: dat
## Models:
## m0: APTT ~ AGEsc + (1 | ID)
## m: APTT ~ AGEsc + (1 | ID) + ab0
##    Df     AIC     BIC logLik deviance  Chisq Chi Df Pr(>Chisq)    
## m0  4 -792.18 -776.38 400.09  -800.18                             
## m   5 -801.47 -781.71 405.73  -811.47 11.286      1  0.0007808 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

lme4qtl is flexible

AB0 influences both mean and variance of APTT
*Not the same as in (Chen and Abecasis 2007)

AB0 fixed and random effects

dat <- subset(phen, !is.na(ab0))
m0 <- relmatLmer(APTT ~ AGEsc + (1|ID), dat, relmat = list(ID = dkin))
m1 <- update(m0, . ~. + ab0)
m2 <- update(m0, . ~. + ab0 + (1|ab0))
anova(m1, m2)
## refitting model(s) with ML (instead of REML)
## Data: dat
## Models:
## m1: APTT ~ AGEsc + (1 | ID) + ab0
## m2: APTT ~ AGEsc + (1 | ID) + ab0 + (1 | ab0)
##    Df     AIC     BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## m1  5 -801.47 -781.71 405.73  -811.47                        
## m2  6 -799.47 -775.76 405.73  -811.47     0      1          1

Contribution of AB0 random effect

Additional variance explained by ABO is small, while df \(=\) 2

m2
## Linear mixed model fit by REML ['lmerMod']
## Formula: APTT ~ AGEsc + (1 | ID) + ab0 + (1 | ab0)
##    Data: dat
## REML criterion at convergence: -786.5521
## Random effects:
##  Groups   Name        Std.Dev.
##  ID       (Intercept) 0.085365
##  ab0      (Intercept) 0.004455
##  Residual             0.043433
## Number of obs: 384, groups:  ID, 384; ab0, 3
## Fixed Effects:
## (Intercept)        AGEsc          ab0  
##     0.95564     -0.03065      0.02563

lme4qtl is efficient

CPU time on GWAS

| Model | SOLAR (days) | lme4qtl (days) |
|---|---|---|
| APTT ~ 1 + (1\|ID) | 1.2 | 1.6 |
| APTT ~ AGE + SEX + (1\|ID) | 1.6 | 1.6 |
| APTT ~ 1 + (1\|HHID) + (1\|ID) | 5.6 | 1.7 |
| APTT ~ AGE + SEX + (1\|HHID) + (1\|ID) | 8.2 | 1.7 |

  • Server: 128 GB RAM, 64 CPUs at 2.3 GHz
  • SOLAR 7.6.6 (stable release as of March 2015)
  • GAIT2: N = 934 individuals, M = 10M markers

Models (5/7): Linkage (quantitative trait)

FXII phenotype

(Soria et al. 2002) showed that a locus in the F12 gene influences both

  • Coagulation Factor XII (FXII) activity
  • Susceptibility to thrombosis

Figure: Results of (a) linkage and (b) association mapping on chromosome 5 for FXII in the GAIT1 sample. The identified locus is the F12 gene. Figure source: (Ziyatdinov et al. 2015).

Locus F12 and related MIBD matrix

Locus F12 linked to APTT

dat <- mutate(phen, IBDID = ID)

m <- relmatLmer(APTT ~ AGEsc + (1|ID) + (1|IBDID), dat, 
  relmat = list(ID = dkin, IBDID = mibd), REML = FALSE)

m0 <- update(m, . ~ . - (1|IBDID))

# anova(m, m0)
(LOD <- (logLikNum(m) - logLikNum(m0)) / log(10))
## [1] 1.271161

Locus F12 linked to FXII

m <- relmatLmer(FXII ~ (1|ID) + (1|IBDID), dat, 
  relmat = list(ID = dkin, IBDID = mibd), REML = FALSE)

m0 <- update(m, . ~ . - (1|IBDID))

# anova(m, m0)
(LOD <- (logLikNum(m) - logLikNum(m0)) / log(10))
## [1] 9.326086
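
A LOD score is a log-likelihood difference on the log10 scale, so it converts directly to the usual likelihood-ratio statistic. The sketch below copies the two LOD values reported for the linkage models; `lod_to_chisq` is an illustrative helper, and the 1-df p-values are naive because the variance is tested on its boundary.

```r
# LOD scores copied from the two linkage models
lod_aptt <- 1.271161
lod_fxii <- 9.326086

# LOD = (logLik_full - logLik_null) / log(10), so the
# likelihood-ratio statistic is 2 * log(10) * LOD
lod_to_chisq <- function(lod) 2 * log(10) * lod

chisq <- lod_to_chisq(c(APTT = lod_aptt, FXII = lod_fxii))

# Naive 1-df p-values (conservative for a boundary test)
p_naive <- pchisq(chisq, df = 1, lower.tail = FALSE)

round(chisq, 3)
signif(p_naive, 3)
```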

lme4qtl is flexible

Combined linkage-association model
*Almost the same model as in (Chen and Abecasis 2007)

Combined linkage-association model

dat <- mutate(subset(phen, !is.na(c46t)), IBDID = ID)
m <- relmatLmer(FXII ~ c46t + (1|ID) + (1|IBDID), dat, 
  relmat = list(ID = dkin, IBDID = mibd))
m0 <- update(m, . ~ . - c46t - (1|IBDID))

anova(m, m0)
## refitting model(s) with ML (instead of REML)
## Data: dat
## Models:
## m0: FXII ~ (1 | ID)
## m: FXII ~ c46t + (1 | ID) + (1 | IBDID)
##    Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)    
## m0  3 3739.8 3751.6 -1866.9   3733.8                             
## m   5 3592.0 3611.8 -1791.0   3582.0 151.76      2  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Which of two genetic components?

dat <- mutate(subset(phen, !is.na(c46t)), IBDID = ID)

m <- relmatLmer(FXII ~ c46t + (1|ID) + (1|IBDID), dat, 
  relmat = list(ID = dkin, IBDID = mibd))

m0 <- update(m, . ~ . - c46t - (1|IBDID))

m1 <- update(m, . ~ . - c46t)
m2 <- update(m, . ~ . - (1|IBDID))

c46t (fixed effect) is the key player

anova(m, m0, m1, m2)
## refitting model(s) with ML (instead of REML)
## Data: dat
## Models:
## m0: FXII ~ (1 | ID)
## m1: FXII ~ (1 | ID) + (1 | IBDID)
## m2: FXII ~ c46t + (1 | ID)
## m: FXII ~ c46t + (1 | ID) + (1 | IBDID)
##    Df    AIC    BIC  logLik deviance    Chisq Chi Df Pr(>Chisq)    
## m0  3 3739.8 3751.6 -1866.9   3733.8                               
## m1  4 3698.8 3714.6 -1845.4   3690.8  42.9508      1  5.614e-11 ***
## m2  4 3592.7 3608.5 -1792.4   3584.7 106.0915      0  < 2.2e-16 ***
## m   5 3592.0 3611.8 -1791.0   3582.0   2.7186      1    0.09918 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Models (6/7): GxE (sex-specificity)

Sex-specificity for BMI (GAIT1)

# Common polygenic model
m0 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1|HHID) + (1|ID), 
  phen, relmat = list(ID = dkin))

# Sex-specificity only in the residual variance
m1 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1|ID) + (0 + SEXf|RID) + (1|HHID), 
  phen, relmat = list(ID = dkin), 
  weights = rep(1e10, nrow(phen)), vcControl = list(rho0 = list(rid = 3)))

# Sex-specificity in both polygenic and residual variances  
m2 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (0 + SEXf|ID) + (0 + SEXf|RID) + (1|HHID), 
  phen, relmat = list(ID = dkin), 
  weights = rep(1e10, nrow(phen)), vcControl = list(rho0 = list(rid = 5)))  

Examine the models' parameters

VarCorr(m1)
##  Groups   Name        Std.Dev. Corr 
##  ID       (Intercept) 0.089062      
##  RID      SEXf1       0.095182      
##           SEXf2       0.135234 0.000
##  HHID     (Intercept) 0.052327      
##  Residual             0.093239
VarCorr(m2)
##  Groups   Name        Std.Dev. Corr 
##  ID       SEXf1       0.101001      
##           SEXf2       0.105480 0.421
##  RID      SEXf1       0.081500      
##           SEXf2       0.123478 0.000
##  HHID     (Intercept) 0.051211      
##  Residual             0.108221
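
The standard deviations and correlation printed for the `ID` term of `m2` define a 2 x 2 covariance matrix. A short sketch of that conversion, with the numbers copied from the output; reading the correlation as a between-sex genetic correlation is an interpretation, not something the output itself states.

```r
# Sex-specific polygenic standard deviations and their correlation,
# copied from VarCorr(m2)
sds <- c(SEXf1 = 0.101001, SEXf2 = 0.105480)
rho <- 0.421

# Correlation matrix and the implied covariance matrix
R <- matrix(c(1, rho, rho, 1), nrow = 2,
            dimnames = list(names(sds), names(sds)))
Sigma <- diag(sds) %*% R %*% diag(sds)
dimnames(Sigma) <- dimnames(R)

Sigma
cov2cor(Sigma)   # recovers the correlation matrix
```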

Test the sex-specificity hypothesis

anova(m0, m1, m2)
## refitting model(s) with ML (instead of REML)
## Data: phen
## Models:
## m0: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | HHID) + (1 | ID)
## m1: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | ID) + (0 + SEXf | RID) + (1 | 
## m1:     HHID)
## m2: BMI ~ AGEsc + AGEsc2 + SEXf + (0 + SEXf | ID) + (0 + SEXf | RID) + 
## m2:     (1 | HHID)
##    Df     AIC     BIC logLik deviance  Chisq Chi Df Pr(>Chisq)  
## m0  7 -359.40 -331.64 186.70  -373.40                           
## m1 10 -362.06 -322.39 191.03  -382.06 8.6533      3    0.03427 *
## m2 12 -362.91 -315.32 193.46  -386.91 4.8551      2    0.08825 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Models (7/7): GxE (ageing)

Ageing model in SOLAR


(Glahn et al. 2013)

\(y_i = \mu + x_i \beta + g_i + e_i\)

\(\Omega_{i,j} = G_{i,j} \sigma_g^2 + I_{i,j} \sigma_e^2\)

\(\sigma_g^2 = [\exp(\alpha_g + \gamma_g \delta_i)]^{0.5} \times [\exp(\alpha_g + \gamma_g \delta_j)]^{0.5} \times \exp(-\lambda |\delta_i - \delta_j|)\)

\(\sigma_e^2 = \exp(\alpha_e + \gamma_e \delta_i)\)


(J. Blangero 2009)
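
The covariance structure in these equations can be written out numerically. This is a base-R sketch, not SOLAR's implementation: the function names `genetic_cov` and `residual_cov`, the toy relationship matrix, and all parameter values are assumptions chosen for illustration.

```r
# Toy inputs (all values are made up)
G     <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)   # relationship matrix G_{i,j}
delta <- c(-1, 1.5)                            # age covariate delta_i

# Genetic term: G_{i,j} times the age-dependent variance factor
genetic_cov <- function(G, delta, alpha_g, gamma_g, lambda) {
  s <- sqrt(exp(alpha_g + gamma_g * delta))               # [exp(.)]^0.5 per individual
  decay <- exp(-lambda * abs(outer(delta, delta, "-")))   # exp(-lambda |delta_i - delta_j|)
  G * outer(s, s) * decay
}

# Residual term: diagonal matrix with age-dependent variance
residual_cov <- function(delta, alpha_e, gamma_e) {
  diag(exp(alpha_e + gamma_e * delta), nrow = length(delta))
}

Omega <- genetic_cov(G, delta, alpha_g = log(0.6), gamma_g = 0.2, lambda = 0.1) +
  residual_cov(delta, alpha_e = log(0.4), gamma_e = -0.1)

# With gamma_g = gamma_e = lambda = 0 the structure reduces to the
# standard polygenic covariance G * sigma_g^2 + I * sigma_e^2
Omega0 <- genetic_cov(G, delta, log(0.6), 0, 0) +
  residual_cov(delta, log(0.4), 0)
```

Setting the age-dependent parameters to zero is a convenient check that the function collapses to the plain polygenic model.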

Ageing for BMI (GAIT1)

m0 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1|HHID) + (1|ID), 
  phen, relmat = list(ID = dkin))

m1 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1|ID) + (1 + AGEsc|RID) + (1|HHID), 
  phen, relmat = list(ID = dkin), 
  weights = rep(1e10, nrow(phen)), vcControl = list(rho0 = list(rid = 3)))
  
m2 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1 + AGEsc|ID) + (1 + AGEsc|RID) + (1|HHID), 
  phen, relmat = list(ID = dkin), 
  weights = rep(1e10, nrow(phen)), vcControl = list(rho0 = list(rid = 5)))

Examine the models' parameters

VarCorr(m1)
##  Groups   Name        Std.Dev. Corr 
##  ID       (Intercept) 0.085639      
##  RID      (Intercept) 0.112998      
##           AGEsc       0.037605 0.000
##  HHID     (Intercept) 0.057153      
##  Residual             0.078124
VarCorr(m2)
##  Groups   Name        Std.Dev. Corr  
##  ID       (Intercept) 0.088349       
##           AGEsc       0.057975 -0.011
##  RID      (Intercept) 0.100757       
##           AGEsc       0.011939 0.000 
##  HHID     (Intercept) 0.062106       
##  Residual             0.068442

Test the ageing hypothesis

anova(m0, m1, m2)
## refitting model(s) with ML (instead of REML)
## Data: phen
## Models:
## m0: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | HHID) + (1 | ID)
## m1: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | ID) + (1 + AGEsc | RID) + 
## m1:     (1 | HHID)
## m2: BMI ~ AGEsc + AGEsc2 + SEXf + (1 + AGEsc | ID) + (1 + AGEsc | 
## m2:     RID) + (1 | HHID)
##    Df     AIC     BIC logLik deviance  Chisq Chi Df Pr(>Chisq)  
## m0  7 -359.40 -331.64 186.70  -373.40                           
## m1 10 -354.91 -315.25 187.45  -374.91 1.5054      3    0.68102  
## m2 12 -357.87 -310.28 190.94  -381.87 6.9658      2    0.03072 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Dichotomized ageing model

# Common polygenic model
m0 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1|HHID) + (1|ID), 
  phen, relmat = list(ID = dkin))

# Age-group-specific residual variance (AGEf2: dichotomized age)
m1 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (1|ID) + (0 + AGEf2|RID) + (1|HHID), 
  phen, relmat = list(ID = dkin), 
  weights = rep(1e10, nrow(phen)), vcControl = list(rho0 = list(rid = 3)))

# Sex-specific polygenic variance and age-group-specific residual variance
m2 <- relmatLmer(
  BMI ~ AGEsc + AGEsc2 + SEXf + (0 + SEXf|ID) + (0 + AGEf2|RID) + (1|HHID), 
  phen, relmat = list(ID = dkin), 
  weights = rep(1e10, nrow(phen)), vcControl = list(rho0 = list(rid = 5)))

Examine the models' parameters

VarCorr(m1)
##  Groups   Name        Std.Dev. Corr 
##  ID       (Intercept) 0.087055      
##  RID      AGEf20      0.116760      
##           AGEf21      0.123646 0.000
##  HHID     (Intercept) 0.057289      
##  Residual             0.098676
VarCorr(m2)
##  Groups   Name        Std.Dev. Corr 
##  ID       SEXf1       0.079366      
##           SEXf2       0.123532 0.417
##  RID      AGEf20      0.101516      
##           AGEf21      0.110564 0.000
##  HHID     (Intercept) 0.056166      
##  Residual             0.085696

Test the hypothesis

anova(m0, m1, m2)
## refitting model(s) with ML (instead of REML)
## Data: phen
## Models:
## m0: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | HHID) + (1 | ID)
## m1: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | ID) + (0 + AGEf2 | RID) + 
## m1:     (1 | HHID)
## m2: BMI ~ AGEsc + AGEsc2 + SEXf + (0 + SEXf | ID) + (0 + AGEf2 | 
## m2:     RID) + (1 | HHID)
##    Df     AIC     BIC logLik deviance  Chisq Chi Df Pr(>Chisq)   
## m0  7 -359.40 -331.64 186.70  -373.40                            
## m1 10 -353.53 -313.87 186.76  -373.53 0.1284      3   0.988221   
## m2 12 -359.43 -311.84 191.72  -383.43 9.9002      2   0.007083 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Thank you

More examples

Examples

  • ABO covariate effect on Thrombosis
    • comparison of two tests, t-test vs. anova
  • Categorical age effect on Thrombosis
    • increasing disease risk with age (non-linearly?)
  • Sex-specificity for BMI (GAIT2)
    • The effect is more pronounced than in GAIT1

ABO covariate effect (t-test)

K <- 0.05
dat <- mutate(subset(phen2, !is.na(ABOf3num)), offset = log(K/(1 - K)))

m <- relmatGlmer(Throm ~ -1 + AGEsc + SEXfnum + ABOf3num + (1|ID), dat, 
  offset = offset, relmat = list(ID = dkin2), family = binomial)
summaryCoef(m, signif.legend = TRUE) 
##          Estimate Std. Error z value Pr(>|z|)    
## AGEsc      1.4220     0.1673   8.498   <2e-16 ***
## SEXfnum    0.3171     0.2643   1.200   0.2302    
## ABOf3num  -0.5662     0.2384  -2.374   0.0176 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ABO covariate effect (anova)

K <- 0.05
dat <- mutate(subset(phen2, !is.na(ABOf3num)), offset = log(K/(1 - K)))

m <- relmatGlmer(Throm ~ -1 + AGEsc + SEXfnum + ABOf3num + (1|ID), dat, 
  offset = offset, relmat = list(ID = dkin2), family = binomial)
m0 <- update(m, . ~ . - ABOf3num)
anova(m, m0) 
## Data: dat
## Models:
## m0: Throm ~ AGEsc + SEXfnum + (1 | ID) - 1
## m: Throm ~ -1 + AGEsc + SEXfnum + ABOf3num + (1 | ID)
##    Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
## m0  3 584.09 598.54 -289.04   578.09                            
## m   4 577.82 597.08 -284.91   569.82 8.2699      1   0.004031 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ABO covariate effect (summary)

| Model | Estimate | Std. Error | z-value |
|---|---|---|---|
| Intercept-free | -0.5584 | 0.2160 | -2.585 |
| Intercept (prevalence) | -0.5662 | 0.2384 | -2.374 |

| Model | p-value (t-test) | p-value (anova) |
|---|---|---|
| Intercept-free | 0.00973 (**) | 0.005092 (**) |
| Intercept (prevalence) | 0.0176 (*) | 0.004031 (**) |
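
The two kinds of p-values in this summary can be recomputed from the reported statistics. A sketch with the numbers copied from the earlier output: the coefficient-table p-values are two-sided normal tail probabilities of the z-values, and the anova p-value is a chi-squared tail probability of the likelihood-ratio statistic.

```r
# z-values for ABOf3num, copied from the two coefficient tables
z <- c("Intercept-free" = -2.585, "Intercept (prevalence)" = -2.374)
p_wald <- 2 * pnorm(-abs(z))

# Likelihood-ratio statistic from anova(m, m0) for the prevalence model (1 df)
p_lrt <- pchisq(8.2699, df = 1, lower.tail = FALSE)

signif(p_wald, 3)
signif(p_lrt, 4)
```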

Categorical age effect

| Factor level of AGEf | Range of AGE | Number of ind. |
|---|---|---|
| 0 | [ 2.6 - 54.8 ] | 715 |
| 1 | [ 55.2 - 64 ] | 79 |
| 2 | [ 64.1 - 73.8 ] | 81 |
| 3 | [ 74.1 - 83.9 ] | 43 |
| 4 | [ 84.2 - 101.1 ] | 17 |
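
A factor like `AGEf` can be derived from a numeric age with `cut()`. This is a sketch only: the break points (55, 64, 74, 84) are inferred from the ranges in the table and are not given in the slides, and the example ages are made up.

```r
# Hypothetical break points inferred from the ranges in the table
breaks <- c(-Inf, 55, 64, 74, 84, Inf)

# Example ages; intervals are right-closed, so 64 falls in level 1
AGE  <- c(30, 55.2, 64, 64.1, 80, 90)
AGEf <- cut(AGE, breaks = breaks, labels = 0:4)

table(AGEf)
data.frame(AGE, AGEf)
```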

Categorical age effect

K <- 0.05
dat <- mutate(phen2, offset = -qnorm(1 - K))

m <- relmatGlmer(Throm ~ AGEf + SEXf + ABOf3num + (1|ID), dat, 
  offset = offset, relmat = list(ID = dkin2), family = binomial(probit))
summaryCoef(m)
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.05822    0.22775   0.256  0.79824    
## AGEf1        0.85723    0.21692   3.952 7.75e-05 ***
## AGEf2        1.19339    0.21552   5.537 3.07e-08 ***
## AGEf3        1.43971    0.27434   5.248 1.54e-07 ***
## AGEf4        1.70217    0.40137   4.241 2.23e-05 ***
## SEXf2        0.18430    0.13522   1.363  0.17288    
## ABOf3num    -0.28707    0.10718  -2.679  0.00739 **
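
The probit coefficients can be turned into probabilities to see how risk changes across age groups. A rough sketch using the estimates above; the resulting values are conditional on a random effect of zero, the reference sex, and `ABOf3num = 0`, so they are not population-averaged risks. The object names are illustrative.

```r
K <- 0.05

# Probit-scale offset used in the model: -qnorm(1 - K) equals qnorm(K)
offset_probit <- -qnorm(1 - K)

# Fixed-effect estimates copied from the coefficient table
intercept  <- 0.05822
age_effect <- c(AGEf0 = 0, AGEf1 = 0.85723, AGEf2 = 1.19339,
                AGEf3 = 1.43971, AGEf4 = 1.70217)

# Conditional probabilities per age group
p_age <- pnorm(offset_probit + intercept + age_effect)
round(p_age, 3)
```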

Sex-specificity for BMI (GAIT2)

m0 <- relmatLmer(BMI ~ AGEsc + AGEsc2 + SEXf + (1|HHID) + (1|ID), 
  phen2, relmat = list(ID = dkin2))

m1 <- relmatLmer(BMI ~ AGEsc + AGEsc2 + SEXf + 
    (1|ID) + (0 + SEXf|RID) + (1|HHID), 
  phen2, relmat = list(ID = dkin2), 
  weights = rep(1e10, nrow(phen2)), vcControl = list(rho0 = list(rid = 3)))
  
m2 <- relmatLmer(BMI ~ AGEsc + AGEsc2 + SEXf + 
    (0 + SEXf|ID) + (0 + SEXf|RID) + (1|HHID), 
  phen2, relmat = list(ID = dkin2), 
  weights = rep(1e10, nrow(phen2)), vcControl = list(rho0 = list(rid = 5)))    

Sex-specificity for BMI (GAIT2)

VarCorr(m1)
##  Groups   Name        Std.Dev. Corr 
##  ID       (Intercept) 0.103256      
##  RID      SEXf1       0.089512      
##           SEXf2       0.136570 0.000
##  HHID     (Intercept) 0.033367      
##  Residual             0.101783
VarCorr(m2)
##  Groups   Name        Std.Dev. Corr 
##  ID       SEXf1       0.085383      
##           SEXf2       0.132376 0.804
##  RID      SEXf1       0.099653      
##           SEXf2       0.113160 0.000
##  HHID     (Intercept) 0.036435      
##  Residual             0.094210

Sex-specificity for BMI (GAIT2)

anova(m0, m1, m2)
## refitting model(s) with ML (instead of REML)
## Data: phen2
## Models:
## m0: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | HHID) + (1 | ID)
## m1: BMI ~ AGEsc + AGEsc2 + SEXf + (1 | ID) + (0 + SEXf | RID) + (1 | 
## m1:     HHID)
## m2: BMI ~ AGEsc + AGEsc2 + SEXf + (0 + SEXf | ID) + (0 + SEXf | RID) + 
## m2:     (1 | HHID)
##    Df     AIC     BIC logLik deviance  Chisq Chi Df Pr(>Chisq)    
## m0  7 -858.56 -824.68 436.28  -872.56                             
## m1 10 -879.32 -830.91 449.66  -899.32 26.755      3  6.626e-06 ***
## m2 12 -885.72 -827.63 454.86  -909.72 10.405      2   0.005504 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Caveats

Caveats

  • Disease prevalence 1%

Disease prevalence 1%

K <- 0.01
dat <- mutate(phen2, offset = log(K/(1 - K)))

m1 <- relmatGlmer(Throm ~ -1 + AGEsc + SEXfnum + ABOf3num + (1|ID), dat, 
  offset = offset, relmat = list(ID = dkin2), family = binomial) 
summaryCoef(m1) # offset = log(K/(1 - K)) = -4.59512  
##          Estimate Std. Error z value Pr(>|z|)    
## AGEsc       4.710      1.231   3.825 0.000131 ***
## SEXfnum    -3.170      1.917  -1.653 0.098321 .  
## ABOf3num   -5.971      1.873  -3.188 0.001434 **

Disease prevalence 1% (nloptwrap)

m2 <- relmatGlmer(Throm ~ -1 + AGEsc + SEXfnum + ABOf3num + (1|ID), dat, 
  offset = offset, relmat = list(ID = dkin2), family = binomial,
  control = glmerControl(optimizer = "nloptwrap"))
## Warning in lme4:::checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.00650688 (tol =
## 0.001, component 1)
summaryCoef(m2) # offset = log(K/(1 - K)) = -4.59512
##          Estimate Std. Error z value Pr(>|z|)    
## AGEsc       4.709      1.231   3.824 0.000131 ***
## SEXfnum    -3.159      1.912  -1.652 0.098482 .  
## ABOf3num   -5.962      1.870  -3.188 0.001435 **

References

Blangero, J. 2009. “Statistical genetic approaches to human adaptability. 1993.” Human Biology 81 (5-6): 523–46. doi:10.3378/027.081.0603.

Chen, Wei-Min, and Gonçalo R Abecasis. 2007. “Family-Based Association Tests for Genomewide Association Scans.” The American Journal of Human Genetics 81 (5): 913–26. doi:10.1086/521580.

Glahn, David C, Jack W Kent, Emma Sprooten, Vincent P Diego, Anderson M Winkler, Joanne E Curran, D Reese McKay, et al. 2013. “Genetic basis of neurocognitive decline and reduced white-matter integrity in normal human brain aging.” Proceedings of the National Academy of Sciences of the United States of America 110 (47): 19006–11. doi:10.1073/pnas.1313735110.

Soria, José Manuel, Laura Almasy, Juan Carlos Souto, Delphine Bacq, Alfonso Buil, Alexandra Faure, Elisabeth Martínez-Marchán, et al. 2002. “A Quantitative-Trait Locus in the Human Factor XII Gene Influences Both Plasma Factor XII Levels and Susceptibility to Thrombotic Disease.” The American Journal of Human Genetics 70 (3). Elsevier: 567–74.

Souto, Juan Carlos, Laura Almasy, Montserrat Borrell, Francisco Blanco-Vaca, José Mateo, José Manuel Soria, Inma Coll, et al. 2000. “Genetic Susceptibility to Thrombosis and Its Relationship to Physiological Risk Factors: The GAIT Study.” The American Journal of Human Genetics 67 (6). Elsevier: 1452–59.

Tang, Weihong, Christine Schwienbacher, Lorna M Lopez, Yoav Ben-Shlomo, Tiphaine Oudot-Mellakh, Andrew D Johnson, Nilesh J Samani, et al. 2012. “Genetic associations for activated partial thromboplastin time and prothrombin time, their gene expression profiles, and risk of coronary artery disease.” American Journal of Human Genetics 91 (1): 152–62. doi:10.1016/j.ajhg.2012.05.009.

Zaitlen, Noah, Bogdan Paşaniuc, Nick Patterson, Samuela Pollack, Benjamin Voight, Leif Groop, David Altshuler, et al. 2012. “Analysis of Case–control Association Studies with Known Risk Variants.” Bioinformatics 28 (13): 1729–37. doi:10.1093/bioinformatics/bts259.

Ziyatdinov, Andrey, Helena Brunel, Angel Martinez-Perez, Alfonso Buil, Alexandre Perera, and Jose Manuel Soria. 2015. “solarius: An R Interface to SOLAR.” Bioinformatics, no. February: 1–2. doi:10.1093/bioinformatics/btw080.