Formula Sheet
1 Simple Linear Regression
Least squares estimates: $b_1 = \frac{1}{s_{xx}^2}\sum_{i=1}^{n}(x_i - \bar{x})y_i = \frac{1}{s_{xx}^2}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$ and $b_0 = \bar{y} - b_1 \bar{x}$,
with $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, $s_{xx}^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2$, $s_{yy}^2 = \sum_{i=1}^{n}(y_i - \bar{y})^2$, and
$s_{xy}^2 = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
$b_0 \sim N\!\left(\beta_0,\ \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}^2}\right)\right)$ and $b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2}{s_{xx}^2}\right)$
Estimate for $\sigma^2$: $s^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ with $\hat{y}_i = b_0 + b_1 x_i$
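The estimates above can be sketched in plain Python; the data vectors `x` and `y` below are invented for illustration.

```python
# Least-squares slope/intercept and the variance estimate s^2,
# following the formulas above (illustrative data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)                        # s_xx^2
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # s_xy^2

b1 = sxy / sxx           # slope
b0 = y_bar - b1 * x_bar  # intercept

y_hat = [b0 + b1 * xi for xi in x]
s2 = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat)) / (n - 2)  # estimate of sigma^2
```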
$100(1-\alpha)\%$ confidence interval
for the slope: $b_1 \pm t_{n-2,1-\alpha/2} \cdot \frac{s}{s_{xx}}$
at $x^*$ for the mean: $b_0 + b_1 x^* \pm t_{n-2,1-\alpha/2} \cdot s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{s_{xx}^2}}$
at $x^*$ for a new observation: $b_0 + b_1 x^* \pm t_{n-2,1-\alpha/2} \cdot s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{s_{xx}^2}}$
$t$-test $H_0: \beta_1 = \beta_1^*$ for the slope: $\frac{b_1 - \beta_1^*}{s/s_{xx}} \sim t_{n-2}$
Two-sample $t$-test $H_0: \mu_1 = \mu_2$: test statistic $T = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}} \sim t_{n_1+n_2-2}$
with $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ and $s_1^2, s_2^2$ the sample variances of the two groups
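A minimal sketch of the pooled two-sample $t$ statistic; the group values are invented.

```python
import math

# Pooled two-sample t statistic, following the formulas above.
g1 = [5.1, 4.9, 5.6, 5.4]
g2 = [4.2, 4.0, 4.5, 4.1]
n1, n2 = len(g1), len(g2)

m1, m2 = sum(g1) / n1, sum(g2) / n2
s1_sq = sum((v - m1) ** 2 for v in g1) / (n1 - 1)   # sample variance, group 1
s2_sq = sum((v - m2) ** 2 for v in g2) / (n2 - 1)   # sample variance, group 2

# pooled variance s_p^2 and the t statistic
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
T = (m1 - m2) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))  # ~ t_{n1+n2-2} under H0
```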
Sum of squares total: $SS_{total} = \sum_{i=1}^{n}(y_i - \bar{y})^2 = s_{yy}^2$
Sum of squares regression: $SS_{regression} = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$
Sum of squares residuals: $SS_{residual} = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
2 Multiple Linear Regression
Residuals: $e = y - \hat{y} = (I_n - H)y$, $e \sim N(0, \sigma^2(I_n - H))$ with hat matrix $H = X(X'X)^{-1}X'$
Estimate of $\sigma^2$: $s^2 = \frac{1}{n-k-1}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ with $\hat{y}_i = b_0 + \sum_{j=1}^{k} b_j x_{ij}$
Standard error of $b_j$: $se(b_j) = s\sqrt{\left[(X'X)^{-1}\right]_{jj}}$, $j \in \{0, \ldots, k\}$
$t$-test $H_0: \beta_j = \beta^*$ for individual regression coefficients: $\frac{b_j - \beta^*}{se(b_j)} \sim t_{n-k-1}$
$100(1-\alpha)\%$ confidence interval
for $\beta_j$: $b_j \pm t_{n-k-1,1-\alpha/2} \cdot se(b_j)$
at $a$ for the mean: $a'b \pm t_{n-k-1,1-\alpha/2} \cdot s\sqrt{a'(X'X)^{-1}a}$
at $a$ for a new observation: $a'b \pm t_{n-k-1,1-\alpha/2} \cdot s\sqrt{a'(X'X)^{-1}a + 1}$
F-test of $H_0: \beta_1 = \ldots = \beta_k = 0$, test statistic: $F = \frac{SS_{regression}/k}{SS_{residual}/(n-k-1)} \sim F_{k,n-k-1}$
F-test of sets of linear hypotheses $H_0: A\beta = 0$: $F = \frac{(SS_{residual}^{restricted} - SS_{residual}^{full})/a}{SS_{residual}^{full}/(n-k-1)} \sim F_{a,n-k-1}$,
where $a$ is the rank of the matrix $A$ and $k$ is the number of predictors in the full model.
SSregression , SSresidual , SStotal , and R2 = r2 : see Simple Linear Regression.
Replace n − k − 1 with n − p, where p is the total number of parameters for general models with
or without an intercept.
3 Specification
One-way analysis of variance (ANOVA): F-test of $H_0$: $k$ groups have equal means, with $n_i, \bar{y}_i$ the sample size and sample mean of group $i$, $i = 1, \ldots, k$, $n = \sum_{i=1}^{k} n_i$, $\bar{y} = \frac{1}{n}\sum_{i=1}^{k} n_i \bar{y}_i$, $SS_{regression} = \sum_{i=1}^{k} n_i(\bar{y}_i - \bar{y})^2$,
$SS_{residual} = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2$, $F = \frac{SS_{regression}/(k-1)}{SS_{residual}/(n-k)} \sim F_{k-1,n-k}$
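The ANOVA decomposition above can be sketched in plain Python; the three groups are invented.

```python
# One-way ANOVA F statistic from the between/within decomposition above.
groups = [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0], [4.0, 5.0, 6.0]]
k = len(groups)
n = sum(len(g) for g in groups)

means = [sum(g) / len(g) for g in groups]       # group means
grand = sum(sum(g) for g in groups) / n         # grand mean

ss_reg = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
ss_res = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

F = (ss_reg / (k - 1)) / (ss_res / (n - k))     # ~ F_{k-1, n-k} under H0
```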
4 Model diagnostics
Standardized residuals: $e_i^s = \frac{e_i}{s}$, with $s$ defined in Multiple Linear Regression
Studentized residuals: $d_i = \frac{e_i}{s\sqrt{1 - h_{ii}}} \sim N(0, 1)$, where $h_{ii}$ is the $i$th diagonal element of the hat matrix, the leverage of case $i$
Sample autocorrelation of residuals: $r_k = \frac{\sum_{t=k+1}^{n} e_t e_{t-k}}{\sum_{t=1}^{n} e_t^2}$, $k = 1, 2, \ldots$
Durbin-Watson test statistic: $DW = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} \approx 2(1 - r_1)$
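A sketch of $r_1$ and the Durbin-Watson statistic on an invented residual series; the approximation $DW \approx 2(1 - r_1)$ holds only up to end effects, which are noticeable in a series this short.

```python
# Lag-1 autocorrelation r_1 and Durbin-Watson statistic (invented residuals).
e = [0.5, -0.2, 0.3, -0.4, 0.1, 0.2, -0.3]
n = len(e)
sse = sum(v ** 2 for v in e)

r1 = sum(e[t] * e[t - 1] for t in range(1, n)) / sse
dw = sum((e[t] - e[t - 1]) ** 2 for t in range(1, n)) / sse
# exact identity: dw = 2*(1 - r1) - (e[0]**2 + e[-1]**2) / sse
```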
Lack of fit sum of squares: $LFSS = \sum_{i=1}^{k} n_i(\bar{y}_i - \hat{y}_i)^2 = SS_{residual}^{restricted} - PESS$ for an assumed model,
with pure error sum of squares $PESS = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2$
Lack of fit test statistic: $\frac{LFSS/(k - \dim(\beta))}{PESS/(n - k)} \sim F_{k-\dim(\beta),\,n-k}$ for
$H_{restricted}: \mu_{res} = \mu_{res}(x_i, \beta)$ vs. $H_{full}: \mu = \beta_1 I(x_i = x_1) + \ldots + \beta_k I(x_i = x_k)$
Box-Cox transformations: Find $\lambda$ such that $y_i^{(\lambda)} = \frac{y_i^{\lambda} - 1}{\lambda\, \tilde{y}_g^{\lambda - 1}}$ minimizes $SS_{res}(\lambda)$, where $\tilde{y}_g$ is the geometric mean of $y$
6 Model selection
Adjusted $R^2$: $R^2_{adj} = 1 - \frac{SS_{residual}/(n-k-1)}{SS_{total}/(n-1)}$
Akaike Information Criterion: AIC = n log(SSresidual /n) + 2p , where p is the total number of
parameters
Bayesian Information Criterion: BIC = n log(SSresidual /n) + log(n)p , where p is the total number
of parameters
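The three model-selection quantities above can be computed from summary statistics alone; `ss_res`, `ss_total`, `n` and `p` below are made-up numbers.

```python
import math

# Adjusted R^2, AIC and BIC from summary quantities, per the formulas above.
n = 50          # observations
p = 3           # total number of parameters (intercept + k = 2 slopes)
k = p - 1
ss_res = 12.5   # residual sum of squares of the fitted model
ss_total = 80.0

r2_adj = 1 - (ss_res / (n - k - 1)) / (ss_total / (n - 1))
aic = n * math.log(ss_res / n) + 2 * p
bic = n * math.log(ss_res / n) + math.log(n) * p
```

Since $\log(50) > 2$, BIC penalizes each parameter more heavily than AIC here.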
7 Survival regression
Survivor function: S(t) = P (T ≥ t) = 1 − F (t) , with T survival time
Hazard function: $h(t) = \lim_{\delta \to 0} \frac{P(t \le T < t + \delta \mid T \ge t)}{\delta} = \frac{f(t)}{S(t)}$, where $f(t) = F'(t)$
Cumulative hazard: H(t) = − log(S(t))
Empirical survivor function: $S_e(t) = \frac{\text{number of individuals with survival times} \ge t}{\text{number of individuals in the data set}}$
Average number of individuals at risk in interval $j$: $n'_j = n_j - c_j/2$, assuming censored cases occur uniformly throughout the $j$th interval, with
$d_j$ = number of deaths,
$c_j$ = censored cases in interval,
$n_j$ = number at risk (alive) at start of interval,
$[t_j, t_{j+1})$ the $j$th interval for $j = 1, \ldots, m$ intervals
Probability of death in $j$th interval: $d_j/n'_j$
Life-table estimator: $S_{LT}(t) = \prod_{j=1}^{k}\left(1 - \frac{d_j}{n'_j}\right)$ for $t$ in the $k$th interval
Kaplan-Meier estimator: $S_{KM}(t) = \prod_{j=1}^{k} \frac{n_j - d_j}{n_j}$, $t \in [t_{(k)}, t_{(k+1)})$, $k = 1, \ldots, r$
Nelson-Aalen estimator: $S_{NA}(t) = \prod_{j=1}^{k} \exp(-d_j/n_j)$, $t \in [t_{(k)}, t_{(k+1)})$, $k = 1, \ldots, r$,
with $t_{(1)} < t_{(2)} < \ldots < t_{(r)}$ the ordered, unique observed death times
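A sketch of the Kaplan-Meier product over the ordered unique death times, on a small invented data set of `(time, event)` pairs, where `event = 0` marks a censored case.

```python
# Kaplan-Meier estimate at each observed death time (invented data).
data = [(2, 1), (3, 0), (4, 1), (4, 1), (6, 0), (7, 1)]

death_times = sorted({t for t, d in data if d == 1})
s, surv = 1.0, []
for t in death_times:
    n_j = sum(1 for u, _ in data if u >= t)             # number at risk at t
    d_j = sum(1 for u, d in data if u == t and d == 1)  # deaths at t
    s *= (n_j - d_j) / n_j
    surv.append((t, s))                                 # S_KM just after t
```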
Greenwood's formula for a pointwise $(1-\alpha)$-CI for $S_{KM}(t)$: $S_{KM}(t) \pm z_{1-\alpha/2} \cdot se(S_{KM}(t))$, with
$se(S_{KM}(t)) \approx S_{KM}(t)\left(\sum_{j=1}^{k} \frac{d_j}{n_j(n_j - d_j)}\right)^{1/2}$ for $t_{(k)} \le t < t_{(k+1)}$
$e_{1j} = E[d_{1j}] = \frac{n_{1j} d_j}{n_j}$ and $v_{1j} = \mathrm{Var}(d_{1j}) = \frac{n_{1j} n_{2j} d_j (n_j - d_j)}{n_j^2 (n_j - 1)}$
Log-rank test: $U_L = \sum_{j=1}^{r}(d_{1j} - e_{1j})$, $\mathrm{Var}(U_L) = V_L = \sum_{j=1}^{r} v_{1j}$; under $H_0$: $\frac{U_L^2}{V_L} \sim \chi_1^2$
Wilcoxon test: $U_W = \sum_{j=1}^{r} n_j(d_{1j} - e_{1j})$, $\mathrm{Var}(U_W) = V_W = \sum_{j=1}^{r} n_j^2 v_{1j}$; under $H_0$: $\frac{U_W^2}{V_W} \sim \chi_1^2$
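A sketch of the two-group log-rank statistic using the $e_{1j}$ and $v_{1j}$ definitions above; the `(time, event, group)` triples are invented.

```python
# Log-rank statistic for two groups (invented data; event = 0 is censored).
data = [(1, 1, 1), (2, 1, 1), (3, 0, 1), (4, 1, 2), (5, 1, 2), (6, 0, 2)]

death_times = sorted({t for t, d, _ in data if d == 1})
U_L, V_L = 0.0, 0.0
for t in death_times:
    at_risk = [(ti, di, g) for ti, di, g in data if ti >= t]
    n_j = len(at_risk)
    n_1j = sum(1 for _, _, g in at_risk if g == 1)
    n_2j = n_j - n_1j
    d_j = sum(1 for ti, di, _ in data if ti == t and di == 1)
    d_1j = sum(1 for ti, di, g in data if ti == t and di == 1 and g == 1)
    U_L += d_1j - n_1j * d_j / n_j                 # observed minus expected e_1j
    if n_j > 1:                                    # v_1j is 0 when n_j = 1
        V_L += n_1j * n_2j * d_j * (n_j - d_j) / (n_j ** 2 * (n_j - 1))

chi_sq = U_L ** 2 / V_L                            # ~ chi^2_1 under H0
```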
Estimator for fixed $\theta$: $\hat{\beta}(\theta) = \left(\sum_{i=1}^{N} X_i' W_i X_i\right)^{-1} \sum_{i=1}^{N} X_i' W_i Y_i$ with $W_i = V_i(\theta)^{-1}$
$I(b) = E[-D^2 \log L(b)] = \sum_{k=1}^{m} n_k \pi(x_k, \beta)[1 - \pi(x_k, \beta)] x_k x_k'$ and asymptotically $\mathrm{Var}(b) \approx I(b)^{-1}$,
where $b$ is the MLE of $\beta$ and $L(b)$ the likelihood evaluated at $b$
$se(b_k) = \sqrt{\mathrm{Var}(b)_{kk}}$
Wald test for individual coefficients $H_0: \beta_k = 0$: $Z = \frac{b_k}{se(b_k)} \sim N(0, 1)$
$(1-\alpha)$-CI for OR: $\exp(b_k \pm z_{1-\alpha/2}\, se(b_k))$
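A sketch of the Wald $z$ statistic and a 95% CI for the odds ratio $\exp(\beta_k)$; the coefficient `b_k` and its standard error are invented numbers.

```python
import math

# Wald test and odds-ratio CI, per the formulas above (invented inputs).
b_k = 0.85                    # estimated logistic regression coefficient
se_bk = 0.30                  # its standard error
z = b_k / se_bk               # ~ N(0,1) under H0: beta_k = 0

z_crit = 1.959963984540054    # z_{0.975} for a 95% interval
or_hat = math.exp(b_k)        # estimated odds ratio
or_lo = math.exp(b_k - z_crit * se_bk)
or_hi = math.exp(b_k + z_crit * se_bk)
```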
Likelihood ratio test for restricted vs. full model: $LRT = -2\log\frac{L(b_{res})}{L(b)} \sim \chi_a^2$, where $a = \dim(b) - \dim(b_{res})$
Pearson chi-square: $\sum_{k=1}^{m} \frac{[y_k - n_k\hat{\pi}_k]^2}{n_k\hat{\pi}_k(1 - \hat{\pi}_k)} \sim \chi_{m-p}^2$, where $m$ is the number of constellations and $\hat{\pi}_k$ the estimated probabilities from a model with $p$ parameters
Hosmer and Lemeshow statistic: $HL = \sum_{k=1}^{G} \frac{(o_k - n_k\bar{\pi}_k)^2}{n_k\bar{\pi}_k(1 - \bar{\pi}_k)} \sim \chi_{G-2}^2$, where $G$ is the number of percentile groups, $o_k$ the observed frequencies in group $k$, $n_k$ the number of observations in group $k$, and $\bar{\pi}_k$ the average estimated probability for group $k$
Hat matrix: $H = V^{-1/2} X (X' V^{-1} X)^{-1} X' V^{-1/2}$, where $V^{-1} = \mathrm{diag}(n_i \hat{\pi}_i(1 - \hat{\pi}_i))$
Standardized Pearson residuals: $r_i^s = \frac{r_i}{\sqrt{1 - h_{ii}}} = \frac{y_i - n_i\hat{\pi}_i}{\sqrt{n_i\hat{\pi}_i(1 - \hat{\pi}_i)}\sqrt{1 - h_{ii}}}$
Cook's influence measure: $D_i = (b - b_{(i)})' (X' V^{-1} X)(b - b_{(i)}) \approx (r_i^s)^2 \frac{h_{ii}}{1 - h_{ii}}$
10 Spatio-temporal statistics
Observations {Z(si ; tj )} at spatial locations {si : i = 1, . . . , m}, times {tj : j = 1, . . . , T }
At location $s_i$ and across all times, empirical mean: $\hat{\mu}_{z,s}(s_i) = \frac{1}{T}\sum_{j=1}^{T} Z(s_i; t_j)$ and covariance:
$\hat{C}_z^{(0)}(s_i, s_k) = \frac{1}{T}\sum_{j=1}^{T} \left(Z(s_i; t_j) - \hat{\mu}_{z,s}(s_i)\right)\left(Z(s_k; t_j) - \hat{\mu}_{z,s}(s_k)\right)$
At time $t_j$, $Z_{t_j} = (Z(s_1; t_j), \ldots, Z(s_m; t_j))'$, $\hat{\mu}_{z,s} = (\hat{\mu}_{z,s}(s_1), \ldots, \hat{\mu}_{z,s}(s_m))' = \frac{1}{T}\sum_{j=1}^{T} Z_{t_j} \in \mathbb{R}^m$
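The empirical mean and lag-0 covariance above can be sketched directly; the two location series below are invented.

```python
# Per-location empirical means and lag-0 (co)variance across time,
# following the formulas above (invented series over T = 4 times).
Z = {
    "s1": [1.0, 2.0, 3.0, 4.0],   # Z(s1; t_1..t_T)
    "s2": [2.0, 1.0, 4.0, 3.0],   # Z(s2; t_1..t_T)
}
T = 4

mu = {s: sum(v) / T for s, v in Z.items()}

def cov0(si, sk):
    """Empirical lag-0 covariance C_z^(0)(si, sk)."""
    return sum((Z[si][j] - mu[si]) * (Z[sk][j] - mu[sk]) for j in range(T)) / T

c11 = cov0("s1", "s1")   # variance at s1
c12 = cov0("s1", "s2")   # cross-covariance between s1 and s2
```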
Principal component analysis (PCA) decomposition: $\hat{C}_z^{(0)} = \Psi\Lambda\Psi'$, where $\Psi, \Lambda \in \mathbb{R}^{m \times m}$ are the matrix of empirical eigenvectors and the diagonal matrix of eigenvalues
Inverse distance weighting: $\tilde{w}_{ij}(s_0; t_0) = \frac{1}{d((s_{ij}; t_j), (s_0; t_0))^{\alpha}}$ with $d(\cdot,\cdot)$: distance, $\alpha$: smoothing parameter
Kernel predictors: $\tilde{w}_{ij}(s_0; t_0) = k((s_{ij}; t_j), (s_0; t_0); \theta)$ with $k$: kernel function, $\theta$: bandwidth parameter
Regression or trend surface estimation: $Z(s_i; t_j) = \beta_0 + \beta_1 X_1(s_i; t_j) + \cdots + \beta_p X_p(s_i; t_j) + e(s_i; t_j)$
with $e(s_i; t_j) \sim N(0, \sigma_e^2)$ independent
Residual sum of squares: $RSS = \sum_{j=1}^{T}\sum_{i=1}^{m}\left(Z(s_i; t_j) - \hat{Z}(s_i; t_j)\right)^2$ with fitted values
$\hat{Z}(s_i; t_j) = \hat{\beta}_0 + \hat{\beta}_1 X_1(s_i; t_j) + \cdots + \hat{\beta}_p X_p(s_i; t_j)$ and ordinary least squares estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p$
F-test for $H_0$: spatio-temporal independence: $F = \frac{\hat{\gamma}_e(\|h_1\|; \tau_1)}{\hat{\sigma}_e^2} - 1$ with $\hat{\gamma}_e(\|h_1\|; \tau_1)$: empirical semivariogram estimate at the smallest spatial ($\|h_1\|$) and temporal ($\tau_1$) lags, $\hat{\sigma}_e^2$: residual error estimate
Data model: Z = Y + ϵ with Y = (Y (s11 ; t1 ), . . . , Y (smT T ; tT ))′ , ϵ = (ϵ(s11 ; t1 ), . . . , ϵ(smT T ; tT ))′
is random
Process model: $Y = \mu + \eta$ with $\mu = (\mu(s_{11}; t_1), \ldots, \mu(s_{m_T T}; t_T))' = X\beta$ fixed,
$\eta = (\eta(s_{11}; t_1), \ldots, \eta(s_{m_T T}; t_T))'$ random, independent of $\epsilon$
Covariances: Cov(Y ) = Cη , Cov(Z) = Cη + Cϵ
Model averaging: $[g|Z] = \sum_{l=1}^{L} [g|Z, M_l] P(M_l|Z)$, with $P(M_l|Z) = \frac{[Z|M_l]P(M_l)}{\sum_{l'=1}^{L}[Z|M_{l'}]P(M_{l'})}$ and $P(M_l)$ the prior model probabilities
Bayesian information criterion: $BIC(M_l) = -2\log[Z|\hat{\theta}, M_l] + \log(m^*) p_l$, where $m^*$ is the sample size