
Technical University Munich, Department of Mathematics
Prof. Donna Ankerst

MA4401 Applied Regression: Formula Sheet
Winter Term 2021/2022

1 Simple linear regression

• Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ with $\varepsilon_i \overset{\mathrm{iid}}{\sim} N(0, \sigma^2)$

• Least squares estimates: $b_1 = \frac{1}{s_{xx}^2}\sum_{i=1}^n (x_i - \bar{x})\,y_i = \frac{1}{s_{xx}^2}\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})$ and $b_0 = \bar{y} - b_1\bar{x}$,
  with $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$, $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$, $s_{xx}^2 = \sum_{i=1}^n (x_i - \bar{x})^2$, $s_{yy}^2 = \sum_{i=1}^n (y_i - \bar{y})^2$, and $s_{xy}^2 = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})$

• $b_0 \sim N\left(\beta_0,\ \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}^2}\right)\right)$ and $b_1 \sim N\left(\beta_1,\ \frac{\sigma^2}{s_{xx}^2}\right)$

• Estimate for $\sigma^2$: $s^2 = \frac{1}{n-2}\sum_{i=1}^n (y_i - \hat{y}_i)^2$ with $\hat{y}_i = b_0 + b_1 x_i$

• $100(1-\alpha)\%$ confidence interval
  – for the slope: $b_1 \pm t_{n-2,1-\alpha/2} \cdot \frac{s}{s_{xx}}$
  – at $x^*$ for the mean: $b_0 + b_1 x^* \pm t_{n-2,1-\alpha/2} \cdot \sqrt{s^2\left[\frac{1}{n} + \frac{(x^* - \bar{x})^2}{s_{xx}^2}\right]}$
  – at $x^*$ for a new observation: $b_0 + b_1 x^* \pm t_{n-2,1-\alpha/2} \cdot \sqrt{s^2\left[1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{s_{xx}^2}\right]}$

• t-test of $H_0: \beta_1 = \beta_1^*$ for the slope: $\frac{b_1 - \beta_1^*}{s/s_{xx}} \sim t_{n-2}$

• Two-sample t-test of $H_0: \mu_1 = \mu_2$: test statistic $T = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}} \sim t_{n_1+n_2-2}$
  with $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ and $s_1^2, s_2^2$ the sample variances of the two groups

• Sum of squares total: $SS_{total} = \sum_{i=1}^n (y_i - \bar{y})^2 = s_{yy}^2$

• Sum of squares regression: $SS_{regression} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2$

• Sum of squares residual: $SS_{residual} = \sum_{i=1}^n (y_i - \hat{y}_i)^2$

• R-squared: $r^2 = \frac{SS_{regression}}{SS_{total}} = 1 - \frac{SS_{residual}}{SS_{total}} = \frac{s_{xy}^4}{s_{xx}^2\, s_{yy}^2}$

• Pearson correlation: $r = \frac{s_{xy}^2}{s_{xx}\, s_{yy}}$
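The following Python/NumPy sketch (not part of the original sheet) evaluates the estimates and intervals above on simulated toy data; the data and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=30)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=30)

n = len(x)
sxx = np.sum((x - x.mean()) ** 2)                  # s_xx^2
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
s2 = np.sum(resid ** 2) / (n - 2)                  # estimate of sigma^2

tq = stats.t.ppf(0.975, df=n - 2)                  # t_{n-2, 1-alpha/2}
ci_slope = (b1 - tq * np.sqrt(s2 / sxx), b1 + tq * np.sqrt(s2 / sxx))

xstar = 5.0                                        # prediction point x*
se_mean = np.sqrt(s2 * (1 / n + (xstar - x.mean()) ** 2 / sxx))
se_new = np.sqrt(s2 * (1 + 1 / n + (xstar - x.mean()) ** 2 / sxx))
print(ci_slope, b0 + b1 * xstar, se_mean, se_new)
```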

2 Multiple linear regression

• Model: $y_i = \beta_0 + \sum_{j=1}^k \beta_j x_{ij} + \varepsilon_i$ with $\varepsilon_i \overset{\mathrm{iid}}{\sim} N(0, \sigma^2)$, $k$ the number of covariates

• Least squares estimate: $b = (b_0, b_1, \ldots, b_k)' = (X'X)^{-1}X'y$, $b \sim N(\beta, \sigma^2(X'X)^{-1})$

• Residuals: $e = y - \hat{y} = (I_n - H)y$, $e \sim N(0, \sigma^2(I_n - H))$ with hat matrix $H = X(X'X)^{-1}X'$

• Estimate of $\sigma^2$: $s^2 = \frac{1}{n-k-1}\sum_{i=1}^n (y_i - \hat{y}_i)^2$ with $\hat{y}_i = b_0 + \sum_{j=1}^k b_j x_{ij}$

• Standard error of $b$: $se(b_j) = s\sqrt{(X'X)^{-1}_{j,j}}$, $j \in \{0, \ldots, k\}$

• t-test of $H_0: \beta_j = \beta^*$ for individual regression coefficients: $\frac{b_j - \beta^*}{se(b_j)} \sim t_{n-k-1}$

• $100(1-\alpha)\%$ confidence interval
  – for $\beta_j$: $b_j \pm t_{n-k-1,1-\alpha/2} \cdot se(b_j)$
  – at $a$ for the mean: $a'b \pm t_{n-k-1,1-\alpha/2} \cdot s\sqrt{a'(X'X)^{-1}a}$
  – at $a$ for a new observation: $a'b \pm t_{n-k-1,1-\alpha/2} \cdot s\sqrt{a'(X'X)^{-1}a + 1}$

• F-test of $H_0: \beta_1 = \ldots = \beta_k = 0$, test statistic: $F = \frac{SS_{regression}/k}{SS_{residual}/(n-k-1)} \sim F_{k,\,n-k-1}$

• F-test of sets of linear hypotheses $H_0: A\beta = 0$: $F = \frac{(SS_{residual}^{restrict} - SS_{residual}^{full})/a}{SS_{residual}^{full}/(n-k-1)} \sim F_{a,\,n-k-1}$,
  where $a$ is the rank of the matrix $A$ and $k$ is the number of predictors in the full model.

• $SS_{regression}$, $SS_{residual}$, $SS_{total}$, and $R^2 = r^2$: see Simple Linear Regression.

• For general models with or without an intercept, replace $n - k - 1$ with $n - p$, where $p$ is the total number of parameters.
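A matrix-form sketch of the formulas above, again assuming Python/NumPy/SciPy and a made-up design; nothing here comes from the sheet itself.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # intercept + k covariates
y = X @ np.array([1.0, 0.5, -0.3, 0.0]) + rng.normal(0, 1, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                      # b = (X'X)^{-1} X'y
e = y - X @ b                              # residuals e = (I - H)y
s2 = e @ e / (n - k - 1)                   # estimate of sigma^2
se = np.sqrt(s2 * np.diag(XtX_inv))        # se(b_j) = s * sqrt((X'X)^{-1}_jj)

t_stats = b / se                           # t-tests of H0: beta_j = 0
p_vals = 2 * stats.t.sf(np.abs(t_stats), df=n - k - 1)
print(np.round(b, 3), np.round(p_vals, 3))
```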

3 Specification

• One-way analysis of variance (ANOVA): F-test of $H_0$: the $k$ groups have equal means, with $n_i, \bar{y}_i$ the sample size and sample mean of group $i$, $i = 1, \ldots, k$, $n = \sum_{i=1}^k n_i$, $\bar{y} = \frac{1}{n}\sum_{i=1}^k n_i\bar{y}_i$, $SS_{regression} = \sum_{i=1}^k n_i(\bar{y}_i - \bar{y})^2$, $SS_{residual} = \sum_{i=1}^k \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$, and
  $F = \frac{SS_{regression}/(k-1)}{SS_{residual}/(n-k)} \sim F_{k-1,\,n-k}$
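A minimal sketch of this F-test, assuming NumPy/SciPy and three simulated groups; it should match scipy.stats.f_oneway on the same data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
groups = [rng.normal(mu, 1.0, size=15) for mu in (0.0, 0.2, 1.0)]

n = sum(len(g) for g in groups)
k = len(groups)
ybar = np.concatenate(groups).mean()
ss_reg = sum(len(g) * (g.mean() - ybar) ** 2 for g in groups)   # SS_regression
ss_res = sum(((g - g.mean()) ** 2).sum() for g in groups)       # SS_residual

F = (ss_reg / (k - 1)) / (ss_res / (n - k))
print(F, stats.f.sf(F, k - 1, n - k))      # F statistic and p-value
```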

4 Model diagnostics

• Standardized residuals: $e_i^s = \frac{e_i}{s}$, with $s$ defined in Multiple Linear Regression

• Studentized residuals: $d_i = \frac{e_i}{s\sqrt{1 - h_{ii}}} \sim N(0, 1)$, where $h_{ii}$ is the $i$th diagonal element of the hat matrix, the leverage of case $i$

• Sample autocorrelation of residuals: $r_k = \frac{\sum_{t=k+1}^n e_t\, e_{t-k}}{\sum_{t=1}^n e_t^2}$, $k = 1, 2, \ldots$

• Durbin-Watson test statistic: $DW = \frac{\sum_{t=2}^n (e_t - e_{t-1})^2}{\sum_{t=1}^n e_t^2} \approx 2(1 - r_1)$

• High leverage if $h_{ii} > 2(k+1)/n$

• Cook's D-statistic: $D = \frac{(b - b_{(i)})'\, X'X\, (b - b_{(i)})}{(k+1)\,s^2}$, where $b_{(i)}$ is the estimate of $\beta$ with the $i$th case deleted and $b - b_{(i)} = (X'X)^{-1} x_i\, \frac{e_i}{1 - h_{ii}}$

• Prediction error sum of squares: $PRESS = \sum_{i=1}^n e_{(i)}^2$ with the PRESS residual $e_{(i)} = y_i - x_i' b_{(i)}$
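A diagnostics sketch under the same simulated design as the multiple-regression example (repeated so the block runs on its own). The closed forms used for Cook's D and the PRESS residual follow from the deletion identity above; they are standard OLS identities adopted for convenience here, not formulas stated on the sheet.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -0.3, 0.0]) + rng.normal(0, 1, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)     # leverages h_ii = diag(H)
e = y - X @ (XtX_inv @ X.T @ y)                 # residuals
s2 = e @ e / (n - k - 1)

d = e / np.sqrt(s2 * (1 - h))                   # studentized residuals
high_leverage = h > 2 * (k + 1) / n             # leverage rule of thumb
cooks_d = d ** 2 * h / ((k + 1) * (1 - h))      # Cook's D via the deletion identity
press = np.sum((e / (1 - h)) ** 2)              # PRESS with e_(i) = e_i/(1 - h_ii)
dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)   # Durbin-Watson, approx. 2(1 - r_1)
print(dw, cooks_d.max(), press)
```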

5 Lack of fit, transformations

• Pure error sum of squares: $PESS = \sum_{i=1}^k \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 = SS_{residual}^{full}$ with $k$ the number of groups

• Lack of fit sum of squares: $LFSS = \sum_{i=1}^k n_i(\bar{y}_i - \hat{y}_i)^2 = SS_{residual}^{restrict} - PESS$ for an assumed model

• Lack of fit test statistic: $\frac{LFSS/(k - \dim(\beta))}{PESS/(n - k)} \sim F_{k-\dim(\beta),\,n-k}$ for
  $H_{restricted}: \mu_{res} = \mu_{res}(x_i, \beta)$ vs. $H_{full}: \mu = \beta_1 I(x_i = x_1) + \ldots + \beta_k I(x_i = x_k)$

• Box-Cox transformations: find $\lambda$ such that $y_i^{(\lambda)} = \frac{y_i^\lambda - 1}{\lambda\, \tilde{y}_g^{\lambda-1}}$ minimizes $SS_{res}(\lambda)$, where $\tilde{y}_g$ is the geometric mean of $y$
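A grid-search sketch of this Box-Cox criterion, assuming NumPy, simulated positive responses, and the usual convention that $\lambda = 0$ means the log transform scaled by the geometric mean; the grid and data are made up.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(1, 10, size=40)
y = np.exp(0.3 * x + rng.normal(0, 0.3, size=40))   # positive, skewed response
X = np.column_stack([np.ones_like(x), x])
y_g = np.exp(np.mean(np.log(y)))                    # geometric mean of y

def ss_res(lam):
    # Normalized Box-Cox transform; lam -> 0 gives y_g * log(y).
    if abs(lam) < 1e-12:
        z = y_g * np.log(y)
    else:
        z = (y ** lam - 1) / (lam * y_g ** (lam - 1))
    b, *_ = np.linalg.lstsq(X, z, rcond=None)
    return np.sum((z - X @ b) ** 2)                 # SS_res(lambda)

grid = np.linspace(-2, 2, 81)
print(min(grid, key=ss_res))   # lambda near 0 suggests a log transform here
```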

6 Model selection

• Adjusted $R^2$: $R^2_{adj} = 1 - \frac{SS_{residual}/(n-k-1)}{SS_{total}/(n-1)}$

• Akaike Information Criterion: $AIC = n\log(SS_{residual}/n) + 2p$, where $p$ is the total number of parameters

• Bayesian Information Criterion: $BIC = n\log(SS_{residual}/n) + \log(n)\,p$, where $p$ is the total number of parameters
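A sketch comparing nested OLS fits with exactly these AIC/BIC forms (the additive Gaussian constant is dropped, as on the sheet); the design and candidate column subsets are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.5, -0.3, 0.0]) + rng.normal(0, 1, size=n)

for cols in ([0], [0, 1], [0, 1, 2], [0, 1, 2, 3]):     # intercept + subsets
    Xs = X[:, cols]
    b, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    ss = np.sum((y - Xs @ b) ** 2)                      # SS_residual
    p = len(cols)                                       # total parameter count
    aic = n * np.log(ss / n) + 2 * p
    bic = n * np.log(ss / n) + np.log(n) * p
    print(cols, round(aic, 1), round(bic, 1))
```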

7 Survival regression

• Survivor function: $S(t) = P(T \ge t) = 1 - F(t)$, with $T$ the survival time

• Hazard function: $h(t) = \lim_{\delta \to 0} \frac{P(t \le T < t + \delta \mid T \ge t)}{\delta} = \frac{f(t)}{S(t)}$, where $f(t) = F'(t)$

• Cumulative hazard: $H(t) = -\log(S(t))$

• Empirical survivor function: $S_e(t) = \frac{\text{number of individuals with survival times} \ge t}{\text{number of individuals in the data set}}$

• Average number of individuals at risk in interval $j$: $n_j' = n_j - c_j/2$, under the assumption that censored cases occur uniformly throughout the $j$th interval, with
  – $d_j$ = number of deaths,
  – $c_j$ = censored cases in the interval,
  – $n_j$ = number at risk (alive) at the start of the interval

• Probability of death in the $j$th interval: $d_j/n_j'$

• Probability of survival in the $j$th interval: $1 - d_j/n_j' = (n_j' - d_j)/n_j'$

• Life table estimator: $S_{life}(t) = \prod_{j=1}^k \frac{n_j' - d_j}{n_j'}$, $t \in [t_k, t_{k+1})$, $k = 1, \ldots, m$, with $[t_j, t_{j+1})$ the $j$th interval for $j = 1, \ldots, m$ intervals

• Kaplan-Meier estimator: $S_{KM}(t) = \prod_{j=1}^k \frac{n_j - d_j}{n_j}$, $t \in [t_{(k)}, t_{(k+1)})$, $k = 1, \ldots, r$

• Nelson-Aalen estimator: $S_{NA}(t) = \prod_{j=1}^k \exp(-d_j/n_j)$, $t \in [t_{(k)}, t_{(k+1)})$, $k = 1, \ldots, r$,
  with $t_{(1)} < t_{(2)} < \ldots < t_{(r)}$ the ordered, unique observed death times
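A step-by-step sketch of the Kaplan-Meier product (with the matching Nelson-Aalen factor), assuming NumPy and a small made-up right-censored sample where event = 1 marks a death.

```python
import numpy as np

time = np.array([3, 5, 5, 7, 8, 11, 12, 12, 15, 18])
event = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1])

S_km, S_na = 1.0, 1.0
for t in np.unique(time[event == 1]):              # t_(1) < ... < t_(r)
    n_j = np.sum(time >= t)                        # number at risk at t_(j)
    d_j = np.sum((time == t) & (event == 1))       # deaths at t_(j)
    S_km *= (n_j - d_j) / n_j                      # Kaplan-Meier factor
    S_na *= np.exp(-d_j / n_j)                     # Nelson-Aalen factor
    print(f"t={t}: S_KM={S_km:.3f}, S_NA={S_na:.3f}")
```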

• Greenwood's formula for a pointwise $(1-\alpha)$-CI for $S_{KM}(t)$: $S_{KM}(t) \pm z_{1-\alpha/2} \cdot se(S_{KM}(t))$, with
  $se(S_{KM}(t)) \approx S_{KM}(t)\left(\sum_{j=1}^k \frac{d_j}{n_j(n_j - d_j)}\right)^{1/2}$ for $t_{(k)} \le t < t_{(k+1)}$

• Median survival time $t(50)$: $t(50) = \min(\text{observed time} \mid S(\text{time}) \le 0.50)$

• Non-parametric test for differences in survival curves:
  $H_0$: Survival Group I $=$ Survival Group II vs. $H_A$: Survival Group I $\neq$ Survival Group II
  – $t_{(1)} < t_{(2)} < \ldots < t_{(r)}$ ordered, unique observed death times
  – $d_{1j}, d_{2j}$ number of deaths; $n_{1j}, n_{2j}$ number at risk at $t_{(j)}$ for Group I and Group II, respectively
  – Under $H_0$, $d_{1j}$ follows a hypergeometric distribution: $P(d_{1j} = d) = \frac{\binom{d_j}{d}\binom{n_j - d_j}{n_{1j} - d}}{\binom{n_j}{n_{1j}}}$ for $d = 0, \ldots, d_j$, with
    $e_{1j} = E[d_{1j}] = n_{1j}\,\frac{d_j}{n_j}$ and $v_{1j} = \mathrm{Var}(d_{1j}) = \frac{n_{1j}\, n_{2j}\, d_j (n_j - d_j)}{n_j^2 (n_j - 1)}$
  – Log-rank test: $U_L = \sum_{j=1}^r (d_{1j} - e_{1j})$, $\mathrm{Var}(U_L) = V_L = \sum_{j=1}^r v_{1j}$; under $H_0$: $\frac{U_L^2}{V_L} \sim \chi_1^2$
  – Wilcoxon test: $U_W = \sum_{j=1}^r n_j(d_{1j} - e_{1j})$, $\mathrm{Var}(U_W) = V_W = \sum_{j=1}^r n_j^2\, v_{1j}$; under $H_0$: $\frac{U_W^2}{V_W} \sim \chi_1^2$

• Cox proportional hazards model equation: $h(t|X) = P(T = t \mid T \ge t, X) = h_0(t)\exp(X\beta)$

• Hazard ratio for $X = 1$ vs. $X = 0$: $HR = \frac{h(t|X=1)}{h(t|X=0)} = \exp(\beta)$
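A sketch of the log-rank computation above, assuming NumPy/SciPy and a made-up two-group right-censored sample (group 0 plays the role of Group I); the Wilcoxon variant would weight each term by $n_j$.

```python
import numpy as np
from scipy import stats

time = np.array([3, 5, 7, 8, 11, 4, 6, 9, 13, 16])
event = np.array([1, 1, 1, 0, 1, 1, 0, 1, 1, 1])
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

U, V = 0.0, 0.0
for t in np.unique(time[event == 1]):              # unique death times t_(j)
    at_risk = time >= t
    n_j = at_risk.sum()
    n1j = (at_risk & (group == 0)).sum()
    d_j = ((time == t) & (event == 1)).sum()
    d1j = ((time == t) & (event == 1) & (group == 0)).sum()
    U += d1j - n1j * d_j / n_j                     # d_1j - e_1j
    if n_j > 1:                                    # v_1j needs n_j - 1 > 0
        V += n1j * (n_j - n1j) * d_j * (n_j - d_j) / (n_j ** 2 * (n_j - 1))

chi2 = U ** 2 / V                                  # U_L^2 / V_L ~ chi2_1 under H0
print(chi2, stats.chi2.sf(chi2, df=1))
```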

8 Linear mixed models

• Model: $Y_i = X_i\beta + Z_i b_i + e_i$ with $b_i \sim N_q(0_q, D)$, $e_i \sim N_{n_i}(0_{n_i}, R_i)$ independent;
  $Y_i \sim N(X_i\beta, V_i(\theta))$ with $V_i(\theta) = Z_i D Z_i' + R_i$

• Likelihood: $L(\beta, \theta) = \prod_{i=1}^N (2\pi)^{-n_i/2}\, |V_i(\theta)|^{-1/2} \exp\left(-\tfrac{1}{2}(Y_i - X_i\beta)'\, V_i(\theta)^{-1}\, (Y_i - X_i\beta)\right)$

• Estimator for fixed $\theta$: $\hat{\beta}(\theta) = \left(\sum_{i=1}^N X_i' W_i X_i\right)^{-1} \sum_{i=1}^N X_i' W_i Y_i$ with $W_i = V_i(\theta)^{-1}$

• Best linear unbiased predictor: $\hat{Y}_i = X_i\beta + Z_i D Z_i' V_i^{-1}(Y_i - X_i\beta)$

• Raw residuals for individual $i$: $r_i = Y_i - X_i\hat{\beta}$, with $\hat{\beta}$ the MLE for $\beta$

• Random effect predictions: $\hat{b}_i = \hat{D} Z_i' \hat{V}_i^{-1}(Y_i - X_i\hat{\beta})$
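A random-intercept sketch of the fixed-$\theta$ estimator above, assuming NumPy, $q = 1$, $R_i = \sigma^2 I$, and made-up variance components; it is a toy instance, not a general mixed-model fitter.

```python
import numpy as np

rng = np.random.default_rng(8)
N, n_i = 20, 5                                   # subjects, observations each
sigma2, d = 1.0, 0.5                             # error / random-intercept variance
Z = np.ones((n_i, 1))                            # Z_i for a random intercept
V = d * (Z @ Z.T) + sigma2 * np.eye(n_i)         # V_i = Z_i D Z_i' + R_i
W = np.linalg.inv(V)                             # W_i = V_i(theta)^{-1}

A = np.zeros((2, 2))
c = np.zeros(2)
for i in range(N):
    Xi = np.column_stack([np.ones(n_i), rng.normal(size=n_i)])
    Yi = Xi @ np.array([2.0, 0.7]) + rng.normal(0, np.sqrt(d)) + rng.normal(0, 1, n_i)
    A += Xi.T @ W @ Xi                           # sum_i X_i' W_i X_i
    c += Xi.T @ W @ Yi                           # sum_i X_i' W_i Y_i

print(np.linalg.solve(A, c))                     # beta_hat(theta)
```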

9 Logistic and Poisson regression

Logistic regression: $Y_i \sim \mathrm{Ber}(\pi(x_i, \beta))$, with $\pi(x_i, \beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$ and the logistic link function $\log\left(\frac{P(Y=1)}{1 - P(Y=1)}\right) = x_i'\beta$ for $i = 1, \ldots, n$

• Odds of $Y = 1$ for $X$: $\mathrm{Odds}(X) = \frac{P(Y=1|X)}{P(Y=0|X)} = \frac{P(Y=1|X)}{1 - P(Y=1|X)}$

• Odds ratio: $OR = \frac{\mathrm{Odds}(X=1)}{\mathrm{Odds}(X=0)} = \frac{P(Y=1|X=1)/P(Y=0|X=1)}{P(Y=1|X=0)/P(Y=0|X=0)} = \exp(\beta)$, with $\beta$ the coefficient for $X$.

• $I(b) = E[-D^2 \log(L(b))] = \sum_{k=1}^m n_k\, \pi(x_k, \beta)[1 - \pi(x_k, \beta)]\, x_k x_k'$ and asymptotically $\mathrm{Var}(b) \approx I(b)^{-1}$, where $b$ is the MLE of $\beta$ and $L(b)$ the likelihood evaluated at $b$

• $se(b_k) = \sqrt{\mathrm{Var}(b)_{kk}}$

• Wald test for individual coefficients $H_0: \beta_k = 0$: $Z = \frac{b_k}{se(b_k)} \sim N(0, 1)$

• $(1-\alpha)$-CI for $OR$: $\exp(b_k \pm z_{1-\alpha/2}\, se(b_k))$

• Likelihood ratio test for restricted vs. full model: $LRT = -2\log\left(\frac{L(b_{res})}{L(b)}\right) \sim \chi^2_a$, where $a = \dim(b) - \dim(b_{res})$

• Pearson chi-square: $\sum_{k=1}^m \frac{[y_k - n_k\hat{\pi}_k]^2}{n_k\hat{\pi}_k(1 - \hat{\pi}_k)} \sim \chi^2_{m-p}$, where $m$ is the number of constellations and $\hat{\pi}_k$ the estimated probabilities from a model with $p$ parameters

• Hosmer-Lemeshow statistic: $HL = \sum_{k=1}^G \frac{(o_k - n_k\bar{\pi}_k)^2}{n_k\bar{\pi}_k(1 - \bar{\pi}_k)} \sim \chi^2_{G-2}$, where $G$ is the number of percentile groups, $o_k$ the observed frequencies in group $k$, $n_k$ the number of observations in group $k$, and $\bar{\pi}_k$ the average estimated probability for group $k$

• Hat matrix: $H = V^{-1/2} X (X'V^{-1}X)^{-1} X' V^{-1/2}$, where $V^{-1} = \mathrm{diag}(n_i\hat{\pi}_i(1 - \hat{\pi}_i))$

• Standardized Pearson residuals: $r_i^s = \frac{r_i}{\sqrt{1 - h_{ii}}} = \frac{y_i - n_i\hat{\pi}_i}{\sqrt{n_i\hat{\pi}_i(1 - \hat{\pi}_i)}\sqrt{1 - h_{ii}}}$

• Cook's influence measure: $D_i = (b - b_{(i)})'(X'V^{-1}X)(b - b_{(i)}) \approx (r_i^s)^2\, \frac{h_{ii}}{1 - h_{ii}}$

Poisson regression: $Y \sim \mathrm{Pois}(\mu)$ with $\log(\mu) = \beta_0 + \beta_1 X_1 + \ldots + \beta_k X_k$, $k$ the number of covariates

• Poisson distribution $Y \sim \mathrm{Pois}(\mu)$: $P(Y = y) = \exp(-\mu)\,\frac{\mu^y}{y!}$, for $y = 0, 1, 2, 3, \ldots$ and $\mu > 0$

• $(1-\alpha)$-CI for the mean ratio $\exp(b_j)$ of covariate $j$: $\exp(b_j \pm z_{1-\alpha/2}\, se(b_j))$
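A Newton-Raphson sketch of the logistic MLE, Wald test, and OR interval above, assuming NumPy/SciPy and ungrouped simulated data ($n_i = 1$ per observation); the fixed iteration count is a simplification.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ np.array([-0.5, 1.0])))))

b = np.zeros(2)
for _ in range(25):                               # Newton-Raphson steps
    pi = 1 / (1 + np.exp(-(X @ b)))
    info = X.T @ ((pi * (1 - pi))[:, None] * X)   # I(b) = sum pi(1-pi) x x'
    b = b + np.linalg.solve(info, X.T @ (y - pi)) # score update

se = np.sqrt(np.diag(np.linalg.inv(info)))        # Var(b) approx. I(b)^{-1}
z = b / se                                        # Wald tests H0: beta_k = 0
or_ci = np.exp(b[1] + np.array([-1, 1]) * stats.norm.ppf(0.975) * se[1])
print(np.round(b, 3), np.round(z, 2), np.round(or_ci, 3))  # CI for OR = exp(beta_1)
```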

10 Spatio-temporal statistics

• Observations $\{Z(s_i; t_j)\}$ at spatial locations $\{s_i : i = 1, \ldots, m\}$ and times $\{t_j : j = 1, \ldots, T\}$

• At location $s_i$ and across all times, empirical mean: $\hat{\mu}_{z,s}(s_i) = \frac{1}{T}\sum_{j=1}^T Z(s_i; t_j)$ and covariance:
  $\hat{C}_z^{(0)}(s_i, s_k) = \frac{1}{T}\sum_{j=1}^T \left(Z(s_i; t_j) - \hat{\mu}_{z,s}(s_i)\right)\left(Z(s_k; t_j) - \hat{\mu}_{z,s}(s_k)\right)$

• At time $t_j$: $Z_{t_j} = (Z(s_1; t_j), \ldots, Z(s_m; t_j))'$, $\hat{\mu}_{z,s} = (\hat{\mu}_{z,s}(s_1), \ldots, \hat{\mu}_{z,s}(s_m))' = \frac{1}{T}\sum_{j=1}^T Z_{t_j} \in \mathbb{R}^m$

• For $\tau = 0, 1, \ldots, T-1$, empirical lag-$\tau$ covariance between two stations:
  $\hat{C}_z^{(\tau)}(s_i, s_k) = \frac{1}{T-\tau}\sum_{j=\tau+1}^T \left(Z(s_i; t_j) - \hat{\mu}_{z,s}(s_i)\right)\left(Z(s_k; t_{j-\tau}) - \hat{\mu}_{z,s}(s_k)\right)$; spatial covariance matrix:
  $\hat{C}_z^{(\tau)} = \{\hat{C}_z^{(\tau)}(s_i, s_k)\} = \frac{1}{T-\tau}\sum_{j=\tau+1}^T (Z_{t_j} - \hat{\mu}_{z,s})(Z_{t_{j-\tau}} - \hat{\mu}_{z,s})' \in \mathbb{R}^{m \times m}$

• Empirical lag-$\tau$ cross-covariance between outcomes $Z_{t_j} \in \mathbb{R}^m$ and $X_{t_j} \in \mathbb{R}^n$:
  $\hat{C}_{z,x}^{(\tau)} = \frac{1}{T-\tau}\sum_{j=\tau+1}^T (Z_{t_j} - \hat{\mu}_{z,s})(X_{t_{j-\tau}} - \hat{\mu}_{x,s})' \in \mathbb{R}^{m \times n}$

• Neighborhoods: $N_t(\tau)$ = pairs of time points within time lag $\tau$ of each other, $N_s(h)$ = pairs of locations within spatial lag $h$ of each other

• Empirical spatio-temporal covariogram:
  $\hat{C}_z(h; \tau) = \frac{1}{|N_s(h)|}\,\frac{1}{|N_t(\tau)|}\sum_{s_i, s_k \in N_s(h)}\ \sum_{t_j, t_l \in N_t(\tau)} \left(Z(s_i; t_j) - \hat{\mu}_{z,s}(s_i)\right)\left(Z(s_k; t_l) - \hat{\mu}_{z,s}(s_k)\right)$, with $|N(\cdot)|$ the number of elements (here pairs) in $N(\cdot)$

• Semivariogram: $\gamma_z(s_i, s_k; t_j, t_l) = \frac{1}{2}\,\mathrm{var}\left(Z(s_i; t_j) - Z(s_k; t_l)\right) \approx \hat{C}_z(0; 0) - \hat{C}_z(h; \tau)$

• Stationary semivariogram: $\gamma_z(h; \tau) = \frac{1}{2}\,\mathrm{var}(Z(s+h; t+\tau) - Z(s; t)) = \frac{1}{2}\,E\left[(Z(s+h; t+\tau) - Z(s; t))^2\right] \approx \hat{\gamma}_z(h; \tau) = \frac{1}{|N_s(h)|}\,\frac{1}{|N_t(\tau)|}\sum_{s_i, s_k \in N_s(h)}\ \sum_{t_j, t_l \in N_t(\tau)} (Z(s_i; t_j) - Z(s_k; t_l))^2$
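A sketch of the empirical mean and lag-$\tau$ covariance matrix above, assuming NumPy and a simulated data matrix Z whose rows are times $t_j$ and columns are stations $s_i$.

```python
import numpy as np

rng = np.random.default_rng(10)
T, m = 100, 4
Z = rng.normal(size=(T, m)) @ rng.normal(size=(m, m))   # correlated stations

mu = Z.mean(axis=0)                            # empirical means mu_hat_{z,s}

def C_hat(tau):
    # (1/(T - tau)) * sum_{j = tau+1}^T (Z_{t_j} - mu)(Z_{t_{j-tau}} - mu)'
    A = Z[tau:] - mu
    B = Z[:T - tau] - mu
    return A.T @ B / (T - tau)

print(np.round(C_hat(0), 2))                   # spatial covariance matrix
print(np.round(C_hat(1), 2))                   # lag-1 covariance matrix
```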

• Principal component analysis (PCA) decomposition: $\hat{C}_z^{(0)} = \Psi\Lambda\Psi'$, where $\Psi, \Lambda \in \mathbb{R}^{m \times m}$; empirical orthogonal functions (EOFs): the eigenvectors $\psi_k \in \mathbb{R}^m$ forming the columns of $\Psi$; eigenvalues: the diagonal elements $\lambda_k \in \mathbb{R}$ of $\Lambda$ for $k = 1, \ldots, m$

• $k$th principal component time series: $a_k = \{a_k(t_j) : j = 1, \ldots, T\}$ with $a_k(t_j) = \psi_k' Z_{t_j} \in \mathbb{R}$

• $k$th canonical correlation for $k = 1, \ldots, \min(m, n)$:
  $r_k = \mathrm{cor}(a_k, b_k) = \frac{\mathrm{cov}(a_k, b_k)}{\sqrt{\mathrm{var}(a_k)}\sqrt{\mathrm{var}(b_k)}} = \frac{\xi_k'\, \hat{C}_{z,x}^{(0)}\, \psi_k}{\left(\xi_k'\, \hat{C}_z^{(0)}\, \xi_k\right)^{1/2}\left(\psi_k'\, \hat{C}_x^{(0)}\, \psi_k\right)^{1/2}} \in \mathbb{R}$,
  where cov and var are calculated over the time domain, $a_k, b_k \in \mathbb{R}^T$ with elements $a_k(t_j) = \xi_k' Z_{t_j} \in \mathbb{R}$, $b_k(t_j) = \psi_k' X_{t_j} \in \mathbb{R}$, and $\xi_k, \psi_k$ are optimal weights
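An EOF sketch using the eigendecomposition above, assuming NumPy and the same kind of time-by-station matrix Z as in the covariance sketch; centering Z before projecting is a choice of this sketch.

```python
import numpy as np

rng = np.random.default_rng(10)
T, m = 200, 4
Z = rng.normal(size=(T, m)) @ rng.normal(size=(m, m))   # correlated stations
A = Z - Z.mean(axis=0)
C0 = A.T @ A / T                              # empirical covariance C_hat_z^(0)

lam, Psi = np.linalg.eigh(C0)                 # C0 = Psi diag(lam) Psi'
order = np.argsort(lam)[::-1]                 # eigenvalues in decreasing order
lam, Psi = lam[order], Psi[:, order]

a1 = A @ Psi[:, 0]                            # 1st PC time series a_1(t_j)
print(np.round(lam / lam.sum(), 3))           # variance fraction per EOF
```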
• Long data format: $Z(s_{11}; t_1), Z(s_{21}; t_1), \ldots, Z(s_{m_1 1}; t_1), \ldots, Z(s_{1T}; t_T), Z(s_{2T}; t_T), \ldots, Z(s_{m_T T}; t_T)$

• Inverse distance weighting:
  $\hat{Z}(s_0; t_0) = \sum_{j=1}^T \sum_{i=1}^{m_j} w_{ij}(s_0; t_0)\, Z(s_{ij}; t_j)$, $w_{ij}(s_0; t_0) = \frac{\tilde{w}_{ij}(s_0; t_0)}{\sum_{k=1}^T \sum_{l=1}^{m_k} \tilde{w}_{lk}(s_0; t_0)}$,
  $\tilde{w}_{ij}(s_0; t_0) = \frac{1}{d\left((s_{ij}; t_j), (s_0; t_0)\right)^\alpha}$ with $d(\cdot, \cdot)$ a distance and $\alpha$ a smoothing parameter

• Kernel predictors: $\tilde{w}_{ij}(s_0; t_0) = k((s_{ij}; t_j), (s_0; t_0); \theta)$ with $k$ a kernel function and $\theta$ a bandwidth parameter
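An inverse-distance-weighting sketch, assuming NumPy, a Euclidean distance over (x, y, t) triples (one possible choice of $d$), and made-up observations.

```python
import numpy as np

pts = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 1.0],
                [0.0, 1.0, 2.0], [1.0, 1.0, 2.0]])   # (x, y, t) of Z(s_ij; t_j)
z = np.array([1.0, 2.0, 1.5, 3.0])                   # observed values
target = np.array([0.5, 0.5, 1.5])                   # prediction point (s_0; t_0)
alpha = 2.0                                          # smoothing parameter

d = np.linalg.norm(pts - target, axis=1)             # d((s_ij; t_j), (s_0; t_0))
w_tilde = 1.0 / d ** alpha                           # unnormalized weights
w = w_tilde / w_tilde.sum()                          # normalized weights w_ij
print(w @ z)                                         # prediction Z_hat(s_0; t_0)
```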
• Regression or trend surface estimation: $Z(s_i; t_j) = \beta_0 + \beta_1 X_1(s_i; t_j) + \cdots + \beta_p X_p(s_i; t_j) + e(s_i; t_j)$ with $e(s_i; t_j) \sim N(0, \sigma_e^2)$ independent

• Residual sum of squares: $RSS = \sum_{j=1}^T \sum_{i=1}^m \left(Z(s_i; t_j) - \hat{Z}(s_i; t_j)\right)^2$ with fitted values $\hat{Z}(s_i; t_j) = \hat{\beta}_0 + \hat{\beta}_1 X_1(s_i; t_j) + \cdots + \hat{\beta}_p X_p(s_i; t_j)$ and ordinary least squares estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p$

• F-test of $H_0$: spatio-temporal independence: $F = \frac{\hat{\gamma}_e(\|h_1\|; \tau_1)}{\hat{\sigma}_e^2} - 1$, with $\hat{\gamma}_e(\|h_1\|; \tau_1)$ the empirical semivariogram estimate at the smallest spatial ($\|h_1\|$) and temporal ($\tau_1$) lags and $\hat{\sigma}_e^2$ the residual error estimate
• Data model: $Z = Y + \epsilon$ with $Y = (Y(s_{11}; t_1), \ldots, Y(s_{m_T T}; t_T))'$ and $\epsilon = (\epsilon(s_{11}; t_1), \ldots, \epsilon(s_{m_T T}; t_T))'$ random

• Process model: $Y = \mu + \eta$ with $\mu = (\mu(s_{11}; t_1), \ldots, \mu(s_{m_T T}; t_T))' = X\beta$ fixed and
  $\eta = (\eta(s_{11}; t_1), \ldots, \eta(s_{m_T T}; t_T))'$ random, independent of $\epsilon$

• Covariances: $\mathrm{Cov}(Y) = C_\eta$, $\mathrm{Cov}(Z) = C_Z = C_\eta + C_\epsilon$

• Estimate: $\hat{\beta}_{gls} = (X' C_Z^{-1} X)^{-1} X' C_Z^{-1} Z$

• Kriging at location $(s_0; t_0)$: $c_0' = \mathrm{cov}(Y(s_0; t_0), Z) \in \mathbb{R}^{1 \times \sum_{j=1}^T m_j}$, $c_{0,0} = \mathrm{var}(Y(s_0; t_0)) \in \mathbb{R}$,
  predictor: $\hat{Y}(s_0; t_0) = x(s_0; t_0)'\hat{\beta}_{gls} + c_0' C_Z^{-1}(Z - X\hat{\beta}_{gls})$, variance: $\sigma^2_{Y,uk}(s_0; t_0) = c_{0,0} - c_0' C_Z^{-1} c_0 + \kappa$
  with $\kappa = \left(x(s_0; t_0) - X' C_Z^{-1} c_0\right)' (X' C_Z^{-1} X)^{-1} \left(x(s_0; t_0) - X' C_Z^{-1} c_0\right)$, standard error: $\sigma_{Y,uk}(s_0; t_0)$
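A universal-kriging sketch of the predictor and variance above, assuming NumPy, a stationary exponential covariance over combined space-time distance (a modeling assumption), and simulated one-dimensional locations.

```python
import numpy as np

rng = np.random.default_rng(11)
pts = rng.uniform(0, 10, size=(30, 2))              # columns: location s, time t
X = np.column_stack([np.ones(30), pts])             # trend covariates x(s; t)

def cov_fn(a, b):
    return np.exp(-np.linalg.norm(a - b) / 2.0)     # assumed c(h; tau)

CZ = np.array([[cov_fn(p, q) for q in pts] for p in pts]) + 0.1 * np.eye(30)
Z = rng.multivariate_normal(X @ np.array([1.0, 0.2, 0.1]), CZ)

s0 = np.array([5.0, 5.0])                           # target (s_0; t_0)
x0 = np.array([1.0, *s0])
c0 = np.array([cov_fn(s0, p) for p in pts])
CZinv = np.linalg.inv(CZ)

beta_gls = np.linalg.solve(X.T @ CZinv @ X, X.T @ CZinv @ Z)
y_hat = x0 @ beta_gls + c0 @ CZinv @ (Z - X @ beta_gls)
kv = x0 - X.T @ CZinv @ c0                          # inner vector of kappa
var_uk = cov_fn(s0, s0) - c0 @ CZinv @ c0 + kv @ np.linalg.solve(X.T @ CZinv @ X, kv)
print(y_hat, np.sqrt(var_uk))                       # predictor and standard error
```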

• Spatio-temporal covariance function: $c^*(s, s'; t, t') = \mathrm{cov}(Y(s; t), Y(s'; t'))$

• Second-order or weakly stationary: constant expectation and $c^*(s, s'; t, t') = c^*(s - s'; t - t') = c(h; \tau)$ with $h = s - s'$, $\tau = t - t'$

• Spatio-temporal correlation function: $\rho(h; \tau) = \frac{c(h; \tau)}{c(0; 0)}$
• Dynamic spatio-temporal model (DSTM): $Z_t(\cdot) = \mathcal{H}_t(Y_t(\cdot), \theta_{d,t}, \epsilon_t(\cdot))$, $t = 1, \ldots, T$; $\cdot$ denotes a spatial location

• First-order Markov model: $Y_t(\cdot) = \mathcal{M}(Y_{t-1}(\cdot), \theta_{p,t}, \eta_t(\cdot))$

• Latent linear Gaussian DSTM: $Z_t = b_t + H_t Y_t + \epsilon_t$, $\epsilon_t \sim \mathrm{Gau}(0, C_{\epsilon,t})$

• Continuous first-order spatio-temporal integro-difference equation (IDE) process:
  $Y_t(s) = \int_{D_s} m(s, x; \theta_p)\, Y_{t-1}(x)\, dx + \eta_t(s)$

• Discretized IDE process: $Y_t(s_i) = \sum_{j=1}^n m_{ij}(\theta_p)\, Y_{t-1}(s_j) + \eta_t(s_i)$; matrix form: $Y_t = M Y_{t-1} + \eta_t$
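A simulation sketch of the discretized IDE / linear DSTM above, assuming NumPy, a Gaussian transition kernel for $m_{ij}$, and a one-dimensional grid; the spectral-radius rescaling keeps the toy process stable.

```python
import numpy as np

rng = np.random.default_rng(12)
s = np.linspace(0, 1, 50)                              # spatial grid s_1, ..., s_n
M = np.exp(-((s[:, None] - s[None, :]) ** 2) / 0.01)   # kernel m_ij(theta_p)
M *= 0.95 / np.abs(np.linalg.eigvals(M)).max()         # rescale for stability

Y = np.zeros(50)
for t in range(100):
    Y = M @ Y + rng.normal(0, 0.1, size=50)        # Y_t = M Y_{t-1} + eta_t
Z = Y + rng.normal(0, 0.05, size=50)               # data model Z_t = Y_t + eps_t
print(np.round(Z[:5], 3))
```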

• Posterior predictive distribution: $[Z_{ppd} \mid Z] = \int\!\!\int [Z_{ppd} \mid Y, \theta]\,[Y, \theta \mid Z]\, dY\, d\theta$

• Prior predictive distribution: $[Z_{pri}] = \int\!\!\int [Z_{pri} \mid Y, \theta]\,[Y \mid \theta]\,[\theta]\, dY\, d\theta$

• Empirical predictive distribution: $[Z_{epd} \mid Z] = \int [Z_{epd} \mid Y, \hat{\theta}]\,[Y \mid Z, \hat{\theta}]\, dY$

• Empirical marginal distribution: $[Z_{emp}] = \int [Z_{emp} \mid Y, \hat{\theta}]\,[Y \mid \hat{\theta}]\, dY$

• Mean squared prediction error: $MSPE = \frac{1}{Tm}\sum_{j=1}^T \sum_{i=1}^m \{Z_\nu(s_i; t_j) - \hat{Z}_\nu(s_i; t_j)\}^2$

• Mean absolute prediction error: $MAPE = \frac{1}{Tm}\sum_{j=1}^T \sum_{i=1}^m |Z_\nu(s_i; t_j) - \hat{Z}_\nu(s_i; t_j)|$
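A scoring sketch of MSPE and MAPE, assuming NumPy and made-up $T \times m$ arrays of held-out observations and predictions.

```python
import numpy as np

Z_val = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]])   # Z_nu(s_i; t_j)
Z_hat = np.array([[1.1, 1.8], [1.4, 2.9], [2.2, 2.7]])   # predictions

print(np.mean((Z_val - Z_hat) ** 2))     # MSPE
print(np.mean(np.abs(Z_val - Z_hat)))    # MAPE
```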

• Model averaging: $[g \mid Z] = \sum_{l=1}^L [g \mid Z, M_l]\, P(M_l \mid Z)$, $P(M_l \mid Z) = \frac{[Z \mid M_l]\, P(M_l)}{\sum_{j=1}^L [Z \mid M_j]\, P(M_j)}$, with prior model probability $P(M_l)$ and marginal likelihood $[Z \mid M_l] = \int\!\!\int [Z \mid Y, \theta, M_l]\,[Y \mid \theta, M_l]\,[\theta \mid M_l]\, dY\, d\theta$

• Bayes factor: $B_{l,k}(Z) = \frac{[Z \mid M_l]}{[Z \mid M_k]}$

• Akaike information criterion: $AIC(M_l) = -2\log[Z \mid \hat{\theta}, M_l] + 2p_l$, where $p_l$ is the number of parameters

• Bayesian information criterion: $BIC(M_l) = -2\log[Z \mid \hat{\theta}, M_l] + \log(m^*)\, p_l$, where $m^*$ is the sample size
