Distribution of Quadratic Forms
Theorem 1 (Fisher–Cochran Theorem): Let yi, i = 1, 2, …, n be n independent standard normal variates, i.e. yi ~ N(0, 1), and let Y′ = (y1, y2, …, yn). Then Y ~ N(0, I). Let Qi, i = 1, 2, …, k be k quadratic forms in (y1, y2, …, yn) with ranks n1, n2, …, nk such that
Y′Y = Q1 + Q2 + ⋯ + Qk.
Then a necessary and sufficient condition for the Qi (i = 1, 2, …, k) to be independently distributed as χ² variates with ni (i = 1, 2, …, k) degrees of freedom is that n = ∑_{i=1}^{k} ni.
Proof: Necessity: Given that the Qi, i = 1, 2, …, k are independent χ² variates with ni (i = 1, 2, …, k) degrees of freedom; to prove that n = ∑_{i=1}^{k} ni.
Since Q = Y′Y = Q1 + Q2 + ⋯ + Qk, the L.H.S. of this equation is a χ² variate with n degrees of freedom, as the vector Y is of order n and each of its components is a standard normal variate. Also, the Qi are independent χ² variates with ni degrees of freedom. Hence, by the additive property of χ² variates, Q1 + Q2 + ⋯ + Qk is a χ² variate with n1 + n2 + ⋯ + nk d.f. Consequently, n = ∑_{i=1}^{k} ni.
Sufficiency: Here it is given that n = ∑_{i=1}^{k} ni; to prove that the Qi, i = 1, 2, …, k are independent χ² variates with ni (i = 1, 2, …, k) degrees of freedom.
To prove this, consider a quadratic form Y′AY of rank m, i.e. r(A) = m ≤ n. Since the matrix associated with a quadratic form is symmetric, there exists an orthogonal matrix T such that T′AT = Λ, where Λ is a diagonal matrix whose diagonal elements are the characteristic roots of A. We now define a diagonal matrix S whose ith diagonal element is
(i) (λi)^(−1/2) if λi > 0,
(ii) (−λi)^(−1/2) if λi < 0,
and (iii) 1 if λi = 0.
Then S′T′ATS = S′ΛS. For example, if λ1 = −2, then s1 = (2)^(−1/2), and hence s1λ1s1 = (2)^(−1/2)(−2)(2)^(−1/2) = −1. Thus S′T′ATS = diag(δi), where each δi is +1, −1 or 0.
Now let P = TS, which is a non-singular matrix. Then P′AP = diag(δi). Since r(A) = m, exactly (n − m) characteristic roots of A are zero; without loss of generality, let these come last. Then, putting Y = PX, we have
Y′AY = (PX)′A(PX) = X′P′APX = ±x1² ± x2² ± ⋯ ± x_m².
In our case, Q1 is a quadratic form of rank n1, hence it can be expressed as
Q1 = ±x1² ± x2² ± ⋯ ± x_{n1}².
As Y = PX ⇒ X = P⁻¹Y, each xi is a linear combination of the components of Y, i.e.
x1 = b11 y1 + b12 y2 + ⋯ + b1n yn
⋮
x_{n1} = b_{n1,1} y1 + b_{n1,2} y2 + ⋯ + b_{n1,n} yn.
Similarly, Q2 = ±x_{n1+1}² ± x_{n1+2}² ± ⋯ ± x_{n1+n2}²,
where
x_{n1+1} = b_{n1+1,1} y1 + b_{n1+1,2} y2 + ⋯ + b_{n1+1,n} yn
⋮
x_{n1+n2} = b_{n1+n2,1} y1 + b_{n1+n2,2} y2 + ⋯ + b_{n1+n2,n} yn.
Proceeding this way,
Qi = ±x_{n1+⋯+n_{i−1}+1}² ± x_{n1+⋯+n_{i−1}+2}² ± ⋯ ± x_{n1+⋯+n_{i−1}+ni}²,
where
x_{n1+⋯+ni} = b_{n1+⋯+ni,1} y1 + b_{n1+⋯+ni,2} y2 + ⋯ + b_{n1+⋯+ni,n} yn.
This is true for all i = 1, 2, …, k. Since n = ∑_{i=1}^{k} ni, the n linear relationships can be put into the compact form X = BY.
Also, since
Q1 + Q2 + ⋯ + Qk = Y′A1Y + Y′A2Y + ⋯ + Y′AkY
= ±x1² ± ⋯ ± x_{n1}² ± x_{n1+1}² ± ⋯ ± x_{n1+n2}² ± ⋯ ± x_{n1+⋯+n_{k−1}+1}² ± ⋯ ± x_{n1+⋯+n_{k−1}+nk}²
= X′ΔX, where Δ is a diagonal matrix with elements +1 or −1. We also have
Q = Y′Y = (B⁻¹X)′(B⁻¹X) = X′(B⁻¹)′B⁻¹X = X′(BB′)⁻¹X.
Again, since Q = Q1 + Q2 + ⋯ + Qk,
X′(BB′)⁻¹X = X′ΔX.
Hence (BB′)⁻¹ = Δ. Now, BB′ is a positive definite matrix, so (BB′)⁻¹ is positive definite as well, and the diagonal elements of a positive definite matrix are all positive; consequently all the elements of Δ are +1, i.e. Δ = Iₙ. This implies (BB′)⁻¹ = I, which further implies that BB′ = I, i.e. B is an orthogonal matrix and X = BY is an orthogonal transformation. Hence X ~ N(0, BB′) ≡ N(0, I), i.e. xi, i = 1, …, n are independent standard normal variates.
Since n = ∑_{i=1}^{k} ni and Δ = Iₙ, all the signs are positive, so ∑_{i=1}^{k} Qi = x1² + x2² + ⋯ + xn². Also, we have shown that
Q1 reduces to x1² + x2² + ⋯ + x_{n1}²,
Q2 reduces to x_{n1+1}² + x_{n1+2}² + ⋯ + x_{n1+n2}²,
⋮
Qk reduces to x_{n1+⋯+n_{k−1}+1}² + ⋯ + x_{n1+⋯+nk}².
Since the xi are independent standard normal variates, each xi² is a χ² variate with 1 d.f., and therefore
Q1 is a χ² variate with n1 degrees of freedom,
Q2 is a χ² variate with n2 degrees of freedom,
⋮
Qk is a χ² variate with nk degrees of freedom.
They are also all independently distributed, since x1, x2, …, x_{n1+⋯+nk} are all independent. Hence the theorem.
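As a numerical illustration (my own sketch, not part of the original proof; it assumes numpy and scipy are available), take Q1 = nȳ² of rank 1 and Q2 = ∑(yi − ȳ)² of rank n − 1, so that Y′Y = Q1 + Q2 with ranks adding to n; the theorem then says Q1 ~ χ²(1) and Q2 ~ χ²(n − 1) independently, which the simulation below checks.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 10, 20_000

# Q1 = Y'A1Y with A1 = J/n (rank 1); Q2 = Y'A2Y with A2 = I - J/n (rank n-1).
# Ranks add to n, so Fisher-Cochran gives Q1 ~ chi2(1), Q2 ~ chi2(n-1), independent.
Y = rng.standard_normal((reps, n))
Q1 = n * Y.mean(axis=1) ** 2
Q2 = ((Y - Y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

print(np.allclose(Q1 + Q2, (Y ** 2).sum(axis=1)))      # Y'Y = Q1 + Q2 exactly
print(stats.kstest(Q1, 'chi2', args=(1,)).pvalue)      # consistent with chi2(1)
print(stats.kstest(Q2, 'chi2', args=(n - 1,)).pvalue)  # consistent with chi2(n-1)
print(np.corrcoef(Q1, Q2)[0, 1])                       # near 0: independence
```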
Theorem 2: Let Y be a multivariate normal vector with mean zero and variance I, i.e. Y ~ N(0, I), and let Y′AY be a quadratic form of rank k. Then a necessary and sufficient condition for the quadratic form Y′AY to be distributed as a χ² variate with k degrees of freedom is that A is an idempotent matrix.
Proof: Sufficiency: Given that A is an idempotent matrix; to prove that Y′AY is a χ² variate with k d.f.
Since A is idempotent, its characteristic roots are either 0 or 1. Also, since the rank of A is k, k of its roots are 1 and (n − k) of them are 0.
Without loss of generality, let us assume that the first k characteristic roots are 1, while the last (n − k) roots are 0. Further, since A is a symmetric matrix, there exists an orthogonal matrix C such that

C′AC = [ Iₖ  0 ]
       [ 0   0 ].

Let Y = CX. Then Y′AY = X′C′ACX = x1² + x2² + ⋯ + xk².
Since Y = CX and Y ~ N(0, I), i.e. E(Y) = 0 and V(Y) = I, we have X = C⁻¹Y, so that
E(X) = C⁻¹E(Y) = 0
and V(X) = V(C⁻¹Y) = C⁻¹V(Y)(C⁻¹)′ = C′IC [since C is orthogonal, C⁻¹ = C′] = C′C = I.
∴ X ~ N(0, I) ⇒ x1, x2, …, xk are independent standard normal variates, and
Y′AY = ∑_{i=1}^{k} xi² ~ χ² with k degrees of freedom.
Necessity: Given that Y′AY is a χ² variate with k d.f.; to prove that A is an idempotent matrix.
Since A is the matrix associated with the quadratic form Y′AY, A is a symmetric matrix, and therefore there exists an orthogonal matrix C such that

C′AC = diag(λ1, λ2, …, λm, 0, …, 0),

where λ1, λ2, …, λm are the non-zero characteristic roots of the matrix A and the remaining (n − m) characteristic roots are zero. Let Y = CX.
∴ Y′AY = X′C′ACX = X′ diag(λ1, …, λm, 0, …, 0) X = λ1x1² + λ2x2² + ⋯ + λm x_m².   (1)
Also, since Y = CX is an orthogonal transformation, the xi, i = 1, 2, …, n are independent standard normal variates, i.e. each xi² is a χ² variate with 1 d.f., whose characteristic function is (1 − 2it)^(−1/2). The characteristic function of λj xj² is therefore (1 − 2iλj t)^(−1/2), and since x1, x2, …, xm are independently distributed, the characteristic function of λ1x1² + λ2x2² + ⋯ + λm x_m² equals

∏_{j=1}^{m} (1 − 2iλj t)^(−1/2).

Further, since Y′AY is a χ² variate with k d.f., its characteristic function is (1 − 2it)^(−k/2).
From (1), we have

(1 − 2it)^(−k/2) = ∏_{j=1}^{m} (1 − 2iλj t)^(−1/2).

This equality must hold for all t, and matching the factors on the two sides shows it can hold only when m = k and every λj = 1. Hence

Y′AY = x1² + x2² + ⋯ + xk² and

C′AC = [ Iₖ  0 ]
       [ 0   0 ].
Also, C′A²C = C′A(CC′)AC = (C′AC)(C′AC) [since CC′ = I]

= [ Iₖ  0 ] [ Iₖ  0 ]  =  [ Iₖ  0 ]  = C′AC,
  [ 0   0 ] [ 0   0 ]     [ 0   0 ]

which implies A² = A. Hence, A is an idempotent matrix.
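As a numerical companion to Theorem 2 (a sketch of my own, assuming numpy and scipy; the projection matrix X(X′X)⁻¹X′ is simply a convenient idempotent matrix of rank k, not taken from the text), the code below verifies idempotency and checks by simulation that Y′AY behaves like a χ² variate with k d.f. when Y ~ N(0, I).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k, reps = 12, 3, 20_000

# A = X(X'X)^{-1}X' is symmetric, idempotent, and has rank k.
X = rng.standard_normal((n, k))
A = X @ np.linalg.solve(X.T @ X, X.T)
print(np.allclose(A @ A, A))                      # idempotent
print(np.linalg.matrix_rank(A))                   # rank k (k unit eigenvalues)

Y = rng.standard_normal((reps, n))
Q = np.einsum('ri,ij,rj->r', Y, A, Y)             # Q = Y'AY per replicate
print(stats.kstest(Q, 'chi2', args=(k,)).pvalue)  # consistent with chi2(k)
```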
Theorem 3: Let Q1 = Y′A1Y be a χ² variate with n1 d.f. and Q2 = Y′A2Y be another χ² variate with n2 d.f. Then a necessary and sufficient condition for them to be independently distributed is A1A2 = 0.
Proof: Sufficiency: Here it is given that A1A2 = 0; to prove that Q1 and Q2 are independently distributed. Now, Q1 = Y′A1Y being a χ² variate with n1 d.f. implies, by Theorem 2, that A1 is an idempotent matrix; similarly, Q2 = Y′A2Y being a χ² variate with n2 d.f. implies that A2 is also idempotent. Note also that A2A1 = (A1A2)′ = 0, since A1 and A2 are symmetric.
Further,
(I − A1 − A2)(I − A1 − A2)
= I − A1 − A2 − A1 + A1² + A1A2 − A2 + A2A1 + A2²
= I − A1 − A2 − A1 + A1 + A1A2 − A2 + A2A1 + A2
= I − A1 − A2 [∵ A1A2 = 0 and A2A1 = 0].
Hence (I − A1 − A2) is an idempotent matrix.
Now, I = A1 + A2 + (I − A1 − A2). Hence
Y′Y = Y′A1Y + Y′A2Y + Y′(I − A1 − A2)Y.
Since A1, A2 and (I − A1 − A2) are idempotent matrices,
n = tr(I) = tr(A1) + tr(A2) + tr(I − A1 − A2) = r(A1) + r(A2) + r(I − A1 − A2),
i.e. the ranks of the three quadratic forms add to n. This implies, by the Fisher–Cochran Theorem, that Y′A1Y and Y′A2Y are independently distributed as χ² variates with n1 and n2 d.f. respectively.
Necessity: Given that Q1 and Q2 are independently distributed; to prove that A1A2 = 0.
Since Q1 and Q2 are independent χ² variates with n1 and n2 d.f. respectively, by the additive property of χ² variates Q1 + Q2 is a χ² variate with n1 + n2 d.f. Hence, by Theorem 2, A1 + A2 is an idempotent matrix.
∴ A1 + A2 = (A1 + A2)(A1 + A2)
= A1² + A1A2 + A2A1 + A2²
= A1 + A1A2 + A2A1 + A2,
so that A1A2 + A2A1 = 0. Pre-multiplying this by A1 gives A1A2 + A1A2A1 = 0, and post-multiplying it by A1 gives A1A2A1 + A2A1 = 0; subtracting, A1A2 = A2A1, and therefore 2A1A2 = 0, i.e. A1A2 = 0.
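A quick numerical check of the sufficiency direction (my own illustration, assuming numpy; the two projectors are built from disjoint sets of orthonormal columns so that A1A2 = 0 holds by construction):

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 8, 20_000

# Projections onto disjoint orthonormal column sets satisfy A1A2 = 0.
Qmat, _ = np.linalg.qr(rng.standard_normal((n, n)))
A1 = Qmat[:, :2] @ Qmat[:, :2].T    # rank-2 idempotent matrix
A2 = Qmat[:, 2:5] @ Qmat[:, 2:5].T  # rank-3 idempotent matrix
print(np.allclose(A1 @ A2, 0.0))    # A1A2 = 0

Y = rng.standard_normal((reps, n))
Q1 = np.einsum('ri,ij,rj->r', Y, A1, Y)
Q2 = np.einsum('ri,ij,rj->r', Y, A2, Y)
print(np.corrcoef(Q1, Q2)[0, 1])    # ~0, consistent with independence
```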
Theorem 4: Let Y′Y = Y′A1Y + ⋯ + Y′AkY. Then either of the following conditions is necessary and sufficient for each Y′AiY to be distributed as a χ² variate with ni d.f., where ni is the rank of Ai, and for the set Y′A1Y, …, Y′AiY, …, Y′AkY to be independently distributed:
(a) A1, A2, …, Ak are all idempotent matrices;
(b) AiAj = 0 for all i ≠ j.
Proof: (a) Ai, i = 1, …, k are idempotent matrices; hence r(Ai) = tr(Ai), i = 1, …, k. Further, since Y′Y = Y′IY = Y′(A1 + ⋯ + Ak)Y,
∴ I = A1 + ⋯ + Ak
and tr(I) = tr(A1) + ⋯ + tr(Ak),
i.e. n = r(I) = r(A1) + ⋯ + r(Ak) = n1 + ⋯ + nk.
Thus, by the Fisher–Cochran Theorem, Y′A1Y, …, Y′AiY, …, Y′AkY are independently distributed as χ² variates with n1, …, nk d.f.
(b) Given AiAj = 0 for all i ≠ j. We have already shown that
I = A1 + ⋯ + Ak.
Post-multiplying this equation on both sides by Aj,
Aj = A1Aj + ⋯ + AjAj + ⋯ + AkAj,
or Aj = Aj² [since AiAj = 0 for all i ≠ j],
i.e. Aj is an idempotent matrix. This is the same condition as in (a), which has already been proved. Hence the theorem.
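As an illustration of Theorem 4 (my own example, assuming numpy; the three projectors below, grand mean, centered regression and residual, are not from the text but satisfy both conditions), decompose I into A1 + A2 + A3 and check idempotency, mutual orthogonality and the rank (trace) additivity used in the proof:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])  # includes intercept

H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix: projection onto col(X)
A1 = np.full((n, n), 1.0 / n)           # grand-mean projector, rank 1
A2 = H - A1                             # centered-regression projector, rank 2
A3 = np.eye(n) - H                      # residual projector, rank n - 3

for A in (A1, A2, A3):                  # condition (a): all idempotent
    print(np.allclose(A @ A, A), round(np.trace(A)))  # rank = trace
print(np.allclose(A1 @ A2, 0.0), np.allclose(A2 @ A3, 0.0),
      np.allclose(A1 @ A3, 0.0))        # condition (b): AiAj = 0 for i != j
print(round(np.trace(A1 + A2 + A3)))    # ranks add to n
```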
Tests of Hypotheses
Let (Y, Xβ, σ²I) be a general linear model, where Y is a multivariate normal variate. For testing a hypothesis we assume that the errors are normally distributed. In the full-rank case, the hypothesis is
H0: β = β0.
Theorem: In the general linear model Y = Xβ + e, where Y ~ N(Xβ, σ²I), the quantity

u = (n − p)Q2 / (pQ1)

is distributed as a non-central F with p and n − p d.f. and non-centrality parameter

λ = (β − β0)′S(β − β0) / (2σ²),

where Q1 + Q2 is the minimum with respect to β of (Y − Xβ)′(Y − Xβ) when H0 (i.e. β = β0) is true, and Q1 is the minimum of (Y − Xβ)′(Y − Xβ) when there is no restriction on β [i.e. Q1 + Q2 = min_{β = β0} (Y − Xβ)′(Y − Xβ)].
Proof: The unrestricted (unconditional) residual (error) sum of squares is
SSE = (Y − Xβ̂)′(Y − Xβ̂),
where β̂ is the least-squares estimator of β; the restricted (conditional) error sum of squares under the restriction or condition β = β0 is
SSE* = (Y − Xβ0)′(Y − Xβ0)
= (Y − Xβ̂ + Xβ̂ − Xβ0)′(Y − Xβ̂ + Xβ̂ − Xβ0)   [adding and subtracting Xβ̂]
= [(Y − Xβ̂) + X(β̂ − β0)]′ [(Y − Xβ̂) + X(β̂ − β0)]
= (Y − Xβ̂)′(Y − Xβ̂) + (Y − Xβ̂)′X(β̂ − β0) + (β̂ − β0)′X′(Y − Xβ̂) + (β̂ − β0)′X′X(β̂ − β0)
= (Y − Xβ̂)′(Y − Xβ̂) + (Y′X − β̂′X′X)(β̂ − β0) + (β̂ − β0)′(X′Y − X′Xβ̂) + (β̂ − β0)′X′X(β̂ − β0).
Substituting X′X = S and β̂ = S⁻¹X′Y, so that β̂′ = Y′X(S⁻¹)′ = Y′XS⁻¹ [since S⁻¹ is symmetric], the two cross terms vanish:
SSE* = (Y − Xβ̂)′(Y − Xβ̂) + (Y′X − Y′XS⁻¹S)(β̂ − β0) + (β̂ − β0)′(X′Y − SS⁻¹X′Y) + (β̂ − β0)′S(β̂ − β0)
= (Y − Xβ̂)′(Y − Xβ̂) + (β̂ − β0)′S(β̂ − β0)
= SSE + sum of squares due to regression
= Q1 + Q2, i.e. Q1 is the SSE and Q2 is the sum of squares due to regression.
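The identity SSE* = SSE + (β̂ − β0)′S(β̂ − β0) is easy to confirm numerically; a minimal sketch assuming numpy (the simulated design and coefficient values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 30, 3
X = rng.standard_normal((n, p))
beta0 = np.zeros(p)                              # hypothesised value
Y = X @ (beta0 + 0.5) + rng.standard_normal(n)   # true beta differs from beta0

S = X.T @ X
beta_hat = np.linalg.solve(S, X.T @ Y)           # least-squares estimate

SSE  = (Y - X @ beta_hat) @ (Y - X @ beta_hat)   # Q1: unrestricted minimum
SSEr = (Y - X @ beta0) @ (Y - X @ beta0)         # Q1 + Q2: restricted to beta0
Q2   = (beta_hat - beta0) @ S @ (beta_hat - beta0)
print(np.isclose(SSEr, SSE + Q2))                # SSE* = SSE + (b-b0)'S(b-b0)
```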
Again, the restricted (conditional) error sum of squares SSE* can be written as
SSE* = Q1 + Q2 = (Y − Xβ0)′(Y − Xβ0)
= (Y − Xβ0)′(I − XS⁻¹X′)(Y − Xβ0) + (Y − Xβ0)′XS⁻¹X′(Y − Xβ0).
∴ Q1 = (Y − Xβ0)′(I − XS⁻¹X′)(Y − Xβ0)
and Q2 = (Y − Xβ0)′XS⁻¹X′(Y − Xβ0).
To show that Q1 and Q2 are independent χ² variates, let
Z = Y − Xβ0 = Xβ + e − Xβ0 = X(β − β0) + e.
Since Y ~ N(Xβ, σ²I),
∴ Z ~ N(X(β − β0), σ²I).
Consequently,
SSE* = Q1 + Q2 = Z′Z, where Q1 = Z′(I − XS⁻¹X′)Z and Q2 = Z′XS⁻¹X′Z.
Now, Q1 is a quadratic form in Z, and the matrix associated with it, (I − XS⁻¹X′), is an idempotent matrix with rank (n − p); therefore Q1/σ² is distributed as χ² with (n − p) d.f. and non-centrality parameter

λ1 = (β − β0)′X′(I − XS⁻¹X′)X(β − β0) / (2σ²).
[Since (i) if Y ~ N(0, I), then Y′Y ~ χ²ₙ; (ii) if Y ~ N(0, σ²I), then Y′Y/σ² ~ χ²ₙ; and (iii) if Y ~ N(μ, σ²I), then Y′Y/σ² ~ χ²ₙ(λ), i.e. a non-central χ² variate with n d.f. and non-centrality parameter λ = μ′μ/(2σ²).]
Now, X′(I − XS⁻¹X′)X = X′X − X′XS⁻¹X′X = X′X − X′X = 0, and hence λ1 = 0. Therefore Q1/σ² is a central χ² with (n − p) d.f.
Also, Q2 is a quadratic form in Z, and the matrix associated with it, XS⁻¹X′, is an idempotent matrix with rank p.
∴ Q2/σ² is a non-central χ² with p degrees of freedom and non-centrality parameter

λ2 = (β − β0)′X′XS⁻¹X′X(β − β0) / (2σ²)
   = (β − β0)′S(β − β0) / (2σ²).
Further, since the matrices associated with the quadratic forms Q1 and Q2 are idempotent and their product (I − XS⁻¹X′)XS⁻¹X′ = 0, Q1 and Q2 are independently distributed (Theorem 3), the distribution of Q1/σ² being a central χ² with (n − p) d.f. and that of Q2/σ² being a non-central χ² with p d.f. and non-centrality parameter λ2 = (β − β0)′S(β − β0)/(2σ²). Hence

u = [(Q2/σ²)/p] / [(Q1/σ²)/(n − p)] = (Q2/p) / (Q1/(n − p)) = (n − p)Q2 / (pQ1)

is distributed as a non-central F with p and n − p d.f. and non-centrality parameter λ = (β − β0)′S(β − β0)/(2σ²). Hence the theorem.
If we assume that our null hypothesis H0: β = β0 is true, then the non-centrality parameter λ is zero and consequently u = (n − p)Q2/(pQ1) is distributed as a central F with p and n − p d.f.
Analysis of Variance Table
Sources of Degrees of Sum of Mean sum of Variance ratio
variation freedom squares squares
Regression p SSR=Q 2 MSR=SSR/p MSR
F=
Error n-p SSE=Q 1 MSE =SSE/(n-p) MSE
Total n TSS=Q 1 +Q 2
MSE gives an unbiased estimate of σ² (unknown) irrespective of whether or not the null hypothesis is true.
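A compact numerical sketch of this test (my own illustration, assuming numpy and scipy; the model, β0 and data below are invented) computes Q1, Q2 and the statistic u = (n − p)Q2/(pQ1), and refers it to the central F(p, n − p) distribution under H0:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, p, sigma = 40, 3, 1.0
X = rng.standard_normal((n, p))
beta0 = np.array([1.0, -0.5, 2.0])               # hypothesised beta
Y = X @ beta0 + sigma * rng.standard_normal(n)   # H0 is true in this simulation

S = X.T @ X
beta_hat = np.linalg.solve(S, X.T @ Y)
Q1 = (Y - X @ beta_hat) @ (Y - X @ beta_hat)      # SSE, (n - p) d.f.
Q2 = (beta_hat - beta0) @ S @ (beta_hat - beta0)  # regression SS, p d.f.

u = (Q2 / p) / (Q1 / (n - p))                     # = (n - p) Q2 / (p Q1)
print(u, stats.f.sf(u, p, n - p))                 # p-value from central F(p, n-p)
```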
In the case when X is not a full-rank matrix, i.e. r(X) = k < p, we instead estimate an estimable function ψ = c′β.
The null hypothesis here is H0: β = 0. Then the residual sum of squares under H0 is
SSE* = Y′Y = Y′(I − XGX′)Y + Y′XGX′Y = Q1 + Q2,
where Q1 = SSE and G is a generalized inverse of X′X.
Now, since Q1 is a quadratic form and the matrix associated with it, (I − XGX′), is an idempotent matrix with rank n − k, Q1/σ² is a non-central χ² with n − k d.f. and non-centrality parameter

λ1 = β′X′(I − XGX′)Xβ / (2σ²) = β′(X′ − X′XGX′)Xβ / (2σ²) = 0 [∵ X′XGX′ = X′],
i.e. Q1/σ² is a central χ² with n − k d.f. Similarly, Q2/σ² is a non-central χ² with k d.f. and non-centrality parameter

λ2 = β′X′(XGX′)Xβ / (2σ²) = β′X′Xβ / (2σ²) [∵ X′XGX′X = X′X].
Further, since the matrices associated with Q1 and Q2 are idempotent and (I − XGX′)XGX′ = 0, Q1/σ² and Q2/σ² are independently distributed (Theorem 3).
Hence,

u = (Q2/k) / (Q1/(n − k))

is a non-central F with k and n − k d.f., and the non-centrality parameter is given by

λ2 = β′X′Xβ / (2σ²).
If we assume that our null hypothesis β = 0 is true, then u is distributed as a
central F with k and n – k d.f.
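For the non-full-rank case, a final sketch (my own, assuming numpy and scipy, with G taken as the Moore–Penrose inverse of X′X, which is one valid choice of generalized inverse; the rank-deficient design is invented):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 30
# Rank-deficient design: third column is the sum of the first two (k = 2 < p = 3).
Z = rng.standard_normal((n, 2))
X = np.column_stack([Z, Z.sum(axis=1)])
k = np.linalg.matrix_rank(X)

beta = np.zeros(3)                         # H0: beta = 0 is true here
Y = X @ beta + rng.standard_normal(n)

G = np.linalg.pinv(X.T @ X)                # Moore-Penrose g-inverse of X'X
H = X @ G @ X.T                            # XGX': idempotent, rank k
Q2 = Y @ H @ Y                             # Y'XGX'Y, k d.f.
Q1 = Y @ Y - Q2                            # SSE = Y'(I - XGX')Y, (n - k) d.f.

u = (Q2 / k) / (Q1 / (n - k))
print(u, stats.f.sf(u, k, n - k))          # central F(k, n-k) under H0
```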