Multivariate Distribution Theory
By
Dr. Richard Tuyiragize
School of Statistics and Planning
Makerere University
March 1, 2022
1 The multivariate pdf
The probability density function (pdf) of a p-variate random vector, denoted as $X' = (x_1, x_2, \cdots, x_p)$, is defined as the joint multivariate pdf of its component random variables. We write $f(x_1, x_2, \cdots, x_p)$, or $f(X)$, for the values it assumes at the values taken by the p random components.
In order to analyze the multivariate pdf efficiently, the random vector is divided into groups or blocks called partitions. Partitioning can be considered simply as a data-management tool which helps to organize the transfer of information between the main matrix and the sub-matrices, because the basic manipulations can be applied to the smaller blocks just as to the main matrix.
Marginal distribution
A partition may consist of one or more component random variables. For two partitions,
consider;
$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_r \\ x_{r+1} \\ \vdots \\ x_p \end{pmatrix} \implies X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$$
The marginal pdf of a partition $X_1$ is defined as the joint marginal pdf of its component random variables, i.e. $x_1, x_2, \cdots, x_r$, such that
$$X_1 \sim f(X_1) = f(x_1, x_2, \cdots, x_r)$$
Thus;
$$f(X_1) = \int_{x_{r+1}} \int_{x_{r+2}} \cdots \int_{x_p} f(X)\, dx_{r+1}\, dx_{r+2} \cdots dx_p$$
That is, the original pdf of the p-variate random vector after integrating out the undesired partition.
The marginal pdf of the partition $X_2$;
$$f(X_2) = \int_{x_1} \int_{x_2} \cdots \int_{x_r} f(X)\, dx_1\, dx_2 \cdots dx_r$$
Conditional distribution
For two random variables, X and Y with a joint pdf of f (XY ), we define the conditional
distribution of Y /X as,
$$f(Y/X) = \frac{f(XY)}{f(X)}, \quad \text{where } f(X) \text{ is the marginal pdf of } X$$
If random variables $x_1, x_2, \cdots, x_5$ have a joint multivariate pdf $f(x_1 x_2 x_3 x_4 x_5)$, the joint conditional distribution of $x_1, x_2$ and $x_5$ given $x_3$ and $x_4$ is
$$f(x_1 x_2 x_5 / x_3 x_4) = \frac{f(x_1 x_2 x_3 x_4 x_5)}{f(x_3 x_4)}$$
Question
Random variables w, x, y and z have a joint pdf as;
$$f(wxyz) = \begin{cases} 16wxyz, & 0 \le w, x, y, z \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
1. Obtain the joint marginal distribution for w and z
2. Hence, obtain the conditional distribution of x and y given w and z
Solution
$$f(wz) = \int_0^1 \int_0^1 16wxyz \, dx\, dy = 16wz \int_0^1 \int_0^1 xy \, dx\, dy$$
$$f(wz) = 16wz \int_0^1 y \left[\tfrac{1}{2}x^2\right]_0^1 dy = 8wz \left[\tfrac{1}{2}y^2\right]_0^1 = 4wz$$
$$f(w, z) = \begin{cases} 4wz, & 0 \le w, z \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
Similarly,
$$f(xy/wz) = \frac{f(wxyz)}{f(wz)} = \frac{16wxyz}{4wz} = 4xy$$
$$f(xy/wz) = \begin{cases} 4xy, & 0 \le x, y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$
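As a quick numerical check of the marginalisation above (a minimal sketch assuming NumPy and SciPy are available; the function names are illustrative only), the code below integrates the joint pdf over x and y at a few (w, z) points and compares the result with 4wz.

```python
import numpy as np
from scipy import integrate

def joint_pdf(w, x, y, z):
    # joint pdf f(w, x, y, z) = 16wxyz on the unit hypercube
    return 16.0 * w * x * y * z

def marginal_wz(w, z):
    # integrate out x and y over [0, 1] x [0, 1]
    value, _ = integrate.dblquad(lambda y, x: joint_pdf(w, x, y, z),
                                 0.0, 1.0, 0.0, 1.0)
    return value

for w, z in [(0.2, 0.5), (0.7, 0.9), (1.0, 1.0)]:
    print(w, z, marginal_wz(w, z), 4.0 * w * z)  # the last two columns agree
```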
Stochastic independence
For a p-variate random vector $X' = (x_1, x_2, \cdots, x_p)$ with a pdf $f(X)$, the component random variables are said to be stochastically independent if;
$$f(X) = f(x_1) \cdot f(x_2) \cdot f(x_3) \cdots f(x_p)$$
where $f(x_i)$ is the marginal pdf of $x_i$
Question
Random variables x, y, z have a joint pdf
$$f(xyz) = \begin{cases} e^{-(x+y+z)}, & x, y, z \ge 0 \\ 0, & \text{elsewhere} \end{cases}$$
Show that the three variables are independent.
Solution
Let f (x), f (y) and f (z) be the marginal pdf of x, y, z respectively.
Required is to show that f (xyz) = f (x).f (y).f (z)
$$f(x) = \int_y \int_z f(xyz)\, dy\, dz = \int_0^\infty \int_0^\infty e^{-(x+y+z)}\, dy\, dz = e^{-x}\left[-e^{-y}\right]_0^\infty \left[-e^{-z}\right]_0^\infty = e^{-x}$$
Thus;
$$f(x) = e^{-x}, \quad \text{for } x > 0$$
$$f(y) = e^{-y}, \quad \text{for } y > 0$$
$$f(z) = e^{-z}, \quad \text{for } z > 0$$
$$\therefore f(x) \cdot f(y) \cdot f(z) = e^{-x} \cdot e^{-y} \cdot e^{-z} = e^{-(x+y+z)} = f(xyz)$$
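The factorisation can also be checked symbolically (a minimal sketch, assuming SymPy is available): each marginal is obtained by integrating the other two variables out over $(0, \infty)$, and the product of the marginals is compared with the joint pdf.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
joint = sp.exp(-(x + y + z))           # joint pdf on x, y, z >= 0

# marginals: integrate out the other two variables over (0, oo)
fx = sp.integrate(joint, (y, 0, sp.oo), (z, 0, sp.oo))
fy = sp.integrate(joint, (x, 0, sp.oo), (z, 0, sp.oo))
fz = sp.integrate(joint, (x, 0, sp.oo), (y, 0, sp.oo))

print(fx, fy, fz)                                  # exp(-x) exp(-y) exp(-z)
print(sp.simplify(fx * fy * fz - joint) == 0)      # True: product equals joint
```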
2 Multivariate normal distribution
In univariate statistics, we usually consider the random variable X of interest as being normal with parameters $\mu$ and $\sigma^2$, so that $X \sim N(\mu, \sigma^2)$ with pdf
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty$$
Consider the components of a p-variate random vector $X' = (x_1, x_2, \cdots, x_p)$ as being independently distributed univariate normal random variables. Then $x_j \sim N(\mu_j, \sigma_j^2)$ such that
$$f(x_j) = \frac{1}{\sigma_j\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_j-\mu_j}{\sigma_j}\right)^2}, \qquad -\infty < x_j < \infty$$
Thus, for independently distributed random variables;
$$f(x) = f(x_1) \cdot f(x_2) \cdots f(x_p) = \prod_{j=1}^{p} f(x_j), \qquad j = 1, 2, 3, \cdots, p$$
where $f(x_j)$ is the marginal distribution of $x_j$, such that,
$$f(x) = \frac{1}{\sigma_1\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2} \cdot \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2} \cdots \frac{1}{\sigma_p\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x_p-\mu_p}{\sigma_p}\right)^2}$$
$$f(x) = \frac{1}{(2\pi)^{\frac{p}{2}} \prod_{j=1}^{p}\sigma_j}\, e^{-\frac{1}{2}\sum_{j=1}^{p}\left(\frac{x_j-\mu_j}{\sigma_j}\right)^2}$$
Recall that we assumed the component random variables $x_j$ to be independently distributed. This implies that their population variance-covariance matrix is diagonal, $\Sigma = \mathrm{Diag}_p(\sigma_j^2)$:
$$\Sigma = \begin{pmatrix} \sigma_{11} & 0 & \cdots & 0 \\ 0 & \sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{pp} \end{pmatrix}$$
Let
$$|\Sigma| = \sigma_1^2 \cdot \sigma_2^2 \cdots \sigma_p^2 = \prod_{j=1}^{p} \sigma_j^2 \quad \Longrightarrow \quad |\Sigma|^{\frac{1}{2}} = \prod_{j=1}^{p} \sigma_j$$
Then,
$$f(x) = \frac{1}{(2\pi)^{\frac{p}{2}} |\Sigma|^{\frac{1}{2}}}\, e^{-\frac{1}{2}\sum_{j=1}^{p}\left(\frac{x_j-\mu_j}{\sigma_j}\right)^2}$$
But also,
$$\sum_{j=1}^{p}\left(\frac{x_j-\mu_j}{\sigma_j}\right)^2 = \left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 + \cdots + \left(\frac{x_p-\mu_p}{\sigma_p}\right)^2$$
$$\Rightarrow \left(\frac{x_1-\mu_1}{\sigma_1} \;\; \frac{x_2-\mu_2}{\sigma_2} \;\; \cdots \;\; \frac{x_p-\mu_p}{\sigma_p}\right)\begin{pmatrix} \frac{x_1-\mu_1}{\sigma_1} \\ \frac{x_2-\mu_2}{\sigma_2} \\ \vdots \\ \frac{x_p-\mu_p}{\sigma_p} \end{pmatrix}$$
$$\Rightarrow \left(x_1-\mu_1 \;\; x_2-\mu_2 \;\; \cdots \;\; x_p-\mu_p\right)\begin{pmatrix} \frac{1}{\sigma_1^2} & 0 & \cdots & 0 \\ 0 & \frac{1}{\sigma_2^2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{\sigma_p^2} \end{pmatrix}\begin{pmatrix} x_1-\mu_1 \\ x_2-\mu_2 \\ \vdots \\ x_p-\mu_p \end{pmatrix}$$
$$\Rightarrow (x-\mu)'\, \Sigma^{-1}\, (x-\mu)$$
$$\therefore f(x) = \frac{1}{(2\pi)^{\frac{p}{2}} |\Sigma|^{\frac{1}{2}}}\, e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)} \quad \Rightarrow \quad X \sim N_p(\mu, \Sigma_{p \times p})$$
Note that if $X \sim N_p(0, I_p)$, i.e. $\mu = 0$ and $\Sigma = I_p$, then $X$ is the standard multivariate normal random vector, denoted as $Z$.
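For independent components, the multivariate normal density is exactly the product of the univariate normal densities, as derived above. The sketch below checks this numerically for an illustrative p = 3 case (assuming SciPy is available; the parameter values are made up for the check).

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# illustrative parameters for p = 3 independent components
mu = np.array([0.0, 2.0, -1.0])
sigma = np.array([1.0, 3.0, 0.5])            # standard deviations sigma_j
Sigma = np.diag(sigma**2)                    # diagonal variance-covariance matrix

x = np.array([0.3, 1.0, -0.2])               # an arbitrary evaluation point

# product of the univariate marginal densities
product_of_marginals = np.prod(norm.pdf(x, loc=mu, scale=sigma))

# joint multivariate normal density with the same mu and diagonal Sigma
joint_density = multivariate_normal.pdf(x, mean=mu, cov=Sigma)

print(product_of_marginals, joint_density)   # the two values coincide
```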
Properties of a multivariate normal random vector
1. Distribution of linear compounds
Let $X \sim N_p(\mu, \Sigma)$. If we define q linear combinations of the components of $X$ such that
$$Y_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p = \sum_{j=1}^{p} a_{1j}X_j = a_1'X$$
$$Y_2 = a_{21}X_1 + a_{22}X_2 + \cdots + a_{2p}X_p = \sum_{j=1}^{p} a_{2j}X_j = a_2'X$$
$$\vdots$$
$$Y_q = a_{q1}X_1 + a_{q2}X_2 + \cdots + a_{qp}X_p = \sum_{j=1}^{p} a_{qj}X_j = a_q'X$$
In matrix form:
$$\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_q \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ a_{21} & a_{22} & \cdots & a_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ a_{q1} & a_{q2} & \cdots & a_{qp} \end{pmatrix}\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix} \iff Y = AX$$
The new random vector $Y$, which is a vector of linear combinations of the components of vector $X$, is called a linear compound of the original random vector $X$.
The linear compound $Y$ has a q-variate distribution given by $Y \sim N_q(A\mu, A\Sigma A')$, where $\mu$ and $\Sigma$ are the parameters of $X \sim N_p(\mu, \Sigma)$.
Proof
For the mean vector of $Y$, by definition $\mu_Y = E(Y)$;
$$\mu_Y = E(AX), \quad \text{since } Y = AX$$
$$= AE(X) = A\mu$$
Variance-covariance matrix, $\Sigma_Y = E[Y - E(Y)][Y - E(Y)]'$
$$\Sigma_Y = E[AX - A\mu][AX - A\mu]'$$
$$= E[A(X - \mu)][A(X - \mu)]'$$
Recall that for two matrices $A$ and $B$, $(AB)' = B'A'$
$$\rightarrow \Sigma_Y = E[A(X - \mu)][(X - \mu)'A']$$
$$= AE[(X - \mu)(X - \mu)']A' = A\Sigma_X A'$$
Example
Given
$$X \sim N_2\left(\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 9 \end{pmatrix}\right)$$
Find the joint distribution of $Y_1 = 2X_1 - X_2$ and $Y_2 = 3X_2$.
Solution
$$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \rightarrow Y = AX$$
Since $Y \sim N(A\mu, A\Sigma A')$,
$$\mu_Y = A\mu = \begin{pmatrix} 2 & -1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 3 \end{pmatrix}$$
Also,
$$\Sigma_Y = A\Sigma A' = \begin{pmatrix} 2 & -1 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 9 \end{pmatrix}\begin{pmatrix} 2 & 0 \\ -1 & 3 \end{pmatrix} = \begin{pmatrix} 13 & -27 \\ -27 & 81 \end{pmatrix}$$
Then,
$$Y \sim N_2\left(\begin{pmatrix} -1 \\ 3 \end{pmatrix}, \begin{pmatrix} 13 & -27 \\ -27 & 81 \end{pmatrix}\right)$$
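The matrix arithmetic above is easy to verify numerically, and a large simulation recovers approximately the same mean vector and variance-covariance matrix. A minimal sketch, assuming NumPy is available:

```python
import numpy as np

# parameters of X ~ N2(mu, Sigma) from the example
mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.0],
                  [0.0, 9.0]])

# coefficient matrix of the linear compound Y = AX
A = np.array([[2.0, -1.0],
              [0.0,  3.0]])

mu_Y = A @ mu                   # expected: [-1, 3]
Sigma_Y = A @ Sigma @ A.T       # expected: [[13, -27], [-27, 81]]
print(mu_Y, Sigma_Y, sep="\n")

# simulation check: sample mean and covariance of Y should be close
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = X @ A.T
print(Y.mean(axis=0), np.cov(Y, rowvar=False), sep="\n")
```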
2. Distribution of a simple linear combination
$X \sim N_p(\mu, \Sigma)$, then $Y \sim N(A\mu, A\Sigma A')$.
From property 1, if $q = 1$, then $A$ reduces to a $1 \times p$ simple row vector $a' = (a_1, a_2, \cdots, a_p)$ and $Y = a'X$ is distributed as $N_1(a'\mu, a'\Sigma a)$,
where $a'X = a_1X_1 + a_2X_2 + \cdots + a_pX_p$,
i.e. $Y$ is a linear function of the components $X_1, X_2, \cdots, X_p$ and $a'$ is an arbitrary vector of constants.
3. Distribution of individual components of linear combination
If $a' = (1, 0, \cdots, 0)$, then
$$a'X = (1, 0, \cdots, 0)\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix} = X_1$$
Likewise, $a'\mu = \mu_1$, and
$$a'\Sigma a = (1, 0, \cdots, 0)\begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \sigma_{11}$$
Then, $X_1 \sim N_{q=1}(\mu_1, \sigma_{11})$
Similarly, if $a' = (0, 1, \cdots, 0)$, then
$$a'X = X_2, \qquad a'\mu = \mu_2, \qquad a'\Sigma a = \sigma_{22}$$
Generalizations
Each component $X_j$ of the multivariate random vector $X$ has a univariate normal distribution, $X_j \sim N(\mu_j, \sigma_{jj})$.
The joint marginal distribution of any two components $X_j$ and $X_k$ is a bivariate normal distribution,
$$\begin{pmatrix} X_j \\ X_k \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_j \\ \mu_k \end{pmatrix}, \begin{pmatrix} \sigma_{jj} & \sigma_{jk} \\ \sigma_{kj} & \sigma_{kk} \end{pmatrix}\right)$$
For three components $X_j$, $X_k$ and $X_l$,
$$\begin{pmatrix} X_j \\ X_k \\ X_l \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_j \\ \mu_k \\ \mu_l \end{pmatrix}, \begin{pmatrix} \sigma_{jj} & \sigma_{jk} & \sigma_{jl} \\ \sigma_{kj} & \sigma_{kk} & \sigma_{kl} \\ \sigma_{lj} & \sigma_{lk} & \sigma_{ll} \end{pmatrix}\right)$$
Example
Given,
$$X \sim N_4\left(\begin{pmatrix} 0 \\ 2 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 & 2 & 1 & 4 \\ 2 & 5 & 2 & 0 \\ 1 & 2 & 3 & 1 \\ 4 & 0 & 1 & 2 \end{pmatrix}\right)$$
1. State the distribution of X3
2. Then, find the joint distribution of $X_2$ and $X_4$
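A marginal of a multivariate normal is obtained by picking out the corresponding entries of $\mu$ and $\Sigma$. As a rough sketch of how the two answers can be read off with NumPy indexing (the entries restate the example above, with $\sigma_{32}$ taken equal to $\sigma_{23} = 2$ by symmetry of $\Sigma$):

```python
import numpy as np

mu = np.array([0.0, 2.0, 1.0, 1.0])
Sigma = np.array([[3.0, 2.0, 1.0, 4.0],
                  [2.0, 5.0, 2.0, 0.0],
                  [1.0, 2.0, 3.0, 1.0],
                  [4.0, 0.0, 1.0, 2.0]])

# 1. distribution of X3: pick index 2 (zero-based)
print(mu[2], Sigma[2, 2])            # X3 ~ N(1, 3)

# 2. joint distribution of X2 and X4: pick indices 1 and 3
idx = [1, 3]
print(mu[idx])                       # mean vector (2, 1)'
print(Sigma[np.ix_(idx, idx)])       # covariance [[5, 0], [0, 2]]
```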
Partitioning multivariate normal random vector
Let X ∼ Np (µ, Σ). If we correspondingly partition X, µ and Σ as
$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_r \\ x_{r+1} \\ \vdots \\ x_p \end{pmatrix} \implies X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \qquad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_r \\ \mu_{r+1} \\ \vdots \\ \mu_p \end{pmatrix} \implies \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$$
$$\Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1r} & \sigma_{1\,r+1} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{r1} & \cdots & \sigma_{rr} & \sigma_{r\,r+1} & \cdots & \sigma_{rp} \\ \sigma_{r+1\,1} & \cdots & \sigma_{r+1\,r} & \sigma_{r+1\,r+1} & \cdots & \sigma_{r+1\,p} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pr} & \sigma_{p\,r+1} & \cdots & \sigma_{pp} \end{pmatrix} = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$
Marginal distribution of a partition
The first partition $X_1$, of order $r \times 1$, has the distribution
$$X_1 \sim N_r(\mu_1, \Sigma_{11})$$
The second partition $X_2$, of order $(p-r) \times 1$, has the distribution
$$X_2 \sim N_{p-r}(\mu_2, \Sigma_{22})$$
Conditional distribution of a partition
The conditional distribution of partition $X_1$ given that $X_2 = x_2$ is
$$X_1/X_2 = x_2 \sim N_r(\mu_{1.2}, \Sigma_{11.2})$$
where
$$\mu_{1.2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2)$$
$$\Sigma_{11.2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$$
Note that $\mu_{1.2}$ and $\Sigma_{11.2}$ are respectively called the conditional mean vector and conditional variance-covariance matrix of $X_1/X_2 = x_2$.
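These two formulas translate directly into code. The sketch below (assuming NumPy is available; the function name conditional_mvn is illustrative only) computes $\mu_{1.2}$ and $\Sigma_{11.2}$ for a vector partitioned after its first r components.

```python
import numpy as np

def conditional_mvn(mu, Sigma, r, x2):
    """Conditional distribution of X1 = X[:r] given X2 = X[r:] = x2.

    Returns the conditional mean mu_1.2 and the conditional
    variance-covariance matrix Sigma_11.2 for X ~ N_p(mu, Sigma).
    """
    mu1, mu2 = mu[:r], mu[r:]
    S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
    S21, S22 = Sigma[r:, :r], Sigma[r:, r:]
    S22_inv = np.linalg.inv(S22)
    mu_cond = mu1 + S12 @ S22_inv @ (x2 - mu2)
    Sigma_cond = S11 - S12 @ S22_inv @ S21
    return mu_cond, Sigma_cond
```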
Independence of partitions
For the bivariate normal case, two random variables $X_1$ and $X_2$ are said to be independent if their covariance is zero, i.e.
$$\mathrm{Cov}(X_1, X_2) = 0 \quad \text{or} \quad E(X_1 - \mu_1)(X_2 - \mu_2) = 0$$
Generally
The partitions $X_1$ and $X_2$ of a multivariate normal random vector $X$ are said to be independent if their covariance is zero,
$$\Sigma_{(p-r)\times r} = \Sigma_{r\times(p-r)} = 0, \qquad E(X_1 - \mu_1)(X_2 - \mu_2)' = 0$$
Example
Given,
$$X \sim N_3\left(\begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 4 \end{pmatrix}\right)$$
1. State the joint distribution of X1 and X2
2. State the distribution of X3
3. Find the conditional distribution of X1 and X2 given X3 = x3
4. Determine whether X1 and X2 are jointly independent of X3
Solution
Partition X, µ and Σ appropriately as;
$$X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_3\left(\begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 4 \end{pmatrix}\right)$$
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}\right)$$
1. The marginal distribution of partition X 1 ,
$$X_1 \sim N(\mu_1, \Sigma_{11})$$
$$\therefore \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \sim N_2\left(\begin{pmatrix} 0 \\ 2 \end{pmatrix}, \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix}\right)$$
2. The distribution of X3: from $X_2 \sim N_{p-r}(\mu_2, \Sigma_{22})$ with $p - r = 1$, $X_3 \sim N_1(1, 4)$.
3. The conditional distribution of the partition, $X_1/X_2 = x_2$;
$$X_1/X_2 = x_2 \sim N_r(\mu_{1.2}, \Sigma_{11.2})$$
where
$$\mu_{1.2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2) = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \end{pmatrix} 4^{-1}(x_3 - 1) = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$$
$$\Sigma_{11.2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} = \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix} - \begin{pmatrix} 0 \\ 0 \end{pmatrix} 4^{-1}\begin{pmatrix} 0 & 0 \end{pmatrix} = \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix}$$
4. Independence
$$\Sigma_{(p-r)\times r} = \Sigma_{r\times(p-r)} = 0; \qquad \Sigma_{12} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \text{ and } \Sigma_{21} = \begin{pmatrix} 0 & 0 \end{pmatrix}$$
$\therefore$ $X_1$ and $X_2$ are jointly independent of $X_3$.
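As a quick numerical check of parts 3 and 4 (a minimal sketch assuming NumPy; it mirrors the conditional_mvn sketch given earlier):

```python
import numpy as np

mu = np.array([0.0, 2.0, 1.0])
Sigma = np.array([[3.0, 2.0, 0.0],
                  [2.0, 1.0, 0.0],
                  [0.0, 0.0, 4.0]])
r = 2                                # X1 = (x1, x2)', X2 = (x3)

mu1, mu2 = mu[:r], mu[r:]
S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
S21, S22 = Sigma[r:, :r], Sigma[r:, r:]

x3 = 0.7                             # any observed value of x3
mu_cond = mu1 + S12 @ np.linalg.inv(S22) @ (np.array([x3]) - mu2)
Sigma_cond = S11 - S12 @ np.linalg.inv(S22) @ S21

print(mu_cond)                       # (0, 2)', unchanged since Sigma_12 = 0
print(Sigma_cond)                    # [[3, 2], [2, 1]], equal to Sigma_11
print(np.all(S12 == 0))              # True: the partitions are independent
```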
Question
Let,
$$X \sim N_4\left(\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 2 & 4 \end{pmatrix}\right)$$
By way of partitioning or otherwise, state or find;
1. The marginal distribution of $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$
2. The conditional distribution of $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ given $\begin{pmatrix} X_3 \\ X_4 \end{pmatrix} = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix}$
3. Determine whether $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ and $\begin{pmatrix} X_3 \\ X_4 \end{pmatrix}$ are independent.
Answers
1. $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_2\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}\right)$
2. $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ given $\begin{pmatrix} X_3 \\ X_4 \end{pmatrix} = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \sim N_2\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}\right)$
3. $\Sigma_{12} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and $\Sigma_{21} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, $\therefore \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$ and $\begin{pmatrix} X_3 \\ X_4 \end{pmatrix}$ are independent.
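The block algebra behind these answers can be checked with the same partitioning steps as before (a minimal sketch assuming NumPy, using the mean vector and covariance matrix exactly as stated in the question):

```python
import numpy as np

mu = np.zeros(4)
Sigma = np.array([[1.0, 2.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 3.0],
                  [0.0, 0.0, 2.0, 4.0]])

S11, S12 = Sigma[:2, :2], Sigma[:2, 2:]
S21, S22 = Sigma[2:, :2], Sigma[2:, 2:]

# 1. marginal of (X1, X2)': simply mu[:2] and Sigma_11
print(mu[:2], S11, sep="\n")

# 2. conditional of (X1, X2)' given (X3, X4)' = (x3, x4)'
x2 = np.array([0.5, -1.0])                    # arbitrary observed values
mu_cond = mu[:2] + S12 @ np.linalg.inv(S22) @ (x2 - mu[2:])
Sigma_cond = S11 - S12 @ np.linalg.inv(S22) @ S21
print(mu_cond, Sigma_cond, sep="\n")          # identical to the marginal

# 3. independence: the off-diagonal blocks are zero
print(np.all(S12 == 0) and np.all(S21 == 0))  # True
```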