The Exact Sampling Distributions
Here, we study certain distributions that arise in sampling from a normal, i.e., N(μ, σ²), population.
The entire topic is divided into 5 sub-divisions:
1. Objectives
2. Introduction
3. Types of Exact Sampling distributions
4. Properties of Sampling Distributions
5. Summary
1. Objectives:
i. To introduce the concept of sampling distributions.
ii. To explain how sampling distributions are used to conduct tests of significance.
2. Introduction: Any probability distribution, and therefore any sampling distribution, can
be partially described by its mean and standard deviation. When the sample size n is
small (say, below 30) and the population standard deviation σ is unknown, the relevant
sample statistics no longer follow a normal distribution; instead, they follow other exact
sampling distributions such as the chi-square distribution, the t-distribution, and the
F-distribution.
3. Types of Exact Sampling Distributions:
i. Chi-square distribution
ii. Student’s t-distribution
iii. Snedecor’s F-distribution
3.1. Chi-square distribution:
Definition: “The square of a standard normal variate is called a chi-square variate with 1 degree
of freedom”.
That is, if X ~ N(μ, σ²) and Z = (X − μ)/σ ~ N(0, 1), then Z² ~ χ² with 1 d.f.
Thus, in general, if X₁, X₂, …, Xₙ are n independent random variables, with Xᵢ ~ N(μᵢ, σᵢ²)
for i = 1, 2, …, n, then the random variable χ² defined by

\[
\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2
\]

is said to follow the chi-square distribution with n d.f., and its p.d.f. is

\[
f(\chi^2) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, e^{-\chi^2/2}\, (\chi^2)^{n/2 - 1}, \qquad 0 \le \chi^2 < \infty.
\]

Symbolically, χ² ~ χ²(n).
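As a quick numerical check (a minimal sketch assuming NumPy and SciPy are available; the sample size, seed, and n = 7 are arbitrary choices), we can verify that the square of a standard normal variate behaves like χ²(1), and that a sum of n independent squared standard normals behaves like χ²(n):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Square of a standard normal variate vs. chi-square with 1 d.f.
z = rng.standard_normal(100_000)
print(stats.kstest(z**2, stats.chi2(df=1).cdf))  # large p-value expected

# Sum of n independent squared standard normals vs. chi-square with n d.f.
n = 7
z = rng.standard_normal((100_000, n))
print(stats.kstest((z**2).sum(axis=1), stats.chi2(df=n).cdf))
```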
3.1.a. Chi-square probability curve:
The chi-square probability curve is drawn for various values of the degrees of freedom n.
For n = 1 and 2, the curve is a decreasing function of χ², so the distribution is severely
skewed to the right. For n ≥ 3, the curve first increases, attains its maximum at χ² = n − 2,
and then decreases, and as n grows the curves rapidly become more symmetrical.
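A short plotting sketch reproduces this behaviour (assuming matplotlib and SciPy are available; the chosen degrees of freedom are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

x = np.linspace(0.01, 15, 500)
for n in (1, 2, 3, 5, 10):
    # decreasing for n = 1, 2; mode at n - 2 for n >= 3
    plt.plot(x, stats.chi2(df=n).pdf(x), label=f"n = {n}")
plt.xlabel(r"$\chi^2$")
plt.ylabel("density")
plt.title("Chi-square probability curves")
plt.legend()
plt.show()
```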
3.1.b. Derivation of the p.d.f. of χ²:
Here, we derive the distribution of χ² using the MGF method.
Proof: We are given that the Xᵢ (i = 1, 2, …, n) are independent N(μᵢ, σᵢ²) variates, and we
want the distribution of

\[
\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2 = \sum_{i=1}^{n} Z_i^2,
\]

where Zᵢ = (Xᵢ − μᵢ)/σᵢ ~ N(0, 1).
Since the Xᵢ are independent, the Zᵢ are also independent, and the Zᵢ² are identically
distributed. Therefore,

\[
M_{\chi^2}(t) = M_{\sum Z_i^2}(t) = \prod_{i=1}^{n} M_{Z_i^2}(t) = \left[M_{Z_i^2}(t)\right]^n, \qquad (1)
\]
where

\[
M_{Z_i^2}(t) = E\left[e^{tZ_i^2}\right]
= \int_{-\infty}^{\infty} e^{t z^2}\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(1 - 2t) z^2}\, dz
= (1 - 2t)^{-1/2},
\]

because \(\int_{-\infty}^{\infty} e^{-a x^2}\, dx = \sqrt{\pi/a}\), here with a = (1 − 2t)/2, valid for t < 1/2. (The integral in terms of xᵢ reduces to this form after the substitution zᵢ = (xᵢ − μᵢ)/σᵢ.)
Therefore, by equation (1), we have
\[
M_{\chi^2}(t) = (1 - 2t)^{-n/2},
\]

which is the MGF of a gamma variate with parameters 1/2 and n/2. Hence, by the uniqueness
theorem of MGFs,

\[
\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2
\]

is a gamma variate with parameters 1/2 and n/2.
Thus, we have the probability differential as
\[
dF(\chi^2) = \frac{(1/2)^{n/2}}{\Gamma(n/2)}\, e^{-\chi^2/2}\, (\chi^2)^{n/2 - 1}\, d\chi^2
= \frac{1}{2^{n/2}\,\Gamma(n/2)}\, e^{-\chi^2/2}\, (\chi^2)^{n/2 - 1}\, d\chi^2, \qquad 0 \le \chi^2 < \infty,
\]

which is the required probability differential of the chi-square distribution with n d.f.
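The derived density can be checked directly against a library implementation (a sketch; math.gamma supplies Γ(n/2), and the evaluation points and n = 5 are arbitrary):

```python
import math
from scipy import stats

def chi2_pdf(x, n):
    """p.d.f. derived above: e^{-x/2} x^{n/2-1} / (2^{n/2} Gamma(n/2))."""
    return math.exp(-x / 2) * x ** (n / 2 - 1) / (2 ** (n / 2) * math.gamma(n / 2))

for x in (0.5, 2.0, 7.3):
    print(chi2_pdf(x, n=5), stats.chi2(df=5).pdf(x))  # the two should agree
```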
4. Properties of chi-square distribution:
If the random variable X ~ χ² with n d.f., then
1. Mean = n and variance = 2n.
2. MGF = (1 − 2t)^(−n/2), provided |2t| < 1.
3. Mode = n − 2, for n > 2.
4. The sum of k independent chi-square variates with nᵢ d.f. (i = 1, 2, …, k) is a chi-square
variate with n = n₁ + n₂ + ⋯ + nₖ d.f.
5. If X ~ χ² with n d.f. and Y ~ χ² with m d.f. are independent,
then X/Y ~ β₂(n/2, m/2) and X/(X + Y) ~ β₁(n/2, m/2).
6. If X ~ U(0, 1), then Y = −2 log X follows a χ² distribution with 2 d.f.
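Several of these properties are easy to check by simulation (a minimal sketch assuming NumPy and SciPy; the sample size, seed, and n = 6 are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 6
x = rng.chisquare(n, size=200_000)
print(x.mean(), x.var())  # approximately n and 2n (properties 1)

# Property 6: if U ~ U(0, 1), then -2 log U ~ chi-square with 2 d.f.
u = rng.uniform(size=200_000)
print(stats.kstest(-2 * np.log(u), stats.chi2(df=2).cdf))
```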
4.i. Moment Generating Function (MGF) of the χ² distribution:
By the definition of the MGF of a random variable X,

\[
M_X(t) = E\left[e^{tX}\right] = \int_{-\infty}^{\infty} e^{tx} f(x)\, dx.
\]

Since X ~ χ² with n d.f., we have

\[
M_X(t) = \int_{0}^{\infty} e^{tx}\, \frac{1}{2^{n/2}\,\Gamma(n/2)}\, e^{-x/2}\, x^{n/2 - 1}\, dx
= \frac{1}{2^{n/2}\,\Gamma(n/2)} \int_{0}^{\infty} e^{-\frac{1}{2}(1 - 2t)x}\, x^{n/2 - 1}\, dx
\]
\[
= \frac{1}{2^{n/2}\,\Gamma(n/2)} \cdot \frac{\Gamma(n/2)}{[(1 - 2t)/2]^{n/2}}
\]
\[
M_X(t) = (1 - 2t)^{-n/2}, \qquad \text{provided } |2t| < 1, \qquad (1)
\]

which is the required MGF of a chi-square distribution.
Note: Using the binomial expansion for a negative index, equation (1) can be written as

\[
M_X(t) = 1 + \frac{n}{2}(2t) + \frac{\frac{n}{2}\left(\frac{n}{2}+1\right)}{2!}(2t)^2 + \cdots
+ \frac{\frac{n}{2}\left(\frac{n}{2}+1\right)\cdots\left(\frac{n}{2}+r-1\right)}{r!}(2t)^r + \cdots
\]

Therefore,

\[
\mu'_r = \text{coefficient of } \frac{t^r}{r!} \text{ in the expansion of } M_X(t)
= 2^r\, \frac{n}{2}\left(\frac{n}{2}+1\right)\cdots\left(\frac{n}{2}+r-1\right)
= n(n+2)(n+4)\cdots(n+2r-2).
\]
When r = 1,

μ′₁ = coefficient of t in the expansion of M_X(t) = n = mean.

When r = 2,

μ′₂ = coefficient of t²/2! in the expansion of M_X(t) = n(n + 2).

Therefore, Variance = μ′₂ − (μ′₁)² = n(n + 2) − n² = 2n.

Similarly, the remaining higher-order moments can be obtained.
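The same moments can be read off the MGF symbolically (a sketch using SymPy; n = 5 is an arbitrary choice):

```python
import sympy as sp

t = sp.symbols('t')
n = 5
M = (1 - 2 * t) ** sp.Rational(-n, 2)       # MGF of chi-square with n d.f.
series = sp.series(M, t, 0, 3).removeO()    # expand about t = 0

mu1 = sp.factorial(1) * series.coeff(t, 1)  # mu'_r = r! * coefficient of t^r; mean = n
mu2 = sp.factorial(2) * series.coeff(t, 2)  # mu'_2 = n(n + 2)
print(mu1, mu2, mu2 - mu1**2)               # 5, 35, 10 (= 2n)
```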
4.ii. Limiting form of the χ² distribution for large n (n → ∞):
Let X ~ χ² with n d.f.; then the MGF is

\[
M_X(t) = (1 - 2t)^{-n/2}, \qquad |2t| < 1.
\]

The MGF of the standardized χ² variate Z = (X − n)/√(2n) is

\[
M_Z(t) = e^{-nt/\sqrt{2n}}\, \left(1 - t\sqrt{2/n}\right)^{-n/2}, \qquad \text{valid for } |t|\sqrt{2/n} < 1.
\]

Taking logarithms on both sides,

\[
\log M_Z(t) = -t\sqrt{n/2} - \frac{n}{2}\log\left(1 - t\sqrt{2/n}\right).
\]

Since log(1 − x) = −x − x²/2 − x³/3 − ⋯, we have

\[
\log M_Z(t) = -t\sqrt{n/2} + \frac{n}{2}\left[ t\sqrt{2/n} + \frac{t^2}{2}\cdot\frac{2}{n} + \frac{t^3}{3}\left(\frac{2}{n}\right)^{3/2} + \cdots \right]
= -t\sqrt{n/2} + t\sqrt{n/2} + \frac{t^2}{2} + O(n^{-1/2})
= \frac{t^2}{2} + O(n^{-1/2}),
\]

where O(n^{−1/2}) denotes terms containing √n and higher powers of n in the denominator.
Therefore,

\[
\lim_{n\to\infty} \log M_Z(t) = \frac{t^2}{2}
\quad\Rightarrow\quad
\lim_{n\to\infty} M_Z(t) = e^{t^2/2},
\]

which is the MGF of a standard normal variate. Thus, by the uniqueness theorem of MGFs,
the χ² distribution tends to the normal distribution as n → ∞.
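A simulation illustrates this normal limit (a sketch; the degrees of freedom and the tail point 1.645 are arbitrary illustrative choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
for n in (5, 50, 5000):
    x = rng.chisquare(n, size=200_000)
    z = (x - n) / np.sqrt(2 * n)   # standardized chi-square variate
    print(n, (z > 1.645).mean())   # tail probability approaches the normal value
print("N(0,1):", 1 - stats.norm.cdf(1.645))
```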
4.iii. Additive or reproductive property of the χ² distribution:
The sum of independent χ² variates is also a chi-square variate. That is, if Xᵢ
(i = 1, 2, …, k) are k independent χ² variates with nᵢ d.f. respectively, then the sum
∑ᵢ₌₁ᵏ Xᵢ is also a chi-square variate with n = ∑ᵢ₌₁ᵏ nᵢ d.f.
Proof:
Given Xᵢ ~ χ² with nᵢ d.f., by the definition of the MGF,

\[
M_{X_i}(t) = (1 - 2t)^{-n_i/2}, \qquad |2t| < 1, \quad \text{for all } i = 1, 2, \ldots, k.
\]

Since the Xᵢ are independent, we have

\[
M_{\sum X_i}(t) = \prod_{i=1}^{k} M_{X_i}(t)
= (1 - 2t)^{-n_1/2}(1 - 2t)^{-n_2/2}\cdots(1 - 2t)^{-n_k/2}
= (1 - 2t)^{-(n_1 + n_2 + \cdots + n_k)/2}
= (1 - 2t)^{-\frac{1}{2}\sum_{i=1}^{k} n_i},
\]

which is the MGF of a chi-square variate with n = ∑ᵢ₌₁ᵏ nᵢ d.f. Hence, by the uniqueness
theorem of MGFs, ∑ᵢ₌₁ᵏ Xᵢ ~ χ² with n d.f.
Note: The converse of the above result is also true. That is, if Xᵢ (i = 1, 2, …, k) are χ²
variates with nᵢ d.f. respectively, and if the sum ∑ᵢ₌₁ᵏ Xᵢ is a chi-square variate with
n = ∑ᵢ₌₁ᵏ nᵢ d.f., then the Xᵢ are independent.
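The additive property is easy to confirm by simulation (a sketch; the degrees of freedom, sample size, and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n1, n2, n3 = 3, 4, 8
total = (rng.chisquare(n1, 100_000)
         + rng.chisquare(n2, 100_000)
         + rng.chisquare(n3, 100_000))
# Sum of independent chi-squares vs. chi-square with n1 + n2 + n3 d.f.
print(stats.kstest(total, stats.chi2(df=n1 + n2 + n3).cdf))
```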
4.iv. The theorem:
Theorem 1: Independence of the sample mean and the sample variance in random sampling
from a normal population.
Let x₁, x₂, …, xₙ be a random sample from a normal population with mean μ and variance σ².
Then
i) x̄ ~ N(μ, σ²/n), and
ii) ns²/σ² = ∑(xᵢ − x̄)²/σ² = (n − 1)S²/σ² is a chi-square variate with (n − 1) d.f.,
where s² = ∑(xᵢ − x̄)²/n and S² = ∑(xᵢ − x̄)²/(n − 1),
and these two are independently distributed.
Proof: The joint probability differential of x₁, x₂, …, xₙ is given by

\[
dP(x_1, x_2, \ldots, x_n) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right\} dx_1\, dx_2 \cdots dx_n,
\qquad -\infty < x_i < \infty,\ \forall\, i = 1, 2, \ldots, n.
\]
Consider the transformation to the variables yᵢ (i = 1, 2, …, n) by means of a linear
orthogonal transformation Y = AX, where

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.
\]

In particular, take a₁₁ = a₁₂ = ⋯ = a₁ₙ = 1/√n, which gives

\[
y_1 = \frac{1}{\sqrt{n}}(x_1 + x_2 + \cdots + x_n) = \sqrt{n}\,\bar{x}. \qquad (1)
\]

Then dy₁ = √n dx̄. It is easily shown that this choice of a₁₁, a₁₂, …, a₁ₙ satisfies the
orthogonality condition ∑ⱼ₌₁ⁿ a₁ⱼ² = 1. Since the transformation is orthogonal, we have
\[
\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n\bar{x}^2
= \sum_{i=1}^{n} (x_i - \bar{x})^2 + y_1^2, \qquad \text{from (1)}
\]

\[
\Rightarrow\ \sum_{i=2}^{n} y_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad (2)
\]

and

\[
\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} (x_i - \bar{x} + \bar{x} - \mu)^2
= \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2
= \sum_{i=2}^{n} y_i^2 + n(\bar{x} - \mu)^2, \qquad \text{from (2).}
\]
Now |A′A| = |Iₙ| = 1, and therefore the Jacobian of the transformation satisfies |J| = 1.
Thus the joint probability differential of (y₁, y₂, …, yₙ) is given by

\[
dG(y_1, y_2, \ldots, y_n) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^n \exp\left\{-\frac{1}{2\sigma^2}\left[\sum_{i=2}^{n} y_i^2 + n(\bar{x} - \mu)^2\right]\right\} |J|\, dy_1\, dy_2 \cdots dy_n
\]
\[
= \left[\frac{\sqrt{n}}{\sigma\sqrt{2\pi}}\, e^{-\frac{n(\bar{x} - \mu)^2}{2\sigma^2}}\, d\bar{x}\right]
\times \left[\left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n-1} e^{-\frac{1}{2\sigma^2}\sum_{i=2}^{n} y_i^2}\, dy_2\, dy_3 \cdots dy_n\right].
\]
Therefore, we have on simplification

\[
g(y_1, y_2, \ldots, y_n) = g(y_1)\, g(y_2, y_3, \ldots, y_n)
\ \Rightarrow\ y_1 \text{ and } (y_2, y_3, \ldots, y_n) \text{ are independent},
\]

where g(y₁) is the p.d.f. of x̄ ~ N(μ, σ²/n), and

\[
\sum_{i=2}^{n} y_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = ns^2 = (n-1)S^2
\]

is distributed independently of x̄. Moreover, since y₂, …, yₙ are independent N(0, σ²)
variates,

\[
\sum_{i=2}^{n} \frac{y_i^2}{\sigma^2} = \frac{ns^2}{\sigma^2} \sim \chi^2_{(n-1)}.
\]
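A simulation sketch of Theorem 1 (assuming NumPy and SciPy; μ, σ, the sample size n = 8, and the seed are arbitrary choices): the sample mean and ns² computed from the same normal samples are uncorrelated, and ns²/σ² matches χ² with (n − 1) d.f.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu, sigma, n = 10.0, 2.0, 8
x = rng.normal(mu, sigma, size=(100_000, n))

xbar = x.mean(axis=1)
ns2 = ((x - xbar[:, None]) ** 2).sum(axis=1)  # n * s^2 = sum (x_i - xbar)^2

print(np.corrcoef(xbar, ns2)[0, 1])           # close to 0, consistent with independence
print(stats.kstest(ns2 / sigma**2, stats.chi2(df=n - 1).cdf))
print(xbar.mean(), xbar.var(), sigma**2 / n)  # xbar ~ N(mu, sigma^2 / n)
```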
4.v. Applications of chi-square distribution:
The chi-square distribution has a large number of applications; some of them are listed
below (a worked sketch of application 2 follows the list):
1. To test the significance of a hypothesized population variance, i.e., H₀: σ² = σ₀².
2. To test the goodness of fit.
3. To test the independence of attributes.
4. To test the homogeneity of independent estimates of the population variance.
5. To test the homogeneity of independent estimates of the population correlation
coefficient.
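For instance, a goodness-of-fit test can be run with scipy.stats.chisquare (a sketch; the observed die-roll counts below are made-up illustrative data):

```python
from scipy import stats

# Observed counts for 120 rolls of a die vs. the fair-die expectation of 20 per face
observed = [16, 25, 19, 22, 17, 21]
result = stats.chisquare(observed, f_exp=[20] * 6)  # chi-square test with 5 d.f.
print(result.statistic, result.pvalue)              # large p-value: fit is acceptable
```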
5. Summary:
We have discussed the definition, probability curve, derivation of the p.d.f., and the MGF of
the chi-square distribution, and proved some of its properties. We have also listed some of
the applications of the chi-square distribution.