Chapter 6
Regression Method of Estimation
The ratio method of estimation uses auxiliary information that is correlated with the study
variable to improve precision. It yields improved estimators when the regression of Y on X is linear
and passes through the origin. When the regression of Y on X is linear, the line need not pass
through the origin. In such cases it is more appropriate to use a regression-type estimator to
estimate the population mean.
In the ratio method, the conventional estimator, the sample mean $\bar{y}$, was improved by multiplying it by the
factor $\bar{X}/\bar{x}$, where $\bar{x}$ is an unbiased estimator of the population mean $\bar{X}$ of the
auxiliary variable. Now we consider another idea based on difference.
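Consider an estimator $(\bar{x} - \bar{X})$ for which $E(\bar{x} - \bar{X}) = 0$.
Consider an improved estimator of $\bar{Y}$ as
$$\hat{\bar{Y}}^{*} = \bar{y} + \mu(\bar{x} - \bar{X})$$
which is an unbiased estimator of $\bar{Y}$, where $\mu$ is any constant. Now find $\mu$ such that $Var(\hat{\bar{Y}}^{*})$ is
minimum:
$$Var(\hat{\bar{Y}}^{*}) = Var(\bar{y}) + \mu^{2}\,Var(\bar{x}) + 2\mu\,Cov(\bar{x}, \bar{y})$$
$$\frac{\partial\, Var(\hat{\bar{Y}}^{*})}{\partial \mu} = 0
\;\Rightarrow\;
\mu = -\frac{Cov(\bar{x}, \bar{y})}{Var(\bar{x})}
= -\frac{\dfrac{N-n}{Nn}\,S_{XY}}{\dfrac{N-n}{Nn}\,S_X^{2}}
= -\frac{S_{XY}}{S_X^{2}}$$
where
$$S_{XY} = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y}), \qquad S_X^{2} = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^{2}.$$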
Consider a linear regression model $y = x\beta + e$ where $y$ is the dependent variable, $x$ is the independent
variable and $e$ is the random error component which takes care of the difference arising due to lack of
exact relationship between $x$ and $y$.
Note that the value of the regression coefficient $\beta$ in a linear regression model $y = x\beta + e$ of $y$ on $x$,
obtained by minimizing $\sum_{i=1}^{n} e_i^{2}$ based on $n$ data sets $(x_i, y_i),\ i = 1, 2, \ldots, n$, is
$$\beta = \frac{Cov(x, y)}{Var(x)} = \frac{S_{xy}}{S_x^{2}}.$$
Thus the optimum value of $\mu$ is the same as the regression coefficient of $y$ on $x$ with a negative sign, i.e.,
$\mu = -\beta$.
So the estimator $\hat{\bar{Y}}^{*}$ with the optimum value of $\mu$ is
$$\hat{\bar{Y}}_{reg} = \bar{y} + \beta(\bar{X} - \bar{x})$$
which is the regression estimator of $\bar{Y}$, and the procedure of estimation is called the regression
method of estimation.
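The variance of $\hat{\bar{Y}}_{reg}$ is
$$Var(\hat{\bar{Y}}_{reg}) = Var(\bar{y})\left[1 - \rho^{2}(\bar{x}, \bar{y})\right]$$
where $\rho(\bar{x}, \bar{y})$ is the correlation coefficient between $\bar{x}$ and $\bar{y}$. So $\hat{\bar{Y}}_{reg}$ is efficient if $x$ and $y$
are highly correlated. The estimator $\hat{\bar{Y}}_{reg}$ is more efficient than the sample mean $\bar{y}$ if $\rho(\bar{x}, \bar{y}) \neq 0$, which generally
holds.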
Regression estimates with preassigned $\beta$:
If the value of $\beta$ is known as $\beta_0$ (say), then the regression estimator is
$$\hat{\bar{Y}}_{reg} = \bar{y} + \beta_0(\bar{X} - \bar{x}).$$
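As an illustration of the estimator with a preassigned coefficient, the following is a minimal sketch in Python. The function name `regression_estimate`, its arguments and the sample values are assumptions made for this example only; they are not part of the notes.

```python
from statistics import mean

def regression_estimate(xs, ys, X_bar, beta0):
    """Regression estimate of the population mean of y with a preassigned slope.

    xs, ys : sample values of the auxiliary and study variables
    X_bar  : known population mean of the auxiliary variable
    beta0  : preassigned regression coefficient
    """
    return mean(ys) + beta0 * (X_bar - mean(xs))

# Hypothetical sample: the sample mean of x (4.0) lies below X_bar (5.0),
# so with a positive slope the sample mean of y (9.0) is adjusted upward.
xs = [3.0, 4.0, 5.0]
ys = [7.0, 9.0, 11.0]
estimate = regression_estimate(xs, ys, X_bar=5.0, beta0=2.0)
print(estimate)  # 9.0 + 2.0 * (5.0 - 4.0) = 11.0
```

When the sample mean of x happens to equal the population mean, the adjustment term vanishes and the estimate reduces to the sample mean of y.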
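Bias of $\hat{\bar{Y}}_{reg}$:
Now, assuming that the random sample $(x_i, y_i),\ i = 1, 2, \ldots, n$ is drawn by SRSWOR,
$$\begin{aligned}
E(\hat{\bar{Y}}_{reg}) &= E(\bar{y}) + \beta_0\left[\bar{X} - E(\bar{x})\right] \\
&= \bar{Y} + \beta_0\left[\bar{X} - \bar{X}\right] \\
&= \bar{Y}.
\end{aligned}$$
Thus $\hat{\bar{Y}}_{reg}$ is an unbiased estimator of $\bar{Y}$ when $\beta$ is known.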
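Variance of $\hat{\bar{Y}}_{reg}$
$$\begin{aligned}
Var(\hat{\bar{Y}}_{reg}) &= E\left[\hat{\bar{Y}}_{reg} - E(\hat{\bar{Y}}_{reg})\right]^{2} \\
&= E\left[\bar{y} + \beta_0(\bar{X} - \bar{x}) - \bar{Y}\right]^{2} \\
&= E\left[(\bar{y} - \bar{Y}) - \beta_0(\bar{x} - \bar{X})\right]^{2} \\
&= E\left[(\bar{y} - \bar{Y})^{2} + \beta_0^{2}(\bar{x} - \bar{X})^{2} - 2\beta_0(\bar{x} - \bar{X})(\bar{y} - \bar{Y})\right] \\
&= Var(\bar{y}) + \beta_0^{2}\,Var(\bar{x}) - 2\beta_0\,Cov(\bar{x}, \bar{y}) \\
&= \frac{f}{n}\left[S_Y^{2} + \beta_0^{2} S_X^{2} - 2\beta_0 S_{XY}\right] \\
&= \frac{f}{n}\left[S_Y^{2} + \beta_0^{2} S_X^{2} - 2\beta_0 \rho S_X S_Y\right]
\end{aligned}$$
where
$$f = \frac{N-n}{N}, \qquad
S_X^{2} = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^{2}, \qquad
S_Y^{2} = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^{2},$$
and $\rho$ is the correlation coefficient between $X$ and $Y$.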
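Comparing $Var(\hat{\bar{Y}}_{reg})$ with $Var(\bar{y})$, we note that
$$Var(\hat{\bar{Y}}_{reg}) < Var(\bar{y})$$
if $\beta_0^{2} S_X^{2} - 2\beta_0 S_{XY} < 0$,
or $\beta_0 S_X^{2}\left(\beta_0 - \dfrac{2 S_{XY}}{S_X^{2}}\right) < 0$,
which is possible when
either $\beta_0 < 0$ and $\left(\beta_0 - \dfrac{2 S_{XY}}{S_X^{2}}\right) > 0 \;\Rightarrow\; \dfrac{2 S_{XY}}{S_X^{2}} < \beta_0 < 0$,
or $\beta_0 > 0$ and $\left(\beta_0 - \dfrac{2 S_{XY}}{S_X^{2}}\right) < 0 \;\Rightarrow\; 0 < \beta_0 < \dfrac{2 S_{XY}}{S_X^{2}}$.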
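The unbiasedness, the variance formula and the comparison with the sample mean can be checked exactly on a small population by enumerating every possible SRSWOR sample. The sketch below assumes a hypothetical population of six units and a preassigned coefficient of 3/2; it uses exact rational arithmetic so that the equalities can be tested without rounding error.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of N = 6 units (values chosen only for illustration).
X = [Fraction(v) for v in (2, 4, 5, 7, 9, 12)]
Y = [Fraction(v) for v in (5, 9, 10, 15, 17, 25)]
N, n = len(X), 3
beta0 = Fraction(3, 2)  # preassigned coefficient (an assumption)

def mean(values):
    return sum(values) / len(values)

X_bar, Y_bar = mean(X), mean(Y)
S_X2 = sum((x - X_bar) ** 2 for x in X) / (N - 1)
S_Y2 = sum((y - Y_bar) ** 2 for y in Y) / (N - 1)
S_XY = sum((x - X_bar) * (y - Y_bar) for x, y in zip(X, Y)) / (N - 1)

# All C(N, n) SRSWOR samples are equally likely, so expectations are plain averages.
estimates = []
for idx in combinations(range(N), n):
    x_bar = mean([X[i] for i in idx])
    y_bar = mean([Y[i] for i in idx])
    estimates.append(y_bar + beta0 * (X_bar - x_bar))

expectation = mean(estimates)
variance = mean([(e - expectation) ** 2 for e in estimates])

f = Fraction(N - n, N)
formula = f / n * (S_Y2 + beta0 ** 2 * S_X2 - 2 * beta0 * S_XY)
var_sample_mean = f / n * S_Y2

print(expectation == Y_bar)        # unbiasedness
print(variance == formula)         # variance formula
print(variance < var_sample_mean)  # here 0 < beta0 < 2 * S_XY / S_X2
```

For this population $2S_{XY}/S_X^2$ is roughly 3.9, so the chosen coefficient lies inside the interval in which the regression estimator improves on the sample mean.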
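Optimal value of $\beta$
Choose $\beta$ such that $Var(\hat{\bar{Y}}_{reg})$ is minimum.
So
$$\frac{\partial\, Var(\hat{\bar{Y}}_{reg})}{\partial \beta}
= \frac{f}{n}\,\frac{\partial}{\partial \beta}\left[S_Y^{2} + \beta^{2} S_X^{2} - 2\beta\rho S_X S_Y\right] = 0
\;\Rightarrow\;
\beta = \rho\frac{S_Y}{S_X} = \frac{S_{XY}}{S_X^{2}}.$$
The minimum value of the variance of $\hat{\bar{Y}}_{reg}$ with the optimum value $\beta_{opt} = \rho\dfrac{S_Y}{S_X}$ is
$$\begin{aligned}
Var_{min}(\hat{\bar{Y}}_{reg}) &= \frac{f}{n}\left[S_Y^{2} + \rho^{2}\frac{S_Y^{2}}{S_X^{2}} S_X^{2} - 2\rho\frac{S_Y}{S_X}\,\rho S_X S_Y\right] \\
&= \frac{f}{n} S_Y^{2}(1 - \rho^{2}).
\end{aligned}$$
Since $-1 \le \rho \le 1$,
$$Var_{min}(\hat{\bar{Y}}_{reg}) \le Var_{SRS}(\bar{y})$$
always holds. So, with the optimum $\beta$, the regression estimator is never less efficient than the sample mean under
SRSWOR, and it is strictly more efficient whenever $\rho \neq 0$.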
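The optimum coefficient and the minimum variance can be verified numerically. The following is a minimal sketch under the assumption of a small hypothetical population; the helper name `variance` is introduced only for this example.

```python
from fractions import Fraction

# Hypothetical population and sample size, for illustration only.
X = [Fraction(v) for v in (2, 4, 5, 7, 9, 12)]
Y = [Fraction(v) for v in (5, 9, 10, 15, 17, 25)]
N, n = len(X), 3
f = Fraction(N - n, N)

X_bar, Y_bar = sum(X) / N, sum(Y) / N
S_X2 = sum((x - X_bar) ** 2 for x in X) / (N - 1)
S_Y2 = sum((y - Y_bar) ** 2 for y in Y) / (N - 1)
S_XY = sum((x - X_bar) * (y - Y_bar) for x, y in zip(X, Y)) / (N - 1)

def variance(beta):
    """Variance of the regression estimator for a preassigned coefficient beta."""
    return f / n * (S_Y2 + beta ** 2 * S_X2 - 2 * beta * S_XY)

beta_opt = S_XY / S_X2
rho_sq = S_XY ** 2 / (S_X2 * S_Y2)  # squared population correlation
var_min = f / n * S_Y2 * (1 - rho_sq)

print(variance(beta_opt) == var_min)
# Moving away from beta_opt in either direction increases the variance.
print(variance(beta_opt + Fraction(1, 10)) > var_min)
print(variance(beta_opt - Fraction(1, 10)) > var_min)
```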
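Departure from $\beta$:
If $\beta_0$ is the preassigned value of the regression coefficient, then
$$\begin{aligned}
Var(\hat{\bar{Y}}_{reg}) &= \frac{f}{n}\left[S_Y^{2} + \beta_0^{2} S_X^{2} - 2\beta_0 \rho S_X S_Y\right] \\
&= \frac{f}{n}\left[S_Y^{2} + \beta_0^{2} S_X^{2} - 2\beta_0 \rho S_X S_Y - \rho^{2} S_Y^{2} + \rho^{2} S_Y^{2}\right] \\
&= \frac{f}{n}\left[(1 - \rho^{2}) S_Y^{2} + \beta_0^{2} S_X^{2} - 2\beta_0 \beta_{opt} S_X^{2} + \beta_{opt}^{2} S_X^{2}\right] \\
&= \frac{f}{n}\left[(1 - \rho^{2}) S_Y^{2} + (\beta_0 - \beta_{opt})^{2} S_X^{2}\right]
\end{aligned}$$
where $\beta_{opt} = \rho\dfrac{S_Y}{S_X}$.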
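The decomposition above can be checked for several preassigned values. This is a minimal sketch; the function names and the population moments are hypothetical and serve only to compare the two expressions.

```python
from fractions import Fraction

def variance_preassigned(beta0, S_X2, S_Y2, S_XY, N, n):
    """Variance written directly in terms of beta0."""
    f = Fraction(N - n, N)
    return f / n * (S_Y2 + beta0 ** 2 * S_X2 - 2 * beta0 * S_XY)

def variance_decomposed(beta0, S_X2, S_Y2, S_XY, N, n):
    """Minimum variance plus the penalty for departing from beta_opt."""
    f = Fraction(N - n, N)
    beta_opt = S_XY / S_X2
    rho_sq = S_XY ** 2 / (S_X2 * S_Y2)
    return f / n * ((1 - rho_sq) * S_Y2 + (beta0 - beta_opt) ** 2 * S_X2)

# Hypothetical population moments (rho^2 = 625 / 650).
moments = dict(S_X2=Fraction(13), S_Y2=Fraction(50), S_XY=Fraction(25), N=6, n=3)
for beta0 in (Fraction(0), Fraction(1), Fraction(2), Fraction(5, 2)):
    a = variance_preassigned(beta0, **moments)
    b = variance_decomposed(beta0, **moments)
    print(beta0, a, a == b)
```

The penalty term is quadratic in the distance between the preassigned and the optimum coefficient, so a value of $\beta_0$ close to $\beta_{opt}$ loses little efficiency.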
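Estimate of variance
An unbiased sample estimate of $Var(\hat{\bar{Y}}_{reg})$ is
$$\begin{aligned}
\widehat{Var}(\hat{\bar{Y}}_{reg}) &= \frac{f}{n(n-1)}\sum_{i=1}^{n}\left[(y_i - \bar{y}) - \beta_0(x_i - \bar{x})\right]^{2} \\
&= \frac{f}{n}\left(s_y^{2} + \beta_0^{2} s_x^{2} - 2\beta_0 s_{xy}\right).
\end{aligned}$$
Note that the variance of $\hat{\bar{Y}}_{reg}$ increases as the difference between $\beta_0$ and $\beta_{opt}$ increases.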
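The two forms of the variance estimate can be computed side by side. The sketch below is an illustration only: the function names, the sample values and the population size N = 20 are assumptions.

```python
from statistics import mean

def estimated_variance(xs, ys, beta0, N):
    """Variance estimate from the residuals (y_i - y_bar) - beta0 (x_i - x_bar)."""
    n = len(xs)
    f = (N - n) / N
    x_bar, y_bar = mean(xs), mean(ys)
    residuals = [(y - y_bar) - beta0 * (x - x_bar) for x, y in zip(xs, ys)]
    return f / (n * (n - 1)) * sum(r ** 2 for r in residuals)

def estimated_variance_moments(xs, ys, beta0, N):
    """Same estimate written with s_y^2, s_x^2 and s_xy."""
    n = len(xs)
    f = (N - n) / N
    x_bar, y_bar = mean(xs), mean(ys)
    s_x2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_y2 = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    return f / n * (s_y2 + beta0 ** 2 * s_x2 - 2 * beta0 * s_xy)

# Hypothetical sample of n = 4 units from a population of N = 20.
xs = [2.0, 4.0, 7.0, 9.0]
ys = [5.0, 9.0, 15.0, 17.0]
v1 = estimated_variance(xs, ys, beta0=1.5, N=20)
v2 = estimated_variance_moments(xs, ys, beta0=1.5, N=20)
print(v1, v2)  # the two forms agree up to floating-point rounding
```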
Regression estimates when $\beta$ is computed from sample
Suppose a random sample of size $n$ of paired observations $(x_i, y_i),\ i = 1, 2, \ldots, n$ is drawn by
SRSWOR. When $\beta$ is unknown, it is estimated as
$$\hat{\beta} = \frac{s_{xy}}{s_x^{2}}$$
and then the regression estimator of $\bar{Y}$ is given by
$$\hat{\bar{Y}}_{reg} = \bar{y} + \hat{\beta}(\bar{X} - \bar{x}).$$
It is difficult to find the exact expressions of $E(\hat{\bar{Y}}_{reg})$ and $Var(\hat{\bar{Y}}_{reg})$. So we approximate them using
the same methodology as in the case of the ratio method of estimation.
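Computing the estimator with an estimated coefficient takes only a few lines. This is a minimal sketch; the function name and the data are hypothetical. The data are placed exactly on a straight line so that the result is easy to check by hand.

```python
from statistics import mean

def regression_estimate_estimated_slope(xs, ys, X_bar):
    """Return (estimate, beta_hat) with beta_hat = s_xy / s_x^2 from the sample."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = s_xy / s_xx  # the (n - 1) divisors cancel in the ratio
    return y_bar + beta_hat * (X_bar - x_bar), beta_hat

# Hypothetical data lying exactly on y = 1 + 2x, with assumed X_bar = 3.5.
xs = [1.0, 2.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x for x in xs]
estimate, beta_hat = regression_estimate_estimated_slope(xs, ys, X_bar=3.5)
print(beta_hat, estimate)  # slope 2.0 and estimate 1 + 2 * 3.5 = 8.0
```

When the relation is exactly linear, the estimate equals the value of the line at the population mean of x, whichever units fall in the sample.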
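Let
$$\begin{aligned}
\epsilon_0 &= \frac{\bar{y} - \bar{Y}}{\bar{Y}} &&\Rightarrow\; \bar{y} = \bar{Y}(1 + \epsilon_0) \\
\epsilon_1 &= \frac{\bar{x} - \bar{X}}{\bar{X}} &&\Rightarrow\; \bar{x} = \bar{X}(1 + \epsilon_1) \\
\epsilon_2 &= \frac{s_{xy} - S_{XY}}{S_{XY}} &&\Rightarrow\; s_{xy} = S_{XY}(1 + \epsilon_2) \\
\epsilon_3 &= \frac{s_x^{2} - S_X^{2}}{S_X^{2}} &&\Rightarrow\; s_x^{2} = S_X^{2}(1 + \epsilon_3)
\end{aligned}$$
Then
$$E(\epsilon_0) = 0, \quad E(\epsilon_1) = 0, \quad E(\epsilon_2) = 0, \quad E(\epsilon_3) = 0,$$
$$E(\epsilon_0^{2}) = \frac{f}{n} C_Y^{2}, \qquad
E(\epsilon_1^{2}) = \frac{f}{n} C_X^{2}, \qquad
E(\epsilon_0 \epsilon_1) = \frac{f}{n}\rho\, C_X C_Y,$$
where $C_X = S_X/\bar{X}$ and $C_Y = S_Y/\bar{Y}$ are the coefficients of variation of $X$ and $Y$,
and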
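$$\begin{aligned}
\hat{\bar{Y}}_{reg} &= \bar{y} + \frac{s_{xy}}{s_x^{2}}(\bar{X} - \bar{x}) \\
&= \bar{Y}(1 + \epsilon_0) + \frac{S_{XY}(1 + \epsilon_2)}{S_X^{2}(1 + \epsilon_3)}(-\epsilon_1 \bar{X}).
\end{aligned}$$
The estimation error of $\hat{\bar{Y}}_{reg}$ is
$$(\hat{\bar{Y}}_{reg} - \bar{Y}) = \bar{Y}\epsilon_0 - \beta \bar{X}\epsilon_1(1 + \epsilon_2)(1 + \epsilon_3)^{-1}$$
where $\beta = \dfrac{S_{XY}}{S_X^{2}}$ is the population regression coefficient.
Assuming $|\epsilon_3| < 1$,
$$(\hat{\bar{Y}}_{reg} - \bar{Y}) = \bar{Y}\epsilon_0 - \beta \bar{X}(\epsilon_1 + \epsilon_1\epsilon_2)(1 - \epsilon_3 + \epsilon_3^{2} - \ldots)$$
Retaining the terms up to the second power of the $\epsilon$'s and ignoring other terms, we have
$$\begin{aligned}
(\hat{\bar{Y}}_{reg} - \bar{Y}) &\approx \bar{Y}\epsilon_0 - \beta \bar{X}(\epsilon_1 + \epsilon_1\epsilon_2)(1 - \epsilon_3 + \epsilon_3^{2}) \\
&\approx \bar{Y}\epsilon_0 - \beta \bar{X}(\epsilon_1 - \epsilon_1\epsilon_3 + \epsilon_1\epsilon_2).
\end{aligned}$$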
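Bias of $\hat{\bar{Y}}_{reg}$
Now the bias of $\hat{\bar{Y}}_{reg}$ up to the second order of approximation is
$$\begin{aligned}
E(\hat{\bar{Y}}_{reg} - \bar{Y}) &= E\left[\bar{Y}\epsilon_0 - \beta \bar{X}(\epsilon_1 + \epsilon_1\epsilon_2)(1 - \epsilon_3 + \epsilon_3^{2})\right] \\
&= -\frac{\beta \bar{X} f}{n}\left[\frac{\mu_{21}}{\bar{X} S_{XY}} - \frac{\mu_{30}}{\bar{X} S_X^{2}}\right]
\end{aligned}$$
where $f = \dfrac{N-n}{N}$ and the $(r, s)$th cross product moment is given by
$$\mu_{rs} = E\left[(x - \bar{X})^{r}(y - \bar{Y})^{s}\right]$$
so that
$$\mu_{21} = E\left[(x - \bar{X})^{2}(y - \bar{Y})\right], \qquad \mu_{30} = E\left[(x - \bar{X})^{3}\right].$$
Thus
$$E(\hat{\bar{Y}}_{reg}) - \bar{Y} = -\frac{\beta f}{n}\left[\frac{\mu_{21}}{S_{XY}} - \frac{\mu_{30}}{S_X^{2}}\right].$$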
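Also,
$$\begin{aligned}
E(\hat{\bar{Y}}_{reg}) &= E(\bar{y}) + E[\hat{\beta}(\bar{X} - \bar{x})] \\
&= \bar{Y} + \bar{X}E(\hat{\beta}) - E(\hat{\beta}\bar{x}) \\
&= \bar{Y} + E(\bar{x})E(\hat{\beta}) - E(\hat{\beta}\bar{x}) \\
&= \bar{Y} - Cov(\hat{\beta}, \bar{x})
\end{aligned}$$
$$Bias(\hat{\bar{Y}}_{reg}) = E(\hat{\bar{Y}}_{reg}) - \bar{Y} = -Cov(\hat{\beta}, \bar{x}).$$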
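The identity between the bias and the covariance of the estimated coefficient with the sample mean of x is exact, so it can be verified by complete enumeration. The sketch below assumes a hypothetical population with a curved relation between x and y, so that the bias need not vanish; exact rational arithmetic is used throughout.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population with y = x^2 (values chosen only for illustration).
X = [Fraction(v) for v in (1, 2, 3, 4, 6, 8)]
Y = [x * x for x in X]
N, n = len(X), 3
X_bar, Y_bar = sum(X) / N, sum(Y) / N

records = []  # (x_bar, beta_hat, estimate) for every possible SRSWOR sample
for idx in combinations(range(N), n):
    xs, ys = [X[i] for i in idx], [Y[i] for i in idx]
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = s_xy / s_xx
    records.append((x_bar, beta_hat, y_bar + beta_hat * (X_bar - x_bar)))

def expect(values):
    """Average over all equally likely samples."""
    values = list(values)
    return sum(values) / len(values)

E_x = expect(r[0] for r in records)
E_b = expect(r[1] for r in records)
cov_b_x = expect(r[0] * r[1] for r in records) - E_x * E_b
bias = expect(r[2] for r in records) - Y_bar

print(bias, -cov_b_x, bias == -cov_b_x)
```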
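MSE of $\hat{\bar{Y}}_{reg}$
To obtain the MSE of $\hat{\bar{Y}}_{reg}$, consider
$$E(\hat{\bar{Y}}_{reg} - \bar{Y})^{2} = E\left[\epsilon_0 \bar{Y} - \beta \bar{X}(\epsilon_1 - \epsilon_1\epsilon_3 + \epsilon_1\epsilon_2)\right]^{2}$$
Retaining the terms of the $\epsilon$'s up to the second power and ignoring others, we have
$$\begin{aligned}
E(\hat{\bar{Y}}_{reg} - \bar{Y})^{2} &\approx E\left[\epsilon_0^{2}\bar{Y}^{2} + \beta^{2}\bar{X}^{2}\epsilon_1^{2} - 2\beta \bar{X}\bar{Y}\epsilon_0\epsilon_1\right] \\
&= \bar{Y}^{2}E(\epsilon_0^{2}) + \beta^{2}\bar{X}^{2}E(\epsilon_1^{2}) - 2\beta \bar{X}\bar{Y}E(\epsilon_0\epsilon_1) \\
&= \frac{f}{n}\left[\bar{Y}^{2}\frac{S_Y^{2}}{\bar{Y}^{2}} + \beta^{2}\bar{X}^{2}\frac{S_X^{2}}{\bar{X}^{2}} - 2\beta \bar{X}\bar{Y}\frac{\rho S_X S_Y}{\bar{X}\bar{Y}}\right]
\end{aligned}$$
$$MSE(\hat{\bar{Y}}_{reg}) = E(\hat{\bar{Y}}_{reg} - \bar{Y})^{2} = \frac{f}{n}\left(S_Y^{2} + \beta^{2} S_X^{2} - 2\beta\rho S_X S_Y\right).$$
Since $\beta = \dfrac{S_{XY}}{S_X^{2}} = \rho\dfrac{S_Y}{S_X}$,
substituting it in $MSE(\hat{\bar{Y}}_{reg})$, we get
$$MSE(\hat{\bar{Y}}_{reg}) = \frac{f}{n} S_Y^{2}(1 - \rho^{2}).$$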
So, up to the second order of approximation, the regression estimator is better than the conventional sample
mean estimator under SRSWOR. This is because the regression estimator uses extra information.
Such extra information, however, comes at an extra cost, so the superiority is false in
some sense. So the regression estimators and SRS estimates can be combined if the cost aspect is also
taken into consideration.
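Comparison of $\hat{\bar{Y}}_{reg}$ with ratio estimate and SRS sample mean estimate
$$MSE(\hat{\bar{Y}}_{reg}) = \frac{f}{n} S_Y^{2}(1 - \rho^{2})$$
$$MSE(\hat{\bar{Y}}_{R}) = \frac{f}{n}\left(S_Y^{2} + R^{2} S_X^{2} - 2\rho R S_X S_Y\right)$$
$$Var_{SRS}(\bar{y}) = \frac{f}{n} S_Y^{2}.$$
(i) As $MSE(\hat{\bar{Y}}_{reg}) = Var_{SRS}(\bar{y})(1 - \rho^{2})$ and because $\rho^{2} \le 1$, $\hat{\bar{Y}}_{reg}$ is never less efficient than $\bar{y}$, and it is superior whenever $\rho \neq 0$.
(ii) $\hat{\bar{Y}}_{reg}$ is better than $\hat{\bar{Y}}_{R}$ if $MSE(\hat{\bar{Y}}_{reg}) \le MSE(\hat{\bar{Y}}_{R})$
or if $\dfrac{f}{n} S_Y^{2}(1 - \rho^{2}) \le \dfrac{f}{n}\left(S_Y^{2} + R^{2} S_X^{2} - 2\rho R S_X S_Y\right)$
or if $(R S_X - \rho S_Y)^{2} \ge 0$
which always holds true.
So the regression estimate is never less efficient than the ratio estimate up to the second order of
approximation. The two have the same MSE only when $R = \rho S_Y / S_X$.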
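The three expressions can be compared numerically for any set of population moments. The following is a minimal sketch; the function name `mse_comparison` and the moment values are hypothetical, and the identity $\rho S_X S_Y = S_{XY}$ is used to avoid computing $\rho$ separately in the ratio formula.

```python
def mse_comparison(S_X2, S_Y2, S_XY, X_bar, Y_bar, N, n):
    """Approximate MSEs of the regression estimator, the ratio estimator and the sample mean."""
    f = (N - n) / N
    R = Y_bar / X_bar
    rho_sq = S_XY ** 2 / (S_X2 * S_Y2)
    mse_reg = f / n * S_Y2 * (1 - rho_sq)
    mse_ratio = f / n * (S_Y2 + R ** 2 * S_X2 - 2 * R * S_XY)  # rho S_X S_Y = S_XY
    var_srs = f / n * S_Y2
    return mse_reg, mse_ratio, var_srs

# Hypothetical population moments.
mse_reg, mse_ratio, var_srs = mse_comparison(
    S_X2=13.1, S_Y2=51.1, S_XY=25.5, X_bar=6.5, Y_bar=13.5, N=60, n=10
)
print(mse_reg, mse_ratio, var_srs)
```

For these assumed moments the ordering is regression, then ratio, then sample mean, in agreement with (i) and (ii).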
Regression estimates in stratified sampling
Under the set up of stratified sampling, let the population of $N$ sampling units be divided into $k$
strata. The strata sizes are $N_1, N_2, \ldots, N_k$ such that $\sum_{i=1}^{k} N_i = N$. A sample of size $n_i$ on
$(x_{ij}, y_{ij}),\ j = 1, 2, \ldots, n_i$, is drawn from the $i$th stratum $(i = 1, 2, \ldots, k)$ by SRSWOR, where $x_{ij}$ and $y_{ij}$ denote
the $j$th unit from the $i$th stratum on the auxiliary and study variables, respectively.
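In order to estimate the population mean, there are two approaches.
1. Separate regression estimator
Compute the regression estimator
$$\hat{\bar{Y}}_{reg} = \bar{y} + \beta_0(\bar{X} - \bar{x})$$
from each stratum separately, i.e., the regression estimate in the $i$th stratum is
$$\hat{\bar{Y}}_{reg(i)} = \bar{y}_i + \beta_i(\bar{X}_i - \bar{x}_i).$$
Find the stratified mean as the weighted mean of $\hat{\bar{Y}}_{reg(i)},\ i = 1, 2, \ldots, k$, as
$$\begin{aligned}
\hat{\bar{Y}}_{sreg} &= \sum_{i=1}^{k}\frac{N_i \hat{\bar{Y}}_{reg(i)}}{N} \\
&= \sum_{i=1}^{k}\left[w_i\{\bar{y}_i + \beta_i(\bar{X}_i - \bar{x}_i)\}\right]
\end{aligned}$$
where $\beta_i = \dfrac{S_{iXY}}{S_{iX}^{2}}$ and $w_i = \dfrac{N_i}{N}$.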
In this approach, the regression estimator is obtained separately in each stratum and the results are then
combined using the philosophy of stratified sampling. So $\hat{\bar{Y}}_{sreg}$ is termed the separate regression
estimator.
2. Combined regression estimator
Another strategy is to replace $\bar{x}$ and $\bar{y}$ in $\hat{\bar{Y}}_{reg}$ by the respective stratified means. Replacing $\bar{x}$
by $\bar{x}_{st} = \sum_{i=1}^{k} w_i \bar{x}_i$ and $\bar{y}$ by $\bar{y}_{st} = \sum_{i=1}^{k} w_i \bar{y}_i$, we have
$$\hat{\bar{Y}}_{creg} = \bar{y}_{st} + \beta(\bar{X} - \bar{x}_{st}).$$
In this case, all the sample information is combined first and then used in the regression
estimator, so $\hat{\bar{Y}}_{creg}$ is termed the combined regression estimator.
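The two approaches differ only in where the regression adjustment is applied, which the following minimal sketch makes explicit. The function names, the dictionary layout of a stratum and all numerical values are assumptions made for this illustration; the coefficients are treated as preassigned.

```python
from statistics import mean

def separate_regression_estimate(strata, betas):
    """Adjust within each stratum (one slope per stratum), then weight by w_i."""
    return sum(
        s["w"] * (mean(s["ys"]) + b * (s["X_bar"] - mean(s["xs"])))
        for s, b in zip(strata, betas)
    )

def combined_regression_estimate(strata, beta):
    """Form the stratified means first, then apply a single slope."""
    x_st = sum(s["w"] * mean(s["xs"]) for s in strata)
    y_st = sum(s["w"] * mean(s["ys"]) for s in strata)
    X_bar = sum(s["w"] * s["X_bar"] for s in strata)  # overall population mean of x
    return y_st + beta * (X_bar - x_st)

# Two hypothetical strata with weights w_i = N_i / N.
strata = [
    {"w": 0.4, "X_bar": 5.0, "xs": [3.0, 4.0, 5.0], "ys": [7.0, 9.0, 11.0]},
    {"w": 0.6, "X_bar": 10.0, "xs": [9.0, 11.0, 13.0], "ys": [20.0, 23.0, 26.0]},
]
sep = separate_regression_estimate(strata, betas=[2.0, 1.5])
comb = combined_regression_estimate(strata, beta=1.5)
print(sep, comb)  # approximately 17.3 and 17.1
```

If the same slope is used in every stratum, the two functions return the same value, since the weighted sum of the stratum adjustments then equals the adjustment of the stratified means.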
Properties of separate and combined regression
In order to derive the mean and variance of $\hat{\bar{Y}}_{sreg}$ and $\hat{\bar{Y}}_{creg}$, there are two cases:
- when $\beta$ is preassigned as $\beta_0$
- when $\beta$ is estimated from the sample.
We consider here the case that $\beta$ is preassigned as $\beta_0$. The other case, when $\beta$ is estimated as $\hat{\beta} = \dfrac{s_{xy}}{s_x^{2}}$,
can be dealt with by the same approach, based on defining various $\epsilon$'s and using the approximation theory
as in the case of $\hat{\bar{Y}}_{reg}$.
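1. Separate regression estimator
Assume $\beta$ is known in the $i$th stratum, say $\beta_{0i}$. Then
$$\hat{\bar{Y}}_{sreg} = \sum_{i=1}^{k} w_i\left[\bar{y}_i + \beta_{0i}(\bar{X}_i - \bar{x}_i)\right]$$
$$\begin{aligned}
E(\hat{\bar{Y}}_{sreg}) &= \sum_{i=1}^{k} w_i\left[E(\bar{y}_i) + \beta_{0i}\left(\bar{X}_i - E(\bar{x}_i)\right)\right] \\
&= \sum_{i=1}^{k} w_i\left[\bar{Y}_i + \beta_{0i}(\bar{X}_i - \bar{X}_i)\right] \\
&= \bar{Y}.
\end{aligned}$$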
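$$\begin{aligned}
Var(\hat{\bar{Y}}_{sreg}) &= E\left[\hat{\bar{Y}}_{sreg} - E(\hat{\bar{Y}}_{sreg})\right]^{2} \\
&= E\left[\sum_{i=1}^{k} w_i \bar{y}_i + \sum_{i=1}^{k} w_i \beta_{0i}(\bar{X}_i - \bar{x}_i) - \bar{Y}\right]^{2} \\
&= E\left[\sum_{i=1}^{k} w_i(\bar{y}_i - \bar{Y}_i) - \sum_{i=1}^{k} w_i \beta_{0i}(\bar{x}_i - \bar{X}_i)\right]^{2} \\
&= \sum_{i=1}^{k} w_i^{2} E(\bar{y}_i - \bar{Y}_i)^{2} + \sum_{i=1}^{k} w_i^{2}\beta_{0i}^{2} E(\bar{x}_i - \bar{X}_i)^{2} - 2\sum_{i=1}^{k} w_i^{2}\beta_{0i} E\left[(\bar{x}_i - \bar{X}_i)(\bar{y}_i - \bar{Y}_i)\right] \\
&= \sum_{i=1}^{k} w_i^{2}\,Var(\bar{y}_i) + \sum_{i=1}^{k} w_i^{2}\beta_{0i}^{2}\,Var(\bar{x}_i) - 2\sum_{i=1}^{k} w_i^{2}\beta_{0i}\,Cov(\bar{x}_i, \bar{y}_i) \\
&= \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(S_{iY}^{2} + \beta_{0i}^{2} S_{iX}^{2} - 2\beta_{0i} S_{iXY}\right).
\end{aligned}$$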
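$Var(\hat{\bar{Y}}_{sreg})$ is minimum when $\beta_{0i} = \dfrac{S_{iXY}}{S_{iX}^{2}}$, and substituting this $\beta_{0i}$, we have
$$V_{min}(\hat{\bar{Y}}_{sreg}) = \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(S_{iY}^{2} - \beta_{0i}^{2} S_{iX}^{2}\right)$$
where $f_i = \dfrac{N_i - n_i}{N_i}$.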
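Since SRSWOR is followed in drawing the samples from each stratum,
$$E(s_{ix}^{2}) = S_{iX}^{2}, \qquad E(s_{iy}^{2}) = S_{iY}^{2}, \qquad E(s_{ixy}) = S_{iXY}.$$
Thus an unbiased estimator of the variance can be obtained by replacing $S_{iX}^{2}$, $S_{iY}^{2}$ and $S_{iXY}$ by their respective
unbiased estimators $s_{ix}^{2}$, $s_{iy}^{2}$ and $s_{ixy}$ as
$$\widehat{Var}(\hat{\bar{Y}}_{sreg}) = \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(s_{iy}^{2} + \beta_{0i}^{2} s_{ix}^{2} - 2\beta_{0i} s_{ixy}\right)$$
and
$$\widehat{Var}_{min}(\hat{\bar{Y}}_{sreg}) = \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(s_{iy}^{2} - \beta_{0i}^{2} s_{ix}^{2}\right).$$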
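2. Combined regression estimator:
Assume $\beta$ is known as $\beta_0$. Then
$$\hat{\bar{Y}}_{creg} = \sum_{i=1}^{k} w_i \bar{y}_i + \beta_0\left(\bar{X} - \sum_{i=1}^{k} w_i \bar{x}_i\right)$$
$$\begin{aligned}
E(\hat{\bar{Y}}_{creg}) &= \sum_{i=1}^{k} w_i E(\bar{y}_i) + \beta_0\left[\bar{X} - \sum_{i=1}^{k} w_i E(\bar{x}_i)\right] \\
&= \sum_{i=1}^{k} w_i \bar{Y}_i + \beta_0\left[\bar{X} - \sum_{i=1}^{k} w_i \bar{X}_i\right] \\
&= \bar{Y} + \beta_0(\bar{X} - \bar{X}) \\
&= \bar{Y}.
\end{aligned}$$
Thus $\hat{\bar{Y}}_{creg}$ is an unbiased estimator of $\bar{Y}$.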
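$$\begin{aligned}
Var(\hat{\bar{Y}}_{creg}) &= E\left[\hat{\bar{Y}}_{creg} - E(\hat{\bar{Y}}_{creg})\right]^{2} \\
&= E\left[\sum_{i=1}^{k} w_i \bar{y}_i + \beta_0\left(\bar{X} - \sum_{i=1}^{k} w_i \bar{x}_i\right) - \bar{Y}\right]^{2} \\
&= E\left[\sum_{i=1}^{k} w_i(\bar{y}_i - \bar{Y}_i) - \beta_0\sum_{i=1}^{k} w_i(\bar{x}_i - \bar{X}_i)\right]^{2} \\
&= \sum_{i=1}^{k} w_i^{2}\,Var(\bar{y}_i) + \beta_0^{2}\sum_{i=1}^{k} w_i^{2}\,Var(\bar{x}_i) - 2\beta_0\sum_{i=1}^{k} w_i^{2}\,Cov(\bar{x}_i, \bar{y}_i) \\
&= \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left[S_{iY}^{2} + \beta_0^{2} S_{iX}^{2} - 2\beta_0 S_{iXY}\right].
\end{aligned}$$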
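$Var(\hat{\bar{Y}}_{creg})$ is minimum when
$$\beta_0 = \frac{Cov(\bar{x}_{st}, \bar{y}_{st})}{Var(\bar{x}_{st})}
= \frac{\displaystyle\sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i} S_{iXY}}{\displaystyle\sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i} S_{iX}^{2}}$$
and the minimum variance is given by
$$Var_{min}(\hat{\bar{Y}}_{creg}) = \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(S_{iY}^{2} - \beta_0^{2} S_{iX}^{2}\right).$$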
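Since SRSWOR is followed to draw the sample from the strata, using $E(s_{ix}^{2}) = S_{iX}^{2}$, $E(s_{iy}^{2}) = S_{iY}^{2}$ and
$E(s_{ixy}) = S_{iXY}$, we get the estimate of variance as
$$\widehat{Var}(\hat{\bar{Y}}_{creg}) = \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(s_{iy}^{2} + \beta_0^{2} s_{ix}^{2} - 2\beta_0 s_{ixy}\right)$$
and
$$\widehat{Var}_{min}(\hat{\bar{Y}}_{creg}) = \sum_{i=1}^{k}\frac{w_i^{2} f_i}{n_i}\left(s_{iy}^{2} - \beta_0^{2} s_{ix}^{2}\right).$$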
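Comparison of $\hat{\bar{Y}}_{sreg}$ and $\hat{\bar{Y}}_{creg}$:
The variance of $\hat{\bar{Y}}_{sreg}$ is minimum when $\beta_{0i} = \dfrac{S_{iXY}}{S_{iX}^{2}}$ for all $i$.
The variance of $\hat{\bar{Y}}_{creg}$ is minimum when $\beta_0 = \dfrac{Cov(\bar{x}_{st}, \bar{y}_{st})}{Var(\bar{x}_{st})} = \beta_0^{*}$.
The minimum variance is $Var(\hat{\bar{Y}}_{creg})_{min} = Var(\bar{y}_{st})(1 - \rho^{*2})$ where $\rho^{*} = \dfrac{Cov(\bar{x}_{st}, \bar{y}_{st})}{\sqrt{Var(\bar{x}_{st})\,Var(\bar{y}_{st})}}$.
With these optimum values,
$$\begin{aligned}
Var(\hat{\bar{Y}}_{creg})_{min} - Var(\hat{\bar{Y}}_{sreg})_{min} &= \sum_{i=1}^{k}(\beta_{0i}^{2} - \beta_0^{2})\frac{w_i^{2} f_i}{n_i} S_{iX}^{2} \\
&= \sum_{i=1}^{k}(\beta_{0i} - \beta_0)^{2}\frac{w_i^{2} f_i}{n_i} S_{iX}^{2} \\
&\ge 0
\end{aligned}$$
which is always true.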
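So if the regression of y on x is approximately linear in each stratum, the separate regression estimator with optimum
coefficients is at least as efficient as the combined regression estimator. The gain is appreciable when the
regression coefficients vary among the strata; when they do not vary much, the two estimators have nearly the
same variance.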
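The comparison of the two minimum variances can be illustrated numerically. The sketch below assumes two hypothetical strata described by $(N_i, n_i, S_{iX}^2, S_{iY}^2, S_{iXY})$ and uses exact rational arithmetic; the variable names are introduced only for this example.

```python
from fractions import Fraction

# Hypothetical strata: (N_i, n_i, S_iX^2, S_iY^2, S_iXY).
strata = [
    (40, 8, Fraction(4), Fraction(20), Fraction(8)),
    (60, 12, Fraction(9), Fraction(16), Fraction(9)),
]
N = sum(s[0] for s in strata)

# a_i = w_i^2 f_i / n_i with w_i = N_i / N and f_i = (N_i - n_i) / N_i
a = [Fraction(Ni, N) ** 2 * Fraction(Ni - ni, Ni) / ni for Ni, ni, *_ in strata]

# Optimum coefficients: one per stratum (separate) and a single pooled one (combined).
beta_sep = [Sxy / Sx2 for _, _, Sx2, _, Sxy in strata]
beta_comb = (
    sum(ai * s[4] for ai, s in zip(a, strata))
    / sum(ai * s[2] for ai, s in zip(a, strata))
)

v_sep = sum(ai * (s[3] - b ** 2 * s[2]) for ai, s, b in zip(a, strata, beta_sep))
v_comb = sum(ai * (s[3] - beta_comb ** 2 * s[2]) for ai, s in zip(a, strata))
gap = sum(ai * (b - beta_comb) ** 2 * s[2] for ai, s, b in zip(a, strata, beta_sep))

print(v_comb - v_sep == gap, gap >= 0)
```

In this assumed example the stratum coefficients are 2 and 1, so the gap is strictly positive; it would be zero if the two strata had the same coefficient.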