Regression Analysis

Chapter 6 focuses on regression analysis, detailing the concepts of linear regression, computation of regression coefficients, and the distinction between correlation and regression. It explains how regression analysis establishes the relationship between dependent and independent variables for prediction purposes. The chapter also introduces the least squares method for estimating regression lines and provides equations for calculating regression coefficients.

CHAPTER 6
REGRESSION ANALYSIS
OBJECTIVES

After studying the material in this chapter, you should be able to:

- Understand the concepts of linear regression analysis.
- Compute regression coefficients and regression lines.
- Know the properties of regression coefficients.
- Apply regression analysis to estimate or predict the values of a dependent variable from known values of an independent variable.
- Distinguish between correlation and regression.

6.1 INTRODUCTION

From our discussion in the previous chapter, we were able to find a way of determining whether or not a relationship existed between two variables. If such a relationship can be expressed by a mathematical formula, we will then be able to use it for the purpose of making predictions. The reliability of any prediction will, of course, depend on the strength of the relationship between the variables included in the formula. Regression analysis attempts to establish the "nature of relationship" between variables, that is, to study the functional relationship between the variables and thereby provide a mechanism for prediction, or forecasting.

A mathematical equation that allows us to predict the value of one variable from known values of one or more other variables is called a regression equation. The variable whose value is to be predicted is called the dependent variable or explained variable. The variables which are used to predict the values of a dependent variable are called independent variables or explanatory variables. Regression analysis confined to the study of only two variables, a dependent variable and an independent variable, is called simple regression analysis. When the relationship between the dependent variable and the independent variable is linear, the technique for prediction is called simple linear regression. In this chapter we shall confine ourselves to the study of simple linear regression analysis. The objective of simple linear regression is to determine the linear relationship between the two variables.

6.2 MEANING OF REGRESSION

The literal or dictionary meaning of the word 'regression' is 'stepping back', 'moving backward' or 'returning to the average value'. It was Sir Francis Galton who first used the term 'regression' as a statistical concept in 1877. He studied the relationship between the heights of fathers and sons, and arrived at some interesting conclusions, which are described below:

(i) Tall fathers have tall sons and short fathers have short sons.
(ii) The average height of the sons of tall fathers is less than the average height of their fathers.
(iii) The average height of the sons of short fathers is more than the average height of their fathers.

In other words, Galton's studies revealed a phenomenon which he described as 'regression to mediocrity': the offspring of abnormally tall or short parents tend to revert or step back to the average height of the population. Regression thus implies going back or returning towards the average. Galton used the term regression as a statistical technique to predict one variable (the height of children) from another variable (the height of parents).

But today the word regression as used in statistics has a much wider perspective without any reference to biometry. Regression analysis, in the general sense, means the estimation or prediction of the unknown value of one variable from the known values of one or more other variables. The variable whose value is to be predicted is called the dependent variable or explained variable. The variables which are used to predict the values of a dependent variable are called independent variables or explanatory variables. In the study done by Galton, mentioned above, the height of the parents was the independent variable and the height of children was the dependent variable.

In the following we shall give some important definitions of the term regression.

1. Regression is the measure of the average relationship between two or more variables in terms of the original units of the data. - M.M. Blair
2. One of the most frequently used techniques in economics and business research, to find a relation between two or more variables that are related causally, is regression analysis. - Taro Yamane
3. The term 'regression analysis' refers to the methods by which estimates are made of the values of a variable from a knowledge of the values of one or more other variables and to the measurement of the errors involved in this estimation process. - Morris Hamburg
4. Regression analysis attempts to establish the nature of the relationship between variables, that is, to study the functional relationship between the variables and thereby provide a mechanism for prediction or forecasting. - Ya Lun-Chou

6.3 LINES OF REGRESSION: THE LEAST SQUARES APPROACH

In this section we consider the problem of estimating or predicting the values of a dependent variable from known values of an independent variable. To make such a prediction, suppose that we have bivariate data consisting of $n$ pairs of observations $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ on two quantitative variables $X$ and $Y$. We assume that $X$ and $Y$ are approximately linearly related and that the data points follow closely a straight line on a scatter diagram (see Fig. 6.1). On this basis we may fit "by eye" a straight line that approximates the given data. This line can then be used to predict a value of $Y$ for a given value of $X$.

[Fig. 6.1: scatter diagram with the line of best fit $Y = a + bX$]

Unfortunately, determining a line by eye is not objective, because many different lines can appear to fit the given data closely (unless the correlation is equal to plus or minus 1). We therefore need to establish a criterion for selecting a line of "best fit". A frequently used criterion is the least squares criterion. According to the least squares criterion, the line of "best fit" is the line that minimizes the sum of the squares of the vertical distances from the observed points to the line.

Thus if a line of best fit approximating the given data has the equation

$$Y = a + bX,$$

then the method of least squares requires that we determine the constants $a$ and $b$ so as to minimize

$$S = (Y_1 - a - bX_1)^2 + (Y_2 - a - bX_2)^2 + \cdots + (Y_n - a - bX_n)^2 = D_1^2 + D_2^2 + \cdots + D_n^2,$$

where $D_i = Y_i - a - bX_i$ represents the vertical deviation of the $i$th observed point from the line of best fit, as indicated in Fig. 6.1. The determination of $a$ and $b$ so as to minimize $S$ can be accomplished by means of differential calculus. We omit the proof and state the final two equations which are used to determine the values of $a$ and $b$. These equations, known as the normal equations for estimating $a$ and $b$, are given by

$$\sum Y = na + b\sum X \tag{1}$$

$$\sum XY = a\sum X + b\sum X^2 \tag{2}$$

Solving Eqs. (1) and (2) simultaneously for $a$ and $b$, we obtain

$$b = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} \quad \text{and} \quad a = \bar{Y} - b\bar{X},$$

where $\bar{X} = \sum X / n$ and $\bar{Y} = \sum Y / n$ are the means of the two variables. Hence the line of best fit approximating the $n$ pairs of observations $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ is

$$Y = a + bX, \tag{3}$$

where

$$b = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}. \tag{4}$$
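
The text omits the proof of the normal equations. As a brief sketch (not the author's own derivation, just the standard calculus argument), setting the partial derivatives of $S$ with respect to $a$ and $b$ equal to zero leads directly to Eqs. (1) and (2):

```latex
\begin{aligned}
S &= \sum_{i=1}^{n} (Y_i - a - bX_i)^2 \\[4pt]
\frac{\partial S}{\partial a} &= -2\sum_{i=1}^{n} (Y_i - a - bX_i) = 0
  \;\Longrightarrow\; \sum Y = na + b\sum X \\[4pt]
\frac{\partial S}{\partial b} &= -2\sum_{i=1}^{n} X_i\,(Y_i - a - bX_i) = 0
  \;\Longrightarrow\; \sum XY = a\sum X + b\sum X^2
\end{aligned}
```

Dividing the first of these equations by $n$ gives $\bar{Y} = a + b\bar{X}$, which is why the fitted line always passes through the point of means.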
The constant $a$ in Equation (3) is given by

$$a = \bar{Y} - b\bar{X}. \tag{5}$$

Equation (3) is called the least squares line of regression of $Y$ on $X$. The constant $b$ is called the regression coefficient of $Y$ on $X$ and is denoted by $b_{YX}$. It measures the change in $Y$ corresponding to a unit change in $X$. Thus $b_{YX}$ represents the slope of the line of regression of $Y$ on $X$ and is given by

$$b_{YX} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}. \tag{6}$$

The formula (5) estimating the value of "a" clearly shows that the line of regression of $Y$ on $X$ passes through the point $(\bar{X}, \bar{Y})$, and hence the equation of the line of regression of $Y$ on $X$ can also be written as

$$Y - \bar{Y} = b_{YX}(X - \bar{X}). \tag{7}$$

This equation is then used to estimate a value of $Y$ for a given value of $X$. On the other hand, if we wish to estimate a value of $X$ for a given value of $Y$, we have to obtain the regression line of $X$ on $Y$:

$$X = c + dY,$$

where the constants $c$ and $d$ are determined according to the least squares criterion. The normal equations for estimating $c$ and $d$ are given by

$$\sum X = nc + d\sum Y$$

$$\sum XY = c\sum Y + d\sum Y^2$$

Solving these normal equations simultaneously for $c$ and $d$, we obtain

$$d = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum Y^2 - \frac{(\sum Y)^2}{n}} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2} \tag{8}$$

$$c = \bar{X} - d\bar{Y} \tag{9}$$

The constant $d$ is called the regression coefficient of $X$ on $Y$ and is denoted by $b_{XY}$. It measures the change in $X$ corresponding to a unit change in $Y$. Clearly, $b_{XY}$ represents the slope of the regression line of $X$ on $Y$. Further, the formula (9) estimating the value of "c" clearly shows that the line of regression of $X$ on $Y$ passes through the point $(\bar{X}, \bar{Y})$, and hence the equation of the line of regression of $X$ on $Y$ can be written as

$$X - \bar{X} = b_{XY}(Y - \bar{Y}), \tag{10}$$

where

$$b_{XY} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sum Y^2 - \frac{(\sum Y)^2}{n}} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2}. \tag{11}$$

Remark 1. It may be remarked that there are always two lines of regression, one of $Y$ on $X$ and the other of $X$ on $Y$. The regression line of $Y$ on $X$ is used to estimate or predict the value of $Y$ from known values of $X$, i.e., $Y$ is a dependent variable and $X$ is an independent variable. According to the least squares criterion, the line of regression of $Y$ on $X$ is the line of "best fit" in the sense that it minimizes the sum of the squares of the vertical distances from the observed points to the line (see Fig. 6.1). However, if we want to estimate or predict the value of $X$ from known values of $Y$, we will use the regression line of $X$ on $Y$, which is the line of "best fit" in the sense that it minimizes the sum of the squares of the horizontal distances from the observed points to the line (see Fig. 6.2). The two regression equations are not reversible, for the simple reason that the basis and assumptions for deriving these equations are quite different. However, it may be remarked that in case of perfect correlation (positive or negative), the two regression lines would coincide.

[Fig. 6.2: scatter diagram with the regression line of $X$ on $Y$, $X = c + dY$]

Remark 2. Since the two lines of regression pass through the point $(\bar{X}, \bar{Y})$, the mean values $(\bar{X}, \bar{Y})$ can be obtained as the point of intersection of the two regression lines.

Example 1. Calculate the regression coefficients from the following information:

$\sum X = 50$, $\sum Y = 30$, $\sum XY = 1000$, $\sum X^2 = 3000$, $\sum Y^2 = 1800$, $n = 10$

Solution. Regression coefficient of $X$ on $Y$:

$$b_{XY} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2} = \frac{10(1000) - (50)(30)}{10(1800) - (30)^2} = \frac{1000 - 150}{1800 - 90} = \frac{850}{1710} = 0.497$$

Regression coefficient of $Y$ on $X$:

$$b_{YX} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{10(1000) - (50)(30)}{10(3000) - (50)^2} = \frac{1000 - 150}{3000 - 250} = \frac{850}{2750} = 0.309$$
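
The arithmetic of Example 1 can be checked with a few lines of Python. This is only an illustrative sketch: the function name `regression_coefficients` and its parameter names are chosen here for convenience and do not come from the text. It applies the formulas for $b_{YX}$ and $b_{XY}$ to the summary totals.

```python
def regression_coefficients(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return (b_yx, b_xy) computed from summary totals.

    b_yx is the regression coefficient of Y on X,
    b_xy is the regression coefficient of X on Y.
    """
    numerator = n * sum_xy - sum_x * sum_y   # shared by both coefficients
    denom_x = n * sum_x2 - sum_x ** 2        # denominator for b_yx
    denom_y = n * sum_y2 - sum_y ** 2        # denominator for b_xy
    if denom_x == 0 or denom_y == 0:
        raise ValueError("X or Y has no variation; coefficient undefined")
    return numerator / denom_x, numerator / denom_y


# Totals given in Example 1
b_yx, b_xy = regression_coefficients(
    n=10, sum_x=50, sum_y=30, sum_xy=1000, sum_x2=3000, sum_y2=1800
)
print(round(b_yx, 3), round(b_xy, 3))  # 0.309 0.497
```

Both coefficients share the same numerator, so only the denominators differ; this is why the two coefficients always have the same sign.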
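Example 2. In the estimation of regression equations of two variables $X$ and $Y$ the following results were obtained:

$\sum X = 900$, $\sum Y = 700$, $\sum X^2 = 6360$, $\sum Y^2 = 2860$, $\sum XY = 3900$, $N = 10$

Obtain the two regression equations. [IP Univ. BBA 2011 (Modified)]

Solution. We have

$$b_{YX} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} = \frac{(10 \times 3900) - (900 \times 700)}{10 \times 6360 - (900)^2} = \frac{39000 - 630000}{63600 - 810000} = \frac{-591000}{-746400} = 0.792$$

$$b_{XY} = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum Y^2 - (\sum Y)^2} = \frac{(10 \times 3900) - (900 \times 700)}{10 \times 2860 - (700)^2} = \frac{39000 - 630000}{28600 - 490000} = \frac{-591000}{-461400} = 1.28$$

$$\bar{X} = \frac{\sum X}{N} = \frac{900}{10} = 90 \quad \text{and} \quad \bar{Y} = \frac{\sum Y}{N} = \frac{700}{10} = 70$$

Regression equation of $Y$ on $X$ is:

$$Y - \bar{Y} = b_{YX}(X - \bar{X})$$

or, $Y - 70 = 0.792(X - 90)$

or, $Y - 70 = 0.792X - 71.28$

i.e., $Y = 0.792X - 1.28$

Regression equation of $X$ on $Y$ is:

$$X - \bar{X} = b_{XY}(Y - \bar{Y})$$

or, $X - 90 = 1.28(Y - 70)$

or, $X - 90 = 1.28Y - 89.6$

i.e., $X = 1.28Y + 0.4$

Note. With the totals as given, both denominators are negative, although $N\sum X^2 - (\sum X)^2$ and $N\sum Y^2 - (\sum Y)^2$ can never be negative for real observations, and the product $b_{YX} \times b_{XY} \approx 1.01$ exceeds 1. The given totals therefore appear to be mutually inconsistent, and the working above illustrates the method only.

Example 3. Find the two lines of regression on the basis of the following data:

X :   1   2   3   4   5   6   7
Y :   2   4   7   6   5   6   5

Solution. Calculation for Regression Lines

  X      Y      XY     X^2    Y^2
  1      2       2       1      4
  2      4       8       4     16
  3      7      21       9     49
  4      6      24      16     36
  5      5      25      25     25
  6      6      36      36     36
  7      5      35      49     25
 ----------------------------------
 28     35     151     140    191   (totals)

$$\bar{X} = \frac{\sum X}{n} = \frac{28}{7} = 4 \quad \text{and} \quad \bar{Y} = \frac{\sum Y}{n} = \frac{35}{7} = 5$$

$$b_{YX} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{7 \times 151 - 28 \times 35}{7 \times 140 - (28)^2} = \frac{151 - 140}{140 - 112} = \frac{11}{28} = 0.39$$

$$b_{XY} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2} = \frac{7 \times 151 - 28 \times 35}{7 \times 191 - (35)^2} = \frac{151 - 140}{191 - 175} = \frac{11}{16} = 0.69$$

Regression equation of $Y$ on $X$ is:

$$Y - \bar{Y} = b_{YX}(X - \bar{X})$$

or, $Y - 5 = 0.39(X - 4)$

i.e., $Y = 3.44 + 0.39X$

Regression equation of $X$ on $Y$ is:

$$X - \bar{X} = b_{XY}(Y - \bar{Y})$$

or, $X - 4 = 0.69(Y - 5)$

i.e., $X = 0.55 + 0.69Y$

Example 4. The following table gives the age of cars of a certain make and annual maintenance costs. Obtain the regression equation for costs related to age:

Age of cars (in years)        :    2    4    6    8
Maintenance cost (in hundred) :   10   20   25   30

Also estimate the annual maintenance cost for a ten-year-old car.

Solution. Let $X$ denote the age of a car and $Y$ denote its annual maintenance cost. Then it is required to find the regression equation of $Y$ on $X$.

Calculation for Regression Equation of Y on X

  X      Y      X^2     Y^2     XY
  2     10       4      100     20
  4     20      16      400     80
  6     25      36      625    150
  8     30      64      900    240
 ----------------------------------
 20     85     120     2025    490   (totals)

$$\bar{X} = \frac{\sum X}{n} = \frac{20}{4} = 5 \quad \text{and} \quad \bar{Y} = \frac{\sum Y}{n} = \frac{85}{4} = 21.25$$

$$b_{YX} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{4 \times 490 - (20)(85)}{4 \times 120 - (20)^2} = \frac{1960 - 1700}{480 - 400} = \frac{260}{80} = 3.25$$

Regression equation of $Y$ on $X$ is given by

$$Y - \bar{Y} = b_{YX}(X - \bar{X})$$

or, $Y - 21.25 = 3.25(X - 5)$

or, $Y - 21.25 = 3.25X - 16.25$

i.e., $Y = 5 + 3.25X$ ... (1)

Substituting $X = 10$ in (1), the estimated annual maintenance cost for a ten-year-old car is:

$Y = 5 + 3.25 \times 10 = 37.5$ hundred, or 3750.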
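
The worked examples on raw data can be reproduced with a short Python sketch. The helper `fit_line` and the variable names below are illustrative assumptions, not part of the chapter. The helper builds the totals from the observations and then applies the least squares formulas for the slope and intercept. Swapping the two arguments gives the regression line of X on Y.

```python
def fit_line(xs, ys):
    """Least squares line ys = a + b * xs; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = sum_y / n - b * (sum_x / n)                               # intercept
    return a, b


# Example 4: age of cars (years) and maintenance cost (in hundred)
ages = [2, 4, 6, 8]
costs = [10, 20, 25, 30]
a4, b4 = fit_line(ages, costs)
estimate = a4 + b4 * 10          # estimated cost for a ten-year-old car
print(a4, b4, estimate)          # 5.0 3.25 37.5

# Example 3: both regression lines from the same data
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2, 4, 7, 6, 5, 6, 5]
a3, b_yx = fit_line(xs, ys)      # Y on X: Y = a3 + b_yx * X
c3, b_xy = fit_line(ys, xs)      # X on Y: X = c3 + b_xy * Y
print(round(b_yx, 2), round(b_xy, 2))
```

Because the slopes are not rounded before the intercepts are computed, the Example 3 intercepts from this sketch (about 3.43 and 0.56) differ slightly from the hand-worked 3.44 and 0.55, which use slopes rounded to two decimal places.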
