STAC67H3: Regression Analysis
Fall, 2014
Instructor: Jabed Tomal
Department of Computer and Mathematical Sciences
University of Toronto Scarborough
Toronto, ON
Canada
December 1, 2014
Jabed Tomal (U of T) Regression Analysis December 1, 2014 1/7
Logistic Regression
Binary Response Variable
1 In a study of coronary heart disease as a function of age, gender,
smoking history, cholesterol level, percent of ideal body weight,
and blood pressure, the response Y is defined to have two
possible outcomes: person developed heart disease, person did
not develop heart disease. The outcome is binary and may be
coded as 1 (developed heart disease) and 0 (did not develop
heart disease).
Logistic Regression
Binary Response Variable
1 The response $Y_i$ is Bernoulli with probability distribution
\[
P(Y_i = y) =
\begin{cases}
\pi_i, & \text{if } y = 1 \\
1 - \pi_i, & \text{if } y = 0
\end{cases}
\]
2 The expectation of $Y_i$ is
\[
E(Y_i) = \pi_i = (?)\ \beta_0 + \beta_1 X_i
\]
Logistic Regression
Binary Response Variable
1 The regression model is obtained by assuming that the $Y_i$ are
independent Bernoulli random variables with expected value $\pi_i$,
which is a non-linear function of the explanatory variable $X_i$ of
the following form:
\[
\pi_i = \frac{\exp\{\beta_0 + \beta_1 X_i\}}{1 + \exp\{\beta_0 + \beta_1 X_i\}}.
\]
2 This gives
\[
1 - \pi_i = \frac{1}{1 + \exp\{\beta_0 + \beta_1 X_i\}}.
\]
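As an illustration of the two formulas on this slide, here is a minimal Python sketch that evaluates $\pi_i$ and $1 - \pi_i$ for a given $X_i$. The function name `success_probability`, the coefficient values, and the ages are assumptions chosen only for the example; they do not come from the course material. The function branches on the sign of the linear predictor so that `math.exp` is never called with a large positive argument.

```python
import math

def success_probability(beta0, beta1, x):
    """Return pi = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))."""
    eta = beta0 + beta1 * x  # linear predictor
    if eta >= 0:
        # Algebraically identical form that avoids overflow for large eta
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

# Hypothetical coefficients and ages, for illustration only
beta0, beta1 = -5.0, 0.1
for age in (20, 50, 80):
    pi = success_probability(beta0, beta1, age)
    # pi and 1 - pi always lie between 0 and 1 and sum to 1
    print(age, round(pi, 4), round(1.0 - pi, 4))
```

Whatever values the coefficients take, the result stays between 0 and 1, which is what a probability requires.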
Logistic Regression
Likelihood Function
1 The probability function for the $i$th observation $Y_i$ is
\[
P(Y_i \mid \beta_0, \beta_1) = \pi_i^{Y_i} (1 - \pi_i)^{1 - Y_i}
\]
2 Since the observations are assumed independent, the joint
probability function is obtained as
\[
P(Y_1, Y_2, \ldots, Y_n \mid \beta_0, \beta_1)
= \prod_{i=1}^{n} P(Y_i \mid \beta_0, \beta_1)
= \prod_{i=1}^{n} \pi_i^{Y_i} (1 - \pi_i)^{1 - Y_i}
\]
Logistic Regression
Likelihood Function
1 The likelihood function is obtained by considering the joint
probability function as a function of $\beta_0$ and $\beta_1$ given the
observations $Y_1, Y_2, \ldots, Y_n$, as follows:
\[
L(\beta_0, \beta_1 \mid Y_1, Y_2, \ldots, Y_n)
= \prod_{i=1}^{n} \pi_i^{Y_i} (1 - \pi_i)^{1 - Y_i}
\]
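The likelihood can be evaluated directly as a product over the observations. The sketch below is only an illustration: the function name `likelihood` and the small data set (`xs`, `ys`) are made up for the example and are not taken from the heart disease study.

```python
import math

def likelihood(beta0, beta1, xs, ys):
    """Product over i of pi_i^y_i * (1 - pi_i)^(1 - y_i)."""
    value = 1.0
    for x, y in zip(xs, ys):
        pi = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        value *= pi ** y * (1.0 - pi) ** (1 - y)
    return value

# Made-up data: x could be age, y = 1 if the event occurred
xs = [25, 35, 45, 55, 65]
ys = [0, 0, 1, 0, 1]

# With beta0 = beta1 = 0 every pi_i is 0.5, so L = 0.5 ** 5
print(likelihood(0.0, 0.0, xs, ys))   # 0.03125
# A different (hypothetical) parameter pair gives a different likelihood
print(likelihood(-4.5, 0.1, xs, ys))
```

Comparing the two printed values shows the idea behind maximum likelihood: parameter values under which the observed data are more probable receive a larger likelihood.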
Logistic Regression
Likelihood Function
1 The log-likelihood function is obtained by taking the logarithm of
both sides of the likelihood function:
\[
\log L(\beta_0, \beta_1 \mid Y_1, Y_2, \ldots, Y_n)
= \sum_{i=1}^{n} \left[ Y_i \log \pi_i + (1 - Y_i) \log(1 - \pi_i) \right]
\]
where $\pi_i$ is a function of $\beta_0$ and $\beta_1$.
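A direct transcription of the log-likelihood sum into Python might look like the following sketch. The function name `log_likelihood` and the data are assumptions made for illustration. Working with a sum of logarithms rather than a product of probabilities also avoids numerical underflow when the number of observations is large.

```python
import math

def log_likelihood(beta0, beta1, xs, ys):
    """Sum over i of y_i*log(pi_i) + (1 - y_i)*log(1 - pi_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        pi = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        total += y * math.log(pi) + (1 - y) * math.log(1.0 - pi)
    return total

# Made-up data, for illustration only
xs = [25, 35, 45, 55, 65]
ys = [0, 0, 1, 0, 1]

# With beta0 = beta1 = 0 every pi_i is 0.5, so the value is 5 * log(0.5)
print(log_likelihood(0.0, 0.0, xs, ys))
print(log_likelihood(-4.5, 0.1, xs, ys))
```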
2 The estimators of the parameters $\beta_0$ and $\beta_1$ are obtained by
maximizing the log-likelihood function jointly with respect to $\beta_0$
and $\beta_1$. No closed-form solution exists, so estimating the
parameters requires an iterative procedure and computer software.
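The slides do not say which iterative procedure is used. One common choice is Newton-Raphson, and the sketch below shows it for the two-parameter model under that assumption. The function name `fit_logistic_newton`, the starting values, the stopping rule, and the data are all hypothetical. The sketch assumes the data are not perfectly separated by X; if they were, the maximum would not exist and the iteration would not settle.

```python
import math

def fit_logistic_newton(xs, ys, max_iter=25, tol=1e-10):
    """Maximize the log-likelihood over (beta0, beta1) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0  # assumed starting values
    for _ in range(max_iter):
        g0 = g1 = 0.0             # score: derivatives of log L
        h00 = h01 = h11 = 0.0     # information: minus second derivatives
        for x, y in zip(xs, ys):
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = pi * (1.0 - pi)
            g0 += y - pi
            g1 += (y - pi) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system (information) * step = score
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Made-up, non-separable data, for illustration only
xs = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
b0_hat, b1_hat = fit_logistic_newton(xs, ys)
print(b0_hat, b1_hat)
```

In practice, statistical software performs this kind of iteration internally; the hand-written loop is meant only to show what "iterative procedure" can look like.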