Introduction to Linear Regression
September 7, 2017
Regression model
Relation between variables where changes in some variables
may “explain” or possibly “cause” changes in other variables.
Explanatory variables are termed the independent variables
and the variables to be explained are termed the dependent
variables.
A regression model estimates the nature of the relationship
between the independent and dependent variables.
- Change in dependent variables that results from changes in
independent variables.
- Strength of the relationship.
- Statistical significance of the relationship.
Empirical problem: Class size and educational output
Policy question: What is the effect of reducing class
size by one student per class? by 8 students/class?
What is the right output (performance) measure?
parent satisfaction
student personal development
future adult welfare
future adult earnings
performance on standardized tests
What do data say about class sizes and test scores?
The California Test Score Data Set
Variables:
5th grade test scores (Stanford-9 achievement test,
combined math and reading), district average
Student-teacher ratio (STR) = no. of students in the
district divided by the no. of full-time equivalent teachers
An initial look at the California test score data:
Do districts with smaller classes (lower STR) have higher test
scores?
The class size/test score policy question:
What is the effect on test scores of reducing STR by
one student/class?
Object of policy interest:  β1 = ΔTest score / ΔSTR
This is the slope of the line relating test score and STR.
This suggests that we want to draw a line through the
Test Score v. STR scatterplot – but how?
Notation and Terminology
The population regression line:
    Test Score = β0 + β1STR
β1 = slope of population regression line
   = ΔTest score / ΔSTR
   = change in test score for a unit change in STR
Why are β0 and β1 “population” parameters?
We would like to know the population value of β1.
We don’t know β1, so must estimate it using data.
How can we estimate β0 and β1 from data?
Recall that Ȳ was the least squares estimator of μY: Ȳ solves,
    min_m Σ_{i=1}^{n} (Yi − m)²
By analogy, we will focus on the least squares
(“ordinary least squares” or “OLS”) estimator of the
unknown parameters β0 and β1, which solves,
    min_{b0,b1} Σ_{i=1}^{n} [Yi − (b0 + b1Xi)]²
The OLS estimator solves:
    min_{b0,b1} Σ_{i=1}^{n} [Yi − (b0 + b1Xi)]²
The OLS estimator minimizes the average squared
difference between the actual values of Yi and the
prediction (predicted value) based on the estimated line.
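As a check on this definition, here is a minimal sketch in Python with made-up data (not the California data set): the closed-form OLS solution attains a strictly smaller sum of squared residuals than nearby candidate lines.

```python
# Made-up (hypothetical) data for illustration only.
X = [18.0, 19.0, 20.0, 21.0, 22.0]
Y = [661.0, 658.0, 650.0, 645.0, 641.0]
n = len(X)

def ssr(b0, b1):
    """Sum of squared residuals for the candidate line b0 + b1*X."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))

# Closed-form OLS solution (from the first-order conditions).
xbar, ybar = sum(X) / n, sum(Y) / n
b1_ols = (sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
          / sum((x - xbar) ** 2 for x in X))
b0_ols = ybar - b1_ols * xbar

# Any perturbed line does worse: OLS minimizes the SSR.
for db0, db1 in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert ssr(b0_ols + db0, b1_ols + db1) > ssr(b0_ols, b1_ols)
```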
Why use OLS, rather than some other estimator?
OLS is a generalization of the sample average: if the
“line” is just an intercept (no X), then the OLS
estimator is just the sample average of Y1,…,Yn (Ȳ).
Like Ȳ, the OLS estimator has some desirable
properties: under certain assumptions, it is unbiased
(that is, E(β̂1) = β1).
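A quick sketch of the intercept-only special case, with made-up numbers: the least squares minimizer of Σ(Yi − m)² is the sample average.

```python
# Made-up data; the claim holds for any sample.
Y = [657.8, 650.0, 645.2, 661.3]
ybar = sum(Y) / len(Y)

def ssr(m):
    """Sum of squared deviations from a candidate constant m."""
    return sum((y - m) ** 2 for y in Y)

# The sample average beats every other constant.
for dm in (0.5, -0.5, 1.0, -1.0):
    assert ssr(ybar + dm) > ssr(ybar)
```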
Application to the California Test Score – Class Size data
Estimated slope = β̂1 = −2.28
Estimated intercept = β̂0 = 698.9
Estimated regression line: TestScore = 698.9 − 2.28STR
Interpretation of the estimated slope and intercept
TestScore = 698.9 – 2.28STR
Districts with one more student per teacher on average
have test scores that are 2.28 points lower.
That is, ΔTest score / ΔSTR = −2.28
Predicted values & residuals:
One of the districts in the data set is Antelope, CA, for
which STR = 19.33 and Test Score = 657.8
predicted value: Ŷ_Antelope = 698.9 − 2.28×19.33 = 654.8
residual: û_Antelope = 657.8 − 654.8 = 3.0
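The Antelope calculation can be reproduced directly (coefficients and data point taken from the slides):

```python
# Estimated line from the slides: TestScore-hat = 698.9 - 2.28*STR.
str_antelope = 19.33
score_antelope = 657.8

predicted = 698.9 - 2.28 * str_antelope   # Y-hat for Antelope
residual = score_antelope - predicted     # u-hat for Antelope

assert round(predicted, 1) == 654.8
assert round(residual, 1) == 3.0
```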
The OLS regression line is an estimate, computed using
our sample of data; a different sample would have given
a different value of β̂1.
How can we:
  quantify the sampling uncertainty associated with β̂1?
  use β̂1 to test hypotheses such as β1 = 0?
  construct a confidence interval for β1?
Like estimation of the mean, we proceed in four steps:
1. The probability framework for linear regression
2. Estimation
3. Hypothesis Testing
4. Confidence intervals
1. Probability Framework for Linear Regression
The Population Linear Regression Model
    Yi = β0 + β1Xi + ui, i = 1,…, n
X is the independent variable/regressor
Y is the dependent variable
β0 = intercept
β1 = slope
ui = “error term”
The error term consists of omitted factors, or possibly
measurement error in the measurement of Y. In
general, these omitted factors are other factors that
influence Y, other than the variable X.
Ex.: The population regression line and the error term
What are some of the omitted factors in this example?
Data and sampling
The population objects (“parameters”) β0 and β1 are
unknown; so to draw inferences about these unknown
parameters we must collect relevant data.
Simple random sampling:
Choose n entities at random from the population of
interest, and observe (record) X and Y for each entity
Simple random sampling implies that {(Xi, Yi)}, i = 1,…,
n, are independently and identically distributed (i.i.d.).
(Note: (Xi, Yi) are distributed independently of (Xj, Yj) for
different observations i and j.)
The Least Squares Assumptions
1. The conditional distribution of u given X has mean
zero, that is, E(u|X = x) = 0.
2. (Xi,Yi), i =1,…,n, are i.i.d.
3. X and u have finite fourth moments, that is:
   E(X⁴) < ∞ and E(u⁴) < ∞.
Least squares assumption #1: E(u|X = x) = 0.
For any given value of X, the mean of u is zero.
Example: Assumption #1 and the class size example
    Test Scorei = β0 + β1STRi + ui,   ui = other factors
“Other factors:”
parental involvement
outside learning opportunities (extra math class,..)
home environment conducive to reading
family income is a useful proxy for many such factors
So E(u|X=x) = 0 means E(Family Income|STR) = constant
(which implies that family income and STR are
uncorrelated).
Least squares assumption #2:
(Xi,Yi), i = 1,…,n are i.i.d.
This arises automatically if the entity (individual, district)
is sampled by simple random sampling: the entity is
selected then, for that entity, X and Y are observed
(recorded).
The main place we will encounter non-i.i.d. sampling is
when data are recorded over time (“time series data”) –
this will introduce some extra complications.
Least squares assumption #3:
    E(X⁴) < ∞ and E(u⁴) < ∞
Because Yi = β0 + β1Xi + ui, assumption #3 can
equivalently be stated as, E(X⁴) < ∞ and E(Y⁴) < ∞.
Assumption #3 is generally plausible. A finite domain of
the data implies finite fourth moments.
1. The probability framework for linear regression
2. Estimation: the Sampling Distribution of β̂1
3. Hypothesis Testing
4. Confidence intervals
Like Ȳ, β̂1 has a sampling distribution.
  What is E(β̂1)?
  What is var(β̂1)? (measure of sampling uncertainty)
  What is its sampling distribution in small samples?
  What is its sampling distribution in large samples?
The sampling distribution of β̂1
    Yi = β0 + β1Xi + ui
    Ȳ = β0 + β1X̄ + ū
so Yi − Ȳ = β1(Xi − X̄) + (ui − ū)
Thus,
    β̂1 = Σ_{i=1}^{n} (Xi − X̄)(Yi − Ȳ) / Σ_{i=1}^{n} (Xi − X̄)²
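The slope formula above can be sketched in Python on made-up data; the resulting residuals satisfy the OLS first-order conditions (they sum to zero and are uncorrelated with X).

```python
# Made-up data for illustration.
X = [17.0, 19.0, 20.0, 22.0, 23.0]
Y = [664.0, 655.0, 652.0, 646.0, 640.0]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

# b1_hat = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
num = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
den = sum((x - xbar) ** 2 for x in X)
b1_hat = num / den
b0_hat = ybar - b1_hat * xbar

# The OLS residuals satisfy the two normal equations.
resid = [y - (b0_hat + b1_hat * x) for x, y in zip(X, Y)]
assert abs(sum(resid)) < 1e-6                          # sum to zero
assert abs(sum(r * x for r, x in zip(resid, X))) < 1e-6  # orthogonal to X
```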
E(β̂1) = β1
That is, β̂1 is an unbiased estimator of β1.
(1) E(β̂1) = β1, and β̂1 →p β1
(2) When n is large, the sampling distribution of β̂1 is
well approximated by a normal distribution
Large-n approximation to the distribution of β̂1:
Recall the summary of the sampling distribution of Ȳ:
For (Y1,…,Yn) i.i.d. with 0 < σ²_Y < ∞,
  The exact (finite sample) sampling distribution of Ȳ
  has mean μY (“Ȳ is an unbiased estimator of μY”) and
  variance σ²_Y/n
  Ȳ →p μY (law of large numbers)
  (Ȳ − E(Ȳ)) / √var(Ȳ) is approximately distributed N(0,1)
Parallel conclusions hold for the OLS estimator β̂1:
Under the three Least Squares Assumptions,
  The exact sampling distribution of β̂1 has mean β1
  (“β̂1 is an unbiased estimator of β1”), and var(β̂1) is
  inversely proportional to n.
  β̂1 →p β1 (law of large numbers)
  (β̂1 − E(β̂1)) / √var(β̂1) is approximately distributed N(0,1)
1. The probability framework for linear regression
2. Estimation
3. Hypothesis Testing
4. Confidence intervals
Suppose a skeptic suggests that reducing the number of
students in a class has no effect on learning or,
specifically, test scores. The skeptic thus asserts the
hypothesis,
    H0: β1 = 0
We wish to test this hypothesis using data – reach a
tentative conclusion whether it is correct or incorrect.
Null hypothesis and two-sided alternative:
    H0: β1 = 0 vs. H1: β1 ≠ 0
or, more generally,
    H0: β1 = β1,0 vs. H1: β1 ≠ β1,0
where β1,0 is the hypothesized value under the null.
Null hypothesis and one-sided alternative:
    H0: β1 = β1,0 vs. H1: β1 < β1,0
An effect could “go either way,” so it is standard to focus
on two-sided alternatives.
Recall hypothesis testing for population mean using Ȳ:
    t = (Ȳ − μY,0) / (sY/√n)
then reject the null hypothesis if |t| > 1.96.
Applied to a hypothesis about β1:
    t = (estimator − hypothesized value) / (standard error of the estimator)
so
    t = (β̂1 − β1,0) / SE(β̂1)
where β1,0 is the value of β1 hypothesized under the null
(for example, if the null value is zero, then β1,0 = 0)
What is SE(β̂1)?
SE(β̂1) = the square root of an estimator of the
variance of the sampling distribution of β̂1
The calculation of the t-statistic:
    t = (β̂1 − β1,0) / SE(β̂1) = (β̂1 − β1,0) / √σ̂²_β̂1
Reject at the 5% significance level if |t| > 1.96
The p-value is p = Pr[|t| > |t^act|] = probability in the
tails of the normal distribution outside |t^act|
Both of the previous statements are based on the large-n
approximation; typically n = 50 is large enough for
the approximation to be excellent.
Example: Test Scores and STR, California data
Estimated regression line: TestScore = 698.9 − 2.28STR
Regression software reports the standard errors:
    SE(β̂0) = 10.4    SE(β̂1) = 0.52
t-statistic testing β1,0 = 0:
    t = (β̂1 − β1,0) / SE(β̂1) = (−2.28 − 0)/0.52 = −4.38
The 1% two-sided critical value is 2.58, so we reject
the null at the 1% significance level.
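The test can be reproduced from the reported estimates; the two-sided p-value uses the standard normal via the error function (large-n approximation).

```python
import math

# Estimates from the slides.
b1_hat, se_b1 = -2.28, 0.52
t = (b1_hat - 0) / se_b1   # t-statistic for H0: beta1 = 0

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p_value = 2 * (1 - phi(abs(t)))   # two-sided p-value

assert round(t, 2) == -4.38
assert abs(t) > 2.58    # reject even at the 1% level
assert p_value < 0.001
```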
Alternatively, we can compute the p-value…
1. The probability framework for linear regression
2. Estimation
3. Hypothesis Testing
4. Confidence intervals
In general, if the sampling distribution of an estimator is
normal for large n, then a 95% confidence interval can be
constructed as estimator ± 1.96×standard error.
So: a 95% confidence interval for β1 is,
    {β̂1 ± 1.96×SE(β̂1)}
Example: Test Scores and STR, California data
Estimated regression line: TestScore = 698.9 – 2.28STR
    SE(β̂0) = 10.4    SE(β̂1) = 0.52
95% confidence interval for β1:
    {β̂1 ± 1.96×SE(β̂1)} = {−2.28 ± 1.96×0.52}
                        = (−3.30, −1.26)
Equivalent statements:
  The 95% confidence interval does not include zero;
  The hypothesis β1 = 0 is rejected at the 5% level.
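The interval can be reproduced from the reported numbers:

```python
# Estimates from the slides.
b1_hat, se_b1 = -2.28, 0.52

lo = b1_hat - 1.96 * se_b1
hi = b1_hat + 1.96 * se_b1

assert round(lo, 2) == -3.3    # i.e. -3.30
assert round(hi, 2) == -1.26
assert not (lo <= 0.0 <= hi)   # zero lies outside the interval
```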
A convention for reporting estimated regressions:
Put standard errors in parentheses below the estimates
    TestScore = 698.9 − 2.28STR
               (10.4)   (0.52)
This expression means that:
  The estimated regression line is TestScore = 698.9 − 2.28STR
  The standard error of β̂0 is 10.4
  The standard error of β̂1 is 0.52
Regression when X is Binary
Sometimes a regressor is binary:
X = 1 if female, = 0 if male
X = 1 if treated (experimental drug), = 0 if not
X = 1 if small class size, = 0 if not
So far, β1 has been called a “slope,” but that doesn’t
make much sense if X is binary.
How do we interpret regression with a binary regressor?
Yi = β0 + β1Xi + ui, where X is binary (Xi = 0 or 1):
  When Xi = 0: Yi = β0 + ui
  When Xi = 1: Yi = β0 + β1 + ui
thus:
  When Xi = 0, the mean of Yi is β0
  When Xi = 1, the mean of Yi is β0 + β1
that is:
  E(Yi|Xi = 0) = β0
  E(Yi|Xi = 1) = β0 + β1
so:
  β1 = E(Yi|Xi = 1) − E(Yi|Xi = 0)
     = population difference in group means
Example: TestScore and STR, California data
Let
    Di = 1 if STRi < 20
       = 0 if STRi ≥ 20
The OLS estimate of the regression line relating
TestScore to D (with standard errors in parentheses) is:
    TestScore = 650.0 + 7.4D
               (1.3)    (1.8)
Difference in means between groups = 7.4;
SE = 1.8, so t = 7.4/1.8 ≈ 4.1
Compare the regression results with the group means,
computed directly:

Class Size         Average score (Ȳ)   Std. dev. (sY)   N
Small (STR < 20)   657.4               19.4             238
Large (STR ≥ 20)   650.0               17.9             182

Estimation: Ȳsmall − Ȳlarge = 657.4 − 650.0 = 7.4
Test H0 (difference in means = 0):
    t = (Ȳs − Ȳl) / SE(Ȳs − Ȳl) = 7.4/1.83 = 4.05
95% confidence interval = {7.4 ± 1.96×1.83} = (3.8, 11.0)
This is the same as in the regression!
    TestScore = 650.0 + 7.4D
               (1.3)    (1.8)
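That equivalence can be sketched with made-up data: regressing Y on a binary D reproduces the group means exactly.

```python
# Made-up data: D = 1 for "small class" units, 0 otherwise.
D = [1, 1, 1, 0, 0, 0, 0]
Y = [660.0, 655.0, 662.0, 648.0, 651.0, 650.0, 647.0]
n = len(D)

mean1 = sum(y for d, y in zip(D, Y) if d == 1) / D.count(1)
mean0 = sum(y for d, y in zip(D, Y) if d == 0) / D.count(0)

# OLS slope and intercept with X = D.
dbar, ybar = sum(D) / n, sum(Y) / n
b1 = (sum((d - dbar) * (y - ybar) for d, y in zip(D, Y))
      / sum((d - dbar) ** 2 for d in D))
b0 = ybar - b1 * dbar

assert abs(b0 - mean0) < 1e-9            # intercept = mean of group 0
assert abs(b1 - (mean1 - mean0)) < 1e-9  # slope = difference in means
```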
Summary: regression when Xi is binary (0/1)
    Yi = β0 + β1Xi + ui
  β0 = mean of Y given that X = 0
  β0 + β1 = mean of Y given that X = 1
  β1 = difference in group means, X = 1 minus X = 0
  SE(β̂1) has the usual interpretation
  t-statistics, confidence intervals constructed as usual
  This is another way to do difference-in-means analysis
Other Regression Statistics
A natural question is how well the regression line “fits”
or explains the data. There are two regression statistics
that provide complementary measures of the quality of
fit:
  The regression R² measures the fraction of the
  variance of Y that is explained by X; it is unitless and
  ranges between zero (no fit) and one (perfect fit)
  The standard error of the regression measures the fit
  – the typical size of a regression residual – in the units
  of Y.
The R²
Write Yi as the sum of the OLS prediction + OLS
residual:
    Yi = Ŷi + ûi
The R² is the fraction of the sample variance of Yi
“explained” by the regression, that is, by Ŷi:
    R² = ESS/TSS,
where ESS = Σ_{i=1}^{n} (Ŷi − Ȳ)² and TSS = Σ_{i=1}^{n} (Yi − Ȳ)²
(the sample average of Ŷi equals Ȳ).
    R² = ESS/TSS, where ESS = Σ_{i=1}^{n} (Ŷi − Ȳ)² and TSS = Σ_{i=1}^{n} (Yi − Ȳ)²
The R²:
  R² = 0 means ESS = 0, so X explains none of the
  variation of Y
  R² = 1 means ESS = TSS, so Yi = Ŷi and X explains all of
  the variation of Y
  0 ≤ R² ≤ 1
  For regression with a single regressor (the case here),
  R² is the square of the correlation coefficient between
  X and Y
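A sketch on made-up data: R² computed as ESS/TSS lies in [0, 1] and equals the squared sample correlation between X and Y in the single-regressor case.

```python
# Made-up data for illustration.
X = [15.0, 17.0, 19.0, 21.0, 23.0, 25.0]
Y = [670.0, 655.0, 660.0, 645.0, 650.0, 635.0]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
      / sum((x - xbar) ** 2 for x in X))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * x for x in X]

ess = sum((yh - ybar) ** 2 for yh in yhat)  # mean of yhat equals ybar
tss = sum((y - ybar) ** 2 for y in Y)
r2 = ess / tss

# Squared sample correlation between X and Y.
sx = sum((x - xbar) ** 2 for x in X) ** 0.5
sy = sum((y - ybar) ** 2 for y in Y) ** 0.5
corr = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / (sx * sy)

assert 0.0 <= r2 <= 1.0
assert abs(r2 - corr ** 2) < 1e-9
```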
The Standard Error of the Regression (SER)
The standard error of the regression is (almost) the
sample standard deviation of the OLS residuals:
    SER = √[ (1/(n−2)) Σ_{i=1}^{n} (ûi − mean(û))² ]
        = √[ (1/(n−2)) Σ_{i=1}^{n} ûi² ]
(the second equality holds because the OLS residuals ûi
have mean zero).
    SER = √[ (1/(n−2)) Σ_{i=1}^{n} ûi² ]
The SER:
  has the units of u, which are the units of Y
  measures the spread of the distribution of u
  measures the average “size” of the OLS residual (the
  average “mistake” made by the OLS regression line)
The root mean squared error (RMSE) is closely
related to the SER:
    RMSE = √[ (1/n) Σ_{i=1}^{n} ûi² ]
This measures the same thing as the SER; the only
difference is division by n instead of n − 2.
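A sketch on made-up data: SER and RMSE share the same sum of squared residuals and differ only in the divisor (n − 2 vs. n).

```python
import math

# Made-up data for illustration.
X = [18.0, 19.0, 20.0, 21.0, 22.0]
Y = [661.0, 658.0, 650.0, 645.0, 641.0]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
      / sum((x - xbar) ** 2 for x in X))
b0 = ybar - b1 * xbar
resid = [y - (b0 + b1 * x) for x, y in zip(X, Y)]

ser = math.sqrt(sum(u ** 2 for u in resid) / (n - 2))
rmse = math.sqrt(sum(u ** 2 for u in resid) / n)

assert ser > rmse > 0   # same sum of squares, smaller divisor for SER
```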
    TestScore = 698.9 − 2.28STR,  R² = .05, SER = 18.6
               (10.4)   (0.52)
The slope coefficient is statistically significant and large
in a policy sense, even though STR explains only a small
fraction of the variation in test scores.
Heteroskedasticity and Homoskedasticity
What do these two terms mean?
Consequences of homoskedasticity
Implication for computing standard errors
What do these two terms mean?
If var(u|X=x) is constant – that is, the variance of the
conditional distribution of u given X does not depend on
X – then u is said to be homoskedastic. Otherwise, u is
said to be heteroskedastic.
Homoskedasticity in a picture:
E(u|X=x) = 0 (u satisfies Least Squares Assumption #1)
The variance of u does not change with (depend on) x
Heteroskedasticity in a picture:
E(u|X=x) = 0 (u satisfies Least Squares Assumption #1)
The variance of u depends on x – so u is
heteroskedastic.
A real-world example of heteroskedasticity: average
hourly earnings vs. years of education (data source:
1999 Current Population Survey).
[Figure: scatterplot and OLS regression line, average
hourly earnings (0–60) against years of education (5–20)]
Is heteroskedasticity present in the class size data?
Hard to say…looks nearly homoskedastic, but the spread
might be tighter for large values of STR.
So far we have (without saying so) assumed that u is
heteroskedastic:
Recall the three least squares assumptions:
1. The conditional distribution of u given X has mean
zero, that is, E(u|X = x) = 0.
2. (Xi,Yi), i =1,…,n, are i.i.d.
3. X and u have finite fourth moments.
Heteroskedasticity and homoskedasticity concern
var(u|X=x). Because we have not explicitly assumed
homoskedastic errors, we have implicitly allowed for
heteroskedasticity.
What if the errors are in fact homoskedastic?
  Then OLS is the estimator with the lowest variance
  among all unbiased estimators that are linear functions
  of (Y1,…,Yn) (the Gauss–Markov theorem).
Homoskedasticity-only standard errors are the
default setting in regression software – sometimes the
only setting (e.g. Excel). To get the general
“heteroskedasticity-robust” standard errors you must
override the default.
If you don’t override the default and there is in fact
heteroskedasticity, you will get the wrong standard errors
(and wrong t-statistics and confidence intervals).
The critical points:
  If the errors are homoskedastic and you use the
  heteroskedasticity-robust formula for standard errors,
  you are OK.
  If the errors are heteroskedastic and you use the
  homoskedasticity-only formula for standard errors,
  the standard errors are wrong.
  The two formulas coincide (when n is large) in the
  special case of homoskedasticity.
The bottom line: you should always use the
heteroskedasticity-robust formulas; these are
conventionally called the heteroskedasticity-robust
standard errors.
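As an illustration only (made-up data, not the California data set), here is a sketch of both standard-error formulas for the slope. The robust variance below uses the common Eicker–Huber–White form with an n/(n − 2) correction; the exact finite-sample variant is an assumption, not something stated above.

```python
import math

# Made-up data for illustration.
X = [14.0, 16.0, 18.0, 20.0, 22.0, 24.0, 26.0]
Y = [668.0, 663.0, 652.0, 655.0, 644.0, 648.0, 633.0]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

dx = [x - xbar for x in X]
sxx = sum(d ** 2 for d in dx)
b1 = sum(d * (y - ybar) for d, y in zip(dx, Y)) / sxx
b0 = ybar - b1 * xbar
u = [y - (b0 + b1 * x) for x, y in zip(X, Y)]   # OLS residuals

# Homoskedasticity-only SE: s_u^2 / sum of squared X deviations.
s_u2 = sum(ui ** 2 for ui in u) / (n - 2)
se_homo = math.sqrt(s_u2 / sxx)

# Heteroskedasticity-robust SE (Eicker-Huber-White, n/(n-2) correction).
var_robust = (n / (n - 2)) * sum(d ** 2 * ui ** 2
                                 for d, ui in zip(dx, u)) / sxx ** 2
se_robust = math.sqrt(var_robust)

# Both are positive here; only the robust formula remains valid
# when var(u|X=x) depends on x.
assert se_homo > 0 and se_robust > 0
```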