Econometrics Slides Set1

The document outlines a course on Econometrics taught by Prof. Dr. Salmai Qari at Hochschule für Wirtschaft und Recht Berlin for the Summer term 2025. It introduces the Econometric Society, the integration of economic theory with statistical methods, and provides examples of questions suitable for econometric analysis. The course aims to equip students with skills to understand and assess empirical studies, covering topics such as data types, causal relationships, and regression models.

Econometrics

Econometrics
Prof. Dr. Salmai Qari

Hochschule für Wirtschaft und Recht Berlin


Summer term 2025

Qari Summer 2025



Econometrics

Econometrics?

Econometric Society (established in 1930):


The Econometric Society is an international society for the advancement of economic theory in its relation to statistics and mathematics. The Society shall operate as a completely disinterested scientific organization without political, social, financial, or nationalistic bias. Its main object shall be to promote studies that aim at a unification of the theoretical-quantitative and empirical-quantitative approach to economic problems and that are penetrated by constructive and rigorous thinking similar to that which has come to dominate in the natural sciences. Any activity which promises ultimately to further such unification of theoretical and factual studies in economics shall be within the sphere of interest of the Society.





Econometrics

Econometrics?

Econometrics: Economics + Metrics


Econometric Society
Linking economic theory / reasoning with statistical methods
Further example: Biometrics
International Biometric Society (established in 1947)
First President of the International Biometric Society: Ronald Fisher
George W. Snedecor named the F-test after Fisher
F-distribution / ‘Snedecor’s F distribution’ /
‘Fisher-Snedecor F-distribution’




Econometrics

Why choose this module?

Some examples of questions suitable for econometric analysis:

What is the effect of smoking bans in restaurants and bars on the businesses’ revenue and employment? (Causal effect)
What is the effect of a minimum wage on employment? (Causal effect)
What will the unemployment rate be next year? (Forecasting)
What is the likelihood that a client of a financial institution will not be able to repay the loan? (Forecasting)


Econometrics

Aim of the course

Acquire the necessary skills to understand and assess empirical studies


The skills are also very valuable for almost any policy discussion
One recent example in the area of health¹:
“Eating breakfast was associated with significantly lower CHD (Coronary Heart Disease) risk in this cohort of male health professionals”
There are several ‘usual’ problems in this study, in particular the difficulty of controlling for unobserved factors when comparing those who have breakfast with those who skip it

¹ Cahill et al., Prospective Study of Breakfast Eating and Incident Coronary Heart Disease in a Cohort of Male US Health Professionals, Circulation. 2013; 128: 337-343. doi: 10.1161/CIRCULATIONAHA.113.001474
Econometrics

Outline

Introduction (Wooldridge Ch. 1)


The simple linear model (Wooldridge Ch. 2)
The multiple linear regression model:
Estimation and inference (Wooldridge Ch. 3/4)
Functional form (e.g. logarithmic equations, Wooldridge Ch. 2/6)
Qualitative Information (e.g. dummy variables, Wooldridge Ch. 7)
Heteroscedasticity, Specification issues (Wooldridge Ch. 8,9)



The Nature of Econometrics
and Economic Data

Chapter 1

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e

Wooldridge (2013), Chapter 1. 1


The Nature of Econometrics
and Economic Data
What is econometrics?

Econometrics = use of statistical methods to analyze economic data

Econometricians typically analyze nonexperimental data

Typical goals of econometric analysis

Estimating relationships between economic variables

Testing economic theories and hypotheses

Forecasting economic variables

Evaluating and implementing government and business policy

Wooldridge (2013), Chapter 1. 2


The Nature of Econometrics
and Economic Data
Steps in econometric analysis

1) Economic model (this step is often skipped)

2) Econometric model

Economic models

May be micro- or macromodels

Often use optimizing behaviour, equilibrium modeling, …

Establish relationships between economic variables

Examples: demand equations, pricing equations, …

Wooldridge (2013), Chapter 1. 3


The Nature of Econometrics
and Economic Data
Model of job training and worker productivity
What is effect of additional training on worker productivity?
Formal economic theory not really needed to derive the equation:

wage = f(educ, exper, training)

where wage = hourly wage, educ = years of formal education, exper = years of workforce experience, training = weeks spent in job training

Other factors may be relevant, but these are the most important (?)

Wooldridge (2013), Chapter 1. 4


The Nature of Econometrics
and Economic Data
Econometric model of job training and worker productivity

wage = β₀ + β₁·educ + β₂·exper + β₃·training + u

where the error term u captures unobserved determinants of the wage, e.g. innate ability, quality of education, family background, …

Most of econometrics deals with the specification of the error

Econometric models may be used for hypothesis testing

For example, the parameter β₃ represents the effect of training on the wage

How large is this effect? Is it different from zero?

Wooldridge (2013), Chapter 1. 5


The Nature of Econometrics
and Economic Data
Econometric analysis requires data

Different kinds of economic data sets

Cross-sectional data

Time series data

Pooled cross sections

Panel/Longitudinal data

Econometric methods depend on the nature of the data used

Use of inappropriate methods may lead to misleading results

Wooldridge (2013), Chapter 1. 6


The Nature of Econometrics
and Economic Data
Cross-sectional data sets

Sample of individuals, households, firms, cities, states, countries,

or other units of interest at a given point of time/in a given period

Cross-sectional observations are more or less independent

For example, pure random sampling from a population

Sometimes pure random sampling is violated, e.g. units refuse to

respond in surveys, or if sampling is characterized by clustering

Cross-sectional data typically encountered in applied microeconomics

Wooldridge (2013), Chapter 1. 7


The Nature of Econometrics
and Economic Data
Time series data
Observations of a variable or several variables over time

For example, stock prices, money supply, consumer price index,


gross domestic product, annual homicide rates, automobile sales, …

Time series observations are typically serially correlated

Ordering of observations conveys important information

Data frequency: daily, weekly, monthly, quarterly, annually, …

Typical features of time series: trends and seasonality

Typical applications: applied macroeconomics and finance

Wooldridge (2013), Chapter 1. 8


The Nature of Econometrics
and Economic Data
Pooled cross sections
Two or more cross sections are combined in one data set

Cross sections are drawn independently of each other

Pooled cross sections often used to evaluate policy changes

Example:

• Evaluate effect of change in property taxes on house prices

• Random sample of house prices for the year 1993

• A new random sample of house prices for the year 1995

• Compare before/after (1993: before reform, 1995: after reform)

Wooldridge (2013), Chapter 1. 9


The Nature of Econometrics
and Economic Data
Panel or longitudinal data
The same cross-sectional units are followed over time

Panel data have a cross-sectional and a time series dimension

Panel data can be used to account for time-invariant unobservables

Panel data can be used to model lagged responses

Example:

• City crime statistics; each city is observed in two years

• Time-invariant unobserved city characteristics may be modeled

• Effect of police on crime rates may exhibit time lag

Wooldridge (2013), Chapter 1. 10


The Nature of Econometrics
and Economic Data
Two-year panel data on city crime statistics

Each city has two time series observations, e.g. the number of police in 1986 and the number of police in 1990

Wooldridge (2013), Chapter 1. 11


The Nature of Econometrics
and Economic Data
Causality and the notion of ceteris paribus

Definition of the causal effect of x on y:

"How does variable y change if variable x is changed
but all other relevant factors are held constant?"

Most economic questions are ceteris paribus questions

It is important to define which causal effect one is interested in

It is useful to describe how an experiment would have to be


designed to infer the causal effect in question

Wooldridge (2013), Chapter 1. 12


The Nature of Econometrics
and Economic Data
Causal effect of fertilizer on crop yield
"By how much will the production of soybeans increase if one
increases the amount of fertilizer applied to the ground"
Implicit assumption: all other factors that influence crop yield such
as quality of land, rainfall, presence of parasites etc. are held fixed
Experiment:
Choose several one-acre plots of land; randomly assign different
amounts of fertilizer to the different plots; compare yields
Experiment works because amount of fertilizer applied is unrelated
to other factors influencing crop yields

Wooldridge (2013), Chapter 1. 13


The Nature of Econometrics
and Economic Data
Measuring the return to education
"If a person is chosen from the population and given another
year of education, by how much will his or her wage increase? "
Implicit assumption: all other factors that influence wages such as
experience, family background, intelligence etc. are held fixed
Experiment:
Choose a group of people; randomly assign different amounts of
education to them (infeasible!); compare wage outcomes
Problem without random assignment: amount of education is related
to other factors that influence wages (e.g. intelligence)

Wooldridge (2013), Chapter 1. 14


The Nature of Econometrics
and Economic Data
Effect of law enforcement on city crime level
"If a city is randomly chosen and given ten additional police officers,
by how much would its crime rate fall? "
Alternatively: "If two cities are the same in all respects, except that
city A has ten more police officers, by how much would the two cities'
crime rates differ?"
Experiment:
Randomly assign number of police officers to a large number of cities
In reality, number of police officers will be determined by crime rate
(simultaneous determination of crime and number of police)

Wooldridge (2013), Chapter 1. 15


The Nature of Econometrics
and Economic Data
Effect of the minimum wage on unemployment
"By how much (if at all) will unemployment increase if the minimum
wage is increased by a certain amount (holding other things fixed)? "
Experiment:
Government randomly chooses minimum wage each year and
observes unemployment outcomes
Experiment will work because level of minimum wage is unrelated
to other factors determining unemployment
In reality, the level of the minimum wage will depend on political
and economic factors that also influence unemployment

Wooldridge (2013), Chapter 1. 16


The Simple
Regression Model

Chapter 2

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e

Wooldridge (2013), Chapter 2. 1


The Simple
Regression Model
Definition of the simple linear regression model

"Explains variable y in terms of variable x":

y = β₀ + β₁x + u

β₀: intercept; β₁: slope parameter
y: dependent variable (explained variable, response variable, …)
x: independent variable (explanatory variable, regressor, …)
u: error term (disturbance, unobservables, …)

Wooldridge (2013), Chapter 2. 2


The Simple
Regression Model
Interpretation of the simple linear regression model

"Studies how y varies with changes in x":

Δy = β₁·Δx   as long as   Δu = 0

β₁ answers: by how much does the dependent variable change if the independent variable is increased by one unit? The interpretation is only correct if all other things remain equal when the independent variable is increased by one unit.

The simple linear regression model is rarely applicable in practice but its discussion is useful for pedagogical reasons

Wooldridge (2013), Chapter 2. 3


The Simple
Regression Model
Example: Soybean yield and fertilizer

yield = β₀ + β₁·fertilizer + u

where u contains factors such as rainfall, land quality, presence of parasites, …; β₁ measures the effect of fertilizer on yield, holding all other factors fixed

Example: A simple wage equation

wage = β₀ + β₁·educ + u

where u contains factors such as labor force experience, tenure with current employer, work ethic, intelligence, …; β₁ measures the change in hourly wage given another year of education, holding all other factors fixed

Wooldridge (2013), Chapter 2. 4


The Simple
Regression Model
When is there a causal interpretation?
Conditional mean independence assumption:

E(u|x) = E(u) = 0

The explanatory variable must not contain information about the mean of the unobserved factors

Example: wage equation, where u contains unobserved factors such as intelligence:
The conditional mean independence assumption is unlikely to hold because individuals with more education will also be more intelligent on average.

Wooldridge (2013), Chapter 2. 5


The Simple
Regression Model
Population regression function (PRF)
The conditional mean independence assumption implies that

E(y|x) = β₀ + β₁x

This means that the average value of the dependent variable can be expressed as a linear function of the explanatory variable

Wooldridge (2013), Chapter 2. 6


The Simple
Regression Model

Population regression function

For individuals with x = xᵢ, the average value of y is E(y|x = xᵢ) = β₀ + β₁xᵢ

Wooldridge (2013), Chapter 2. 7


The Simple
Regression Model
In order to estimate the regression model one needs data

A random sample of n observations: {(xᵢ, yᵢ): i = 1, …, n}

where xᵢ is the value of the explanatory variable and yᵢ the value of the dependent variable of the i-th observation

Wooldridge (2013), Chapter 2. 8


The Simple
Regression Model
Fit a regression line through the data points as well as possible:

ŷ = β̂₀ + β̂₁x   (fitted regression line)

with, for example, (xᵢ, yᵢ) the i-th data point
Wooldridge (2013), Chapter 2. 9


Econometrics Chapter 2

Deriving the ordinary least squares (OLS) estimates

(1) Method of moments


(2) Define sum of squared residuals as a loss / penalty function and minimize
this function



Econometrics Chapter 2

Deriving the ordinary least squares (OLS) estimates - I

Define a fitted value for y when x = xᵢ as:

ŷᵢ = β̂₀ + β̂₁xᵢ   (1)

Define a residual for observation i as the difference between the actual yᵢ and its fitted value:

ûᵢ = yᵢ − ŷᵢ = yᵢ − β̂₀ − β̂₁xᵢ   (2)

Now choose β̂₀ and β̂₁ such that the sum of squared residuals,

∑_{i=1}^{n} ûᵢ² = ∑_{i=1}^{n} (yᵢ − β̂₀ − β̂₁xᵢ)²   (3)

is as small as possible, i.e.:

min_{β̂₀, β̂₁} ∑_{i=1}^{n} (yᵢ − β̂₀ − β̂₁xᵢ)²   (4)



Econometrics Chapter 2

Deriving the OLS estimates - II

In order to solve this minimization problem, the partial derivatives of (4) with respect to β̂₀ and β̂₁ must be zero:

∂(∑_{i=1}^{n} ûᵢ²)/∂β̂₀ = −2 ∑_{i=1}^{n} (yᵢ − β̂₀ − β̂₁xᵢ) = 0   (5)

∂(∑_{i=1}^{n} ûᵢ²)/∂β̂₁ = −2 ∑_{i=1}^{n} (yᵢ − β̂₀ − β̂₁xᵢ)xᵢ = 0   (6)

Note that (5) can be written as

ȳ = β̂₀ + β̂₁x̄   (7)

where ȳ = n⁻¹ ∑_{i=1}^{n} yᵢ and x̄ = n⁻¹ ∑_{i=1}^{n} xᵢ
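Solving (5) and (6) yields the familiar closed-form OLS estimators. A minimal sketch, using hypothetical simulated data (not from the slides), computes them directly and checks the first-order conditions:

```python
# Closed-form OLS solution implied by equations (5)-(7),
# illustrated on hypothetical example data.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=200)  # true beta0 = 2, beta1 = 0.5

# beta1_hat = sum((x_i - xbar)(y_i - ybar)) / sum((x_i - xbar)^2)
# beta0_hat = ybar - beta1_hat * xbar
xbar, ybar = x.mean(), y.mean()
beta1_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
beta0_hat = ybar - beta1_hat * xbar

# First-order condition (5) forces the residuals to sum to zero
residuals = y - beta0_hat - beta1_hat * x
print(beta0_hat, beta1_hat, residuals.sum())
```

The estimates agree with any standard least-squares routine (e.g. `numpy.polyfit` with degree 1), since both minimize the same objective (4).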



Econometrics Chapter 2

Graphical illustration of the OLS estimate

Figure: OLS minimizes the sum of the squared residuals



The Simple
Regression Model
What does "as good as possible" mean?
Regression residuals

Minimize sum of squared regression residuals

Ordinary Least Squares (OLS) estimates

Wooldridge (2013), Chapter 2. 10


The Simple
Regression Model
CEO Salary and return on equity

Salary in thousands of dollars Return on equity of the CEO‘s firm

Fitted regression

Intercept
If the return on equity increases by 1 unit,
then salary is predicted to change by 18,501 $
Causal interpretation?

Wooldridge (2013), Chapter 2. 11


The Simple
Regression Model

Fitted regression line


(depends on sample)

Unknown population regression line

Wooldridge (2013), Chapter 2. 12


The Simple
Regression Model
Wage and education

Hourly wage in dollars Years of education

Fitted regression

Intercept
In the sample, one more year of education was
associated with an increase in hourly wage by 0.54 $
Causal interpretation?

Wooldridge (2013), Chapter 2. 13


The Simple
Regression Model
Voting outcomes and campaign expenditures (two parties)

Percentage of vote for candidate A Percentage of campaign expenditures candidate A

Fitted regression

Intercept
If candidate A‘s share of spending increases by one percentage point, he or she receives 0.464 percentage points more of the total vote
Causal interpretation?

Wooldridge (2013), Chapter 2. 14


The Simple
Regression Model
Properties of OLS on any sample of data
Fitted values and residuals

Fitted or predicted values Deviations from regression line (= residuals)

Algebraic properties of OLS regression

Deviations from regression line sum up to zero
Correlation between deviations and regressors is zero
Sample averages of y and x lie on regression line

Wooldridge (2013), Chapter 2. 15


The Simple
Regression Model

For example, CEO number 12‘s salary was 526,023 $ lower than predicted using the information on his firm‘s return on equity

Wooldridge (2013), Chapter 2. 16


The Simple
Regression Model
Goodness-of-Fit

"How well does the explanatory variable explain the dependent variable?"

Measures of Variation

Total sum of squares (SST): represents total variation in the dependent variable
Explained sum of squares (SSE): represents variation explained by the regression
Residual sum of squares (SSR): represents variation not explained by the regression

Wooldridge (2013), Chapter 2. 17


The Simple
Regression Model
Decomposition of total variation

SST = SSE + SSR   (total variation = explained part + unexplained part)

Goodness-of-fit measure (R-squared)

R² = SSE/SST = 1 − SSR/SST

R-squared measures the fraction of the total variation that is explained by the regression

Wooldridge (2013), Chapter 2. 18


Econometrics Chapter 2

Goodness-of-fit: Stata output

. reg wage school

Source | SS df MS Number of obs = 9027


-------------+------------------------------ F( 1, 9025) = 801.64
Model | 38088.4173 1 38088.4173 Prob > F = 0.0000
Residual | 428804.273 9025 47.5129389 R-squared = 0.0816
-------------+------------------------------ Adj R-squared = 0.0815
Total | 466892.69 9026 51.7275305 Root MSE = 6.893

------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
school | .8112824 .0286538 28.31 0.000 .7551145 .8674503
_cons | 2.728642 .3934673 6.93 0.000 1.957357 3.499927
------------------------------------------------------------------------------

R² = SSE/SST = 38088.4173/466892.69 = 0.0816
R² = 1 − SSR/SST = 1 − 428804.273/466892.69 = 0.0816
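The two equivalent R-squared formulas can be verified directly from the sums of squares reported in the Stata output above:

```python
# Verify both R-squared formulas using the Model (explained), Residual,
# and Total sums of squares from the Stata regression output.
sse = 38088.4173   # Model SS (explained sum of squares)
ssr = 428804.273   # Residual SS
sst = 466892.69    # Total SS

r2_explained = sse / sst        # R² = SSE/SST
r2_residual = 1 - ssr / sst     # R² = 1 − SSR/SST

print(round(r2_explained, 4), round(r2_residual, 4))  # both 0.0816
```

Both formulas agree because SST = SSE + SSR holds exactly for OLS with an intercept.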





The Simple
Regression Model
CEO Salary and return on equity

The regression explains only 1.3 %


of the total variation in salaries

Voting outcomes and campaign expenditures

The regression explains 85.6 % of the


total variation in election outcomes

Caution: A high R-squared does not necessarily mean that the


regression has a causal interpretation!

Wooldridge (2013), Chapter 2. 19


The Simple
Regression Model
Incorporating nonlinearities: Semi-logarithmic form
Regression of log wages on years of education:

log(wage) = β₀ + β₁·educ + u

where log(wage) is the natural logarithm of the wage. This changes the interpretation of the regression coefficient:

100·β₁ ≈ percentage change of wage if years of education
are increased by one year

Wooldridge (2013), Chapter 2. 20


The Simple
Regression Model
Fitted regression

The wage increases by 8.3 % for


every additional year of education
(= return to education)

For example:

Growth rate of wage is 8.3 %


per year of education

Wooldridge (2013), Chapter 2. 21
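The 8.3% figure is the usual log-point approximation; the exact percentage change implied by a semi-log coefficient is 100·(exp(β̂₁) − 1), which differs slightly for coefficients of this size:

```python
# Approximate vs. exact percentage interpretation of a semi-log
# coefficient, using the 0.083 return-to-education estimate.
import math

beta1 = 0.083

approx_pct = 100 * beta1                 # log-point approximation
exact_pct = 100 * (math.exp(beta1) - 1)  # exact implied percentage change

print(round(approx_pct, 2), round(exact_pct, 2))
```

For small coefficients the two are nearly identical; the gap grows as the coefficient gets larger.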


The Simple
Regression Model
Incorporating nonlinearities: Log-logarithmic form
CEO salary and firm sales:

log(salary) = β₀ + β₁·log(sales) + u

where log(salary) is the natural logarithm of CEO salary and log(sales) is the natural logarithm of his/her firm‘s sales. This changes the interpretation of the regression coefficient:

β₁ = percentage change of salary if sales increase by 1 %

Logarithmic changes are always percentage changes

Wooldridge (2013), Chapter 2. 22


The Simple
Regression Model
CEO salary and firm sales: fitted regression

For example: +1 % sales → +0.257 % salary

The log-log form postulates a constant elasticity model,


whereas the semi-log form assumes a semi-elasticity model

Wooldridge (2013), Chapter 2. 23


The Simple
Regression Model
Expected values and variances of the OLS estimators
The estimated regression coefficients are random variables
because they are calculated from a random sample

Data is random and depends on particular sample that has been drawn

The question is what the estimators will estimate on average


and how large their variability in repeated samples is

Wooldridge (2013), Chapter 2. 24


The Simple
Regression Model
Standard assumptions for the linear regression model

Assumption SLR.1 (Linear in parameters)

y = β₀ + β₁x + u

In the population, the relationship between y and x is linear

Assumption SLR.2 (Random sampling)

{(xᵢ, yᵢ): i = 1, …, n}

The data is a random sample drawn from the population

Each data point therefore follows the population equation:

yᵢ = β₀ + β₁xᵢ + uᵢ
Wooldridge (2013), Chapter 2. 25


The Simple
Regression Model
Discussion of random sampling: Wage and education
The population consists, for example, of all workers of country A
In the population, a linear relationship between wages (or log wages)
and years of education holds
Draw completely randomly a worker from the population
The wage and the years of education of the worker drawn are random
because one does not know beforehand which worker is drawn

Throw the worker back into the population and repeat the random draw n times

The wages and years of education of the sampled workers are used to
estimate the linear relationship between wages and education

Wooldridge (2013), Chapter 2. 26


The Simple
Regression Model

The values drawn for the i-th worker: (xᵢ, yᵢ)

The implied deviation from the population relationship for the i-th worker:

uᵢ = yᵢ − β₀ − β₁xᵢ
Wooldridge (2013), Chapter 2. 27


The Simple
Regression Model
Assumptions for the linear regression model (cont.)

Assumption SLR.3 (Sample variation in explanatory variable)

The values of the explanatory variables are not all the same (otherwise it would be impossible to study how different values of the explanatory variable lead to different values of the dependent variable)

Assumption SLR.4 (Zero conditional mean)

E(u|x) = 0

The value of the explanatory variable must contain no information about the mean of the unobserved factors

Wooldridge (2013), Chapter 2. 28


The Simple
Regression Model
Theorem 2.1 (Unbiasedness of OLS)

Under assumptions SLR.1 – SLR.4:  E(β̂₀) = β₀ and E(β̂₁) = β₁

Interpretation of unbiasedness
The estimated coefficients may be smaller or larger, depending on
the sample that is the result of a random draw
However, on average, they will be equal to the values that
characterize the true relationship between y and x in the population
"On average" means if sampling was repeated, i.e. if drawing the
random sample and doing the estimation was repeated many times
In a given sample, estimates may differ considerably from true values

Wooldridge (2013), Chapter 2. 29
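This repeated-sampling idea can be illustrated with a small Monte Carlo simulation (hypothetical population, not data from the slides): individual slope estimates scatter around the truth, but their average is very close to the true parameter.

```python
# Monte Carlo illustration of Theorem 2.1: the OLS slope is unbiased,
# i.e. its average over many repeated random samples equals the true
# population slope. Hypothetical population: y = 1 + 2x + u.
import numpy as np

rng = np.random.default_rng(42)
beta1_true = 2.0
n, replications = 50, 5000

slope_estimates = []
for _ in range(replications):
    x = rng.uniform(0, 5, size=n)     # draw a fresh random sample
    u = rng.normal(0, 1, size=n)      # E(u|x) = 0 holds by construction
    y = 1.0 + beta1_true * x + u
    xbar = x.mean()
    b1 = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)
    slope_estimates.append(b1)

print(np.mean(slope_estimates))  # close to 2.0; single estimates vary
```

Each individual estimate misses the true value, yet the average across samples recovers it; that is exactly what unbiasedness claims.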


The Simple
Regression Model
Variances of the OLS estimators
Depending on the sample, the estimates will be nearer or farther
away from the true population values
How far can we expect our estimates to be away from the true
population values on average (= sampling variability)?
Sampling variability is measured by the estimator‘s variances

Assumption SLR.5 (Homoskedasticity)

Var(u|x) = σ²

The value of the explanatory variable must contain no information about the variability of the unobserved factors

Wooldridge (2013), Chapter 2. 30


The Simple
Regression Model
Graphical illustration of homoskedasticity

The variability of the unobserved influences does not depend on the value of the explanatory variable

Wooldridge (2013), Chapter 2. 31


The Simple
Regression Model
An example for heteroskedasticity: Wage and education

The variance of the unobserved


determinants of wages increases
with the level of education

Wooldridge (2013), Chapter 2. 32


The Simple
Regression Model
Theorem 2.2 (Variances of OLS estimators)

Under assumptions SLR.1 – SLR.5:

Var(β̂₁) = σ² / ∑_{i=1}^{n} (xᵢ − x̄)²   and   Var(β̂₀) = σ² n⁻¹ ∑_{i=1}^{n} xᵢ² / ∑_{i=1}^{n} (xᵢ − x̄)²

Conclusion:
The sampling variability of the estimated regression coefficients will be
the higher, the larger the variability of the unobserved factors, and the
lower, the higher the variation in the explanatory variable

Wooldridge (2013), Chapter 2. 33


The Simple
Regression Model
Estimating the error variance

The variance of u does not depend on x,


i.e. is equal to the unconditional variance

One could estimate the variance of the


errors by calculating the variance of the
residuals in the sample; unfortunately
this estimate would be biased

An unbiased estimate of the error variance can be obtained by dividing the sum of squared residuals by the number of observations minus the number of estimated regression coefficients

Wooldridge (2013), Chapter 2. 34


The Simple
Regression Model
Theorem 2.3 (Unbiasedness of the error variance)

Under assumptions SLR.1 – SLR.5:  E(σ̂²) = σ², where σ̂² = SSR/(n − 2)

Calculation of standard errors for regression coefficients:

se(β̂₁) = σ̂ / (∑_{i=1}^{n} (xᵢ − x̄)²)^{1/2}   (plug in σ̂ for the unknown σ)

The estimated standard deviations of the regression coefficients are called "standard errors". They measure how precisely the regression coefficients are estimated.
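A minimal sketch of these calculations on hypothetical simulated data, dividing SSR by n − 2 as Theorem 2.3 requires:

```python
# Error-variance estimate sigma^2_hat = SSR/(n-2) and the standard
# error of the slope, on hypothetical example data.
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 2, size=n)  # true error sd = 2

# OLS estimates via the closed-form formulas
xbar = x.mean()
sst_x = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - y.mean())) / sst_x
b0 = y.mean() - b1 * xbar

resid = y - b0 - b1 * x
ssr = np.sum(resid ** 2)

sigma2_hat = ssr / (n - 2)           # unbiased: divide by n - 2, not n
se_b1 = np.sqrt(sigma2_hat / sst_x)  # standard error of the slope

print(sigma2_hat, se_b1)
```

With the true error variance equal to 4, the estimate lands near 4, and the standard error shrinks as the sample size or the variation in x grows.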

Wooldridge (2013), Chapter 2. 35


The Simple Regression Model (35 of 39)

• Regression on a binary explanatory variable


• Suppose that x is either equal to 0 or 1

• This regression allows the mean value of y to differ depending on the


state of x

• Note that the statistical properties of OLS are no different when x is


binary
36
The Simple Regression Model (36 of 39)

• Counterfactual outcomes, causality and policy analysis

• In policy analysis, define a treatment effect as: τi = yi(1) − yi(0)

• Note that we will never actually observe this since we either observe yi(1)
or yi(0) for a given i, but never both.

• Let the average treatment effect be defined as: τ = E[yi(1) − yi(0)]
37
The Simple Regression Model (37 of 39)

• Counterfactual outcomes, causality and policy analysis (contd.)


• Let xi be a binary policy variable.

• The observed outcome can be written as: yi = (1 − xi)·yi(0) + xi·yi(1)

• Therefore, regressing y on x will give us an estimate of the (constant)
treatment effect.
• As long as we have random assignment, OLS will yield an unbiased
estimator for the treatment effect τ.
38
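A small simulation (hypothetical randomized treatment, purely illustrative) shows the point numerically: with a binary regressor, the OLS slope equals the difference in sample means between the two groups, and under random assignment it estimates τ.

```python
# With a binary x, the OLS slope is algebraically identical to the
# difference in group means. Hypothetical randomized treatment with a
# constant effect tau = 3.
import numpy as np

rng = np.random.default_rng(7)
n, tau = 1000, 3.0
x = rng.integers(0, 2, size=n)                 # random assignment: 0 or 1
y = 10.0 + tau * x + rng.normal(0, 2, size=n)

# OLS slope via the usual formula
xbar = x.mean()
slope = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)

# Difference in group means
diff_means = y[x == 1].mean() - y[x == 0].mean()

print(slope, diff_means)  # numerically identical, both near tau
```

Because assignment is random here, the difference in means is an unbiased estimate of the treatment effect; without random assignment it would mix the treatment effect with group differences.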
The Simple Regression Model (38 of 39)

• Random assignment
• Subjects are randomly assigned into treatment and control groups such that
there are no systematic differences between the two groups other than the
treatment.
• In practice, randomized control trials (RCTs) are expensive to implement and
may raise ethical issues.
• Though RCTs are often not feasible in economics, it is useful to think about
the kind of experiment you would run if random assignment were a possibility.
This helps in identifying the potential impediments to random assignment
(that we could conceivably control for in a multivariate regression).

39
The Simple Regression Model (39 of 39)

• Example: The effects of a job training program on earnings


• Real earnings are regressed on a binary variable indicating
participation in a job training program.

• Those who participated in the training program have earnings $1,790


higher than those who did not participate.
• This represents a 39.3% increase over the $4,550 average earnings
from those who did not participate.

40
Multiple Regression
Analysis: Estimation

Chapter 3

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e

Wooldridge (2013), Chapter 3. 1


Multiple Regression
Analysis: Estimation
Definition of the multiple linear regression model

"Explains variable y in terms of variables x₁, x₂, …, xₖ":

y = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ + u

β₀: intercept; β₁, …, βₖ: slope parameters
y: dependent variable (explained variable, response variable, …)
x₁, …, xₖ: independent variables (explanatory variables, regressors, …)
u: error term (disturbance, unobservables, …)

Wooldridge (2013), Chapter 3. 2


Multiple Regression
Analysis: Estimation
Motivation for multiple regression
Incorporate more explanatory factors into the model
Explicitly hold fixed other factors that otherwise would be in
Allow for more flexible functional forms

Example: Wage equation

wage = β₀ + β₁·educ + β₂·exper + u

where wage = hourly wage, educ = years of education, exper = labor market experience, and u captures all other factors

β₁ now measures the effect of education explicitly holding experience fixed
Wooldridge (2013), Chapter 3. 3


Multiple Regression
Analysis: Estimation
Example: Average test scores and per student spending

Other factors

Variables: average standardized test score of school; per student spending
at this school; average family income of students at this school

Per student spending is likely to be correlated with average family


income at a given high school because of school financing
Omitting average family income in regression would lead to biased
estimate of the effect of spending on average test scores
In a simple regression model, effect of per student spending would
partly include the effect of family income on test scores

Wooldridge (2013), Chapter 3. 4


Multiple Regression
Analysis: Estimation
Example: Family income and family consumption

Other factors

Variables: family consumption; family income; family income squared

Model has two explanatory variables: income and income squared


Consumption is explained as a quadratic function of income
One has to be very careful when interpreting the coefficients:

By how much does consumption increase if income is increased by one unit?
This depends on how much income is already there.

Wooldridge (2013), Chapter 3. 5


Multiple Regression
Analysis: Estimation
Example: CEO salary, sales and CEO tenure

Variables: log of CEO salary; log sales; quadratic function of CEO tenure with firm

Model assumes a constant elasticity relationship between CEO salary


and the sales of his or her firm
Model assumes a quadratic relationship between CEO salary and his
or her tenure with the firm
Meaning of "linear" regression
The model has to be linear in the parameters (not in the variables)

Wooldridge (2013), Chapter 3. 6


Multiple Regression
Analysis: Estimation
OLS Estimation of the multiple regression model

Random sample

Regression residuals

Minimize sum of squared residuals

Minimization will be carried out by computer

Wooldridge (2013), Chapter 3. 7
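As a concrete illustration (not from the slides), a minimal Python sketch with simulated data: the OLS estimates solve the normal equations X'X b = X'y, which are exactly the first-order conditions of minimizing the sum of squared residuals. The data-generating process and variable names are invented for the example.

```python
import numpy as np

# Simulated data (assumed setup): true coefficients are 1, 2, -0.5
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x1, x2])

# Solve the normal equations X'X b = X'y (first-order conditions
# of the sum-of-squared-residuals minimization)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

residuals = y - X @ beta_hat
ssr = residuals @ residuals   # minimized sum of squared residuals
```

Any perturbation of beta_hat strictly increases the sum of squared residuals, which is the minimization property the slide describes.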


Multiple Regression
Analysis: Estimation
Interpretation of the multiple regression model

By how much does the dependent variable change if the j-th


independent variable is increased by one unit, holding all
other independent variables and the error term constant

The multiple linear regression model manages to hold the values


of other explanatory variables fixed even if, in reality, they are
correlated with the explanatory variable under consideration
"Ceteris paribus"-interpretation
It has still to be assumed that unobserved factors do not change if
the explanatory variables are changed

Wooldridge (2013), Chapter 3. 8


Multiple Regression
Analysis: Estimation
Example: Determinants of college GPA

Variables: grade point average at college; high school grade point average; achievement test score

Interpretation
Holding ACT fixed, another point on high school grade point average
is associated with another .453 points college grade point average
Or: If we compare two students with the same ACT, but the hsGPA of
student A is one point higher, we predict student A to have a colGPA
that is .453 higher than that of student B
Holding high school grade point average fixed, another 10 points on
ACT are associated with less than one point on college GPA
Wooldridge (2013), Chapter 3. 9
Multiple Regression
Analysis: Estimation
"Partialling out" interpretation of multiple regression
One can show that the estimated coefficient of an explanatory
variable in a multiple regression can be obtained in two steps:
1) Regress the explanatory variable on all other explanatory variables
2) Regress y on the residuals from this regression
Why does this procedure work?
The residuals from the first regression are the part of the explanatory
variable that is uncorrelated with the other explanatory variables
The slope coefficient of the second regression therefore represents
the isolated effect of the explanatory variable on the dep. variable

Wooldridge (2013), Chapter 3. 10
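The two-step procedure above can be sketched numerically (simulated data; names invented): the slope from regressing y on the first-stage residuals equals the multiple-regression coefficient exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)        # x1 correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Full multiple regression of y on a constant, x1 and x2
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Step 1: regress x1 on the other regressors (constant and x2)
Z = np.column_stack([np.ones(n), x2])
gamma = np.linalg.solve(Z.T @ Z, Z.T @ x1)
r1 = x1 - Z @ gamma                        # part of x1 uncorrelated with x2

# Step 2: regress y on these residuals (slope only; r1 has mean zero)
slope = (r1 @ y) / (r1 @ r1)
```

The equality slope == beta[1] holds exactly (up to floating-point error), not just approximately.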


Multiple Regression
Analysis: Estimation
Properties of OLS on any sample of data
Fitted values and residuals

Fitted or predicted values Residuals

Algebraic properties of OLS regression

Deviations from the regression line sum up to zero; correlations between
deviations and regressors are zero; the sample averages of y and of the
regressors lie on the regression line

Wooldridge (2013), Chapter 3. 11


Multiple Regression
Analysis: Estimation
Goodness-of-Fit

Decomposition of total variation

Notice that R-squared can only


increase if another explanatory
variable is added to the regression
R-squared

Alternative expression for R-squared R-squared is equal to the squared


correlation coefficient between the
actual and the predicted value of
the dependent variable

Wooldridge (2013), Chapter 3. 12
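Both claims on this slide can be checked numerically (simulated data; names invented): adding a regressor never lowers R-squared, and R-squared equals the squared correlation between actual and fitted values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)        # an additional (here irrelevant) regressor
y = 1.0 + x1 + rng.normal(size=n)

def fit_r2(X, y):
    """OLS fit; returns fitted values and R-squared = 1 - SSR/SST."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    y_hat = X @ b
    sst = np.sum((y - y.mean()) ** 2)
    ssr = np.sum((y - y_hat) ** 2)
    return y_hat, 1 - ssr / sst

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x2])
_, r2_small = fit_r2(X_small, y)
y_hat_big, r2_big = fit_r2(X_big, y)

# Alternative expression: squared correlation of actual and fitted values
r2_corr = np.corrcoef(y, y_hat_big)[0, 1] ** 2
```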


Multiple Regression
Analysis: Estimation
Example: Explaining arrest records

Variables: number of times arrested 1986; proportion of prior arrests that
led to conviction (proxy for likelihood of conviction); months in prison
1986; quarters employed 1986

Interpretation:
Proportion prior arrests +0.5 → -.075 = -7.5 arrests per 100 men
Months in prison +12 → -.034(12) = -0.408 arrests for given man
Quarters employed +1 → -.104 = -10.4 arrests per 100 men

Wooldridge (2013), Chapter 3. 13


Multiple Regression
Analysis: Estimation
Example: Explaining arrest records (cont.)
An additional explanatory variable is added:

Average sentence in prior convictions

R-squared increases only slightly


Interpretation:
Average prior sentence increases number of arrests (?)
Limited additional explanatory power as R-squared increases by little
General remark on R-squared
Even if R-squared is small (as in the given example), regression may
still provide good estimates of ceteris paribus effects
Wooldridge (2013), Chapter 3. 14
Multiple Regression
Analysis: Estimation
Standard assumptions for the multiple regression model

Assumption MLR.1 (Linear in parameters)


In the population, the relationship
between y and the explanatory
variables is linear

Assumption MLR.2 (Random sampling)

The data is a random sample


drawn from the population

Each data point therefore follows the population equation

Wooldridge (2013), Chapter 3. 15


Multiple Regression
Analysis: Estimation
Standard assumptions for the multiple regression model (cont.)

Assumption MLR.3 (No perfect collinearity)


"In the sample (and therefore in the population), none
of the independent variables is constant and there are
no exact relationships among the independent variables"

Remarks on MLR.3
The assumption only rules out perfect collinearity/correlation
between explanatory variables; imperfect correlation is allowed
If an explanatory variable is a perfect linear combination of other
explanatory variables it is superfluous and may be eliminated
Constant variables are also ruled out (collinear with intercept)

Wooldridge (2013), Chapter 3. 16


Multiple Regression
Analysis: Estimation
Example for perfect collinearity: small sample

In a small sample, avginc may accidentally be an exact multiple of expend; it will not
be possible to disentangle their separate effects because there is exact covariation

Example for perfect collinearity: relationships between regressors

Either shareA or shareB will have to be dropped from the regression because there
is an exact linear relationship between them: shareA + shareB = 1

Wooldridge (2013), Chapter 3. 17
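The shareA/shareB case can be reproduced numerically (simulated shares): since shareA + shareB = 1 for every observation, the design matrix loses rank and OLS cannot separate the two effects; dropping one of the shares restores full rank.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
shareA = rng.uniform(size=n)
shareB = 1.0 - shareA                       # exact linear relationship

# Constant + shareA + shareB: columns are linearly dependent,
# so X'X is singular (MLR.3 fails)
X = np.column_stack([np.ones(n), shareA, shareB])
rank = np.linalg.matrix_rank(X.T @ X)       # 2, not 3

# Dropping shareB removes the perfect collinearity
X_ok = np.column_stack([np.ones(n), shareA])
rank_ok = np.linalg.matrix_rank(X_ok.T @ X_ok)
```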


Multiple Regression
Analysis: Estimation
Standard assumptions for the multiple regression model (cont.)
Assumption MLR.4 (Zero conditional mean)

The value of the explanatory variables


must contain no information about the
mean of the unobserved factors

In a multiple regression model, the zero conditional mean assumption


is much more likely to hold because fewer things end up in the error
Example: Average test scores

If avginc was not included in the regression, it would end up in the error term;
it would then be hard to defend that expend is uncorrelated with the error

Wooldridge (2013), Chapter 3. 18


Multiple Regression
Analysis: Estimation
Discussion of the zero conditional mean assumption
Explanatory variables that are correlated with the error term are
called endogenous; endogeneity is a violation of assumption MLR.4
Explanatory variables that are uncorrelated with the error term are
called exogenous; MLR.4 holds if all explanat. var. are exogenous
Exogeneity is the key assumption for a causal interpretation of the
regression, and for unbiasedness of the OLS estimators

Theorem 3.1 (Unbiasedness of OLS)

Unbiasedness is an average property in repeated samples; in a given


sample, the estimates may still be far away from the true values

Wooldridge (2013), Chapter 3. 19


Multiple Regression
Analysis: Estimation
Including irrelevant variables in a regression model

No problem because the coefficient on the irrelevant variable is zero in the population

However, including irrelevant variables may increase sampling variance.

Omitting relevant variables: the simple case

True model (contains x1 and x2)

Estimated model (x2 is omitted)

Wooldridge (2013), Chapter 3. 20


Multiple Regression
Analysis: Estimation
Omitted variable bias
If x1 and x2 are correlated, assume a linear
regression relationship between them

If y is only regressed on x1, the first expression will be the estimated
intercept and the second will be the estimated slope on x1; the remaining
terms end up in the error term

Conclusion: All estimated coefficients will be biased

Wooldridge (2013), Chapter 3. 21
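The bias mechanism can be checked numerically (simulated data; names invented): in sample, the short-regression slope equals the long-regression slope plus the coefficient on the omitted variable times the slope from the auxiliary regression of x2 on x1.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)          # x1 and x2 correlated
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Long (true) regression: y on x1 and x2
X_long = np.column_stack([np.ones(n), x1, x2])
b_long = np.linalg.solve(X_long.T @ X_long, X_long.T @ y)

# Short regression: y on x1 only (x2 omitted)
X_short = np.column_stack([np.ones(n), x1])
b_short = np.linalg.solve(X_short.T @ X_short, X_short.T @ y)

# Auxiliary regression: x2 on x1 (slope delta1)
delta = np.linalg.solve(X_short.T @ X_short, X_short.T @ x2)

# In-sample identity: short slope = long slope + (coef on x2) * delta1
implied = b_long[1] + b_long[2] * delta[1]
```

Because both the effect of x2 on y and the correlation between x1 and x2 are positive here, the short-regression slope is biased upward.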


Multiple Regression
Analysis: Estimation
Example: Omitting ability in a wage equation

Will both be positive

The return to education will be overestimated because the bias term (the effect of
ability on wages times the slope from regressing ability on education) is positive.
It will look as if people with many years of education earn very high wages, but this is
partly due to the fact that people with more education are also more able on average.

When is there no omitted variable bias?


If the omitted variable is irrelevant or uncorrelated

Wooldridge (2013), Chapter 3. 22


Multiple Regression
Analysis: Estimation
Omitted variable bias: more general cases

True model (contains x1, x2 and x3)

Estimated model (x3 is omitted)

No general statements possible about direction of bias


Analysis as in simple case if one regressor uncorrelated with others
Example: Omitting ability in a wage equation

If exper is approximately uncorrelated with educ and abil, then the direction
of the omitted variable bias can be as analyzed in the simple two variable case.

Wooldridge (2013), Chapter 3. 23


Multiple Regression
Analysis: Estimation
Standard assumptions for the multiple regression model (cont.)
Assumption MLR.5 (Homoscedasticity)

The value of the explanatory variables


must contain no information about the
variance of the unobserved factors

Example: Wage equation


This assumption may also be hard
to justify in many cases

Shorthand notation
All explanatory variables are
collected in a random vector, with

Wooldridge (2013), Chapter 3. 24


Multiple Regression
Analysis: Estimation
Theorem 3.2 (Sampling variances of OLS slope estimators)

Under assumptions MLR.1 – MLR.5:

Variance of the error term

Total sample variation in explanatory variable xj; R-squared from a
regression of explanatory variable xj on all other independent variables
(including a constant)

Wooldridge (2013), Chapter 3. 25
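A numerical check of Theorem 3.2 (simulated regressors; the error variance is an assumed value): the formula sigma^2 / (SST_j * (1 - R2_j)) matches the corresponding diagonal element of sigma^2 * (X'X)^(-1).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 80
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 1.7                                 # assumed error variance

# Matrix form of the sampling variance: sigma^2 * (X'X)^(-1)
var_matrix = sigma2 * np.linalg.inv(X.T @ X)
var_b1_matrix = var_matrix[1, 1]

# Theorem 3.2 form: sigma^2 / (SST_1 * (1 - R2_1))
Z = np.column_stack([np.ones(n), x2])        # regress x1 on the other regressors
gamma = np.linalg.solve(Z.T @ Z, Z.T @ x1)
resid = x1 - Z @ gamma
sst1 = np.sum((x1 - x1.mean()) ** 2)
r2_1 = 1 - np.sum(resid ** 2) / sst1
var_b1_formula = sigma2 / (sst1 * (1 - r2_1))
```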


Multiple Regression
Analysis: Estimation
Components of OLS Variances:
1) The error variance
A high error variance increases the sampling variance because there is
more "noise" in the equation
A large error variance necessarily makes estimates imprecise
The error variance does not decrease with sample size
2) The total sample variation in the explanatory variable
More sample variation leads to more precise estimates
Total sample variation automatically increases with the sample size
Increasing the sample size is thus a way to get more precise estimates

Wooldridge (2013), Chapter 3. 26


Multiple Regression
Analysis: Estimation
3) Linear relationships among the independent variables

Regress xj on all other independent variables (including a constant)

The R-squared of this regression will be the higher


the better xj can be linearly explained by the other
independent variables

The sampling variance of the estimated coefficient will be the higher, the better
the explanatory variable can be linearly explained by the other independent variables
The problem of almost linearly dependent explanatory variables is
called multicollinearity (i.e. R-squared close to one for some j)

Wooldridge (2013), Chapter 3. 27


Multiple Regression
Analysis: Estimation
An example for multicollinearity

Variables: average standardized test score of school; expenditures for
teachers; expenditures for instructional materials; other expenditures

The different expenditure categories will be strongly correlated because if a school has a lot
of resources it will spend a lot on everything.

It will be hard to estimate the differential effects of different expenditure categories because
all expenditures are either high or low. For precise estimates of the differential effects, one
would need information about situations where expenditure categories change differentially.

As a consequence, sampling variance of the estimated effects will be large.

Wooldridge (2013), Chapter 3. 28


Multiple Regression
Analysis: Estimation
Discussion of the multicollinearity problem
In the above example, it would probably be better to lump all expenditure
categories together because effects cannot be disentangled
In other cases, dropping some independent variables may reduce
multicollinearity (but this may lead to omitted variable bias)
Only the sampling variance of the variables involved in multicollinearity
will be inflated; the estimates of other effects may be very precise
Note that multicollinearity is not a violation of MLR.3 in the strict sense
Multicollinearity may be detected through "variance inflation factors"

As an (arbitrary) rule of thumb, the variance


inflation factor should not be larger than 10

Wooldridge (2013), Chapter 3. 29
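A sketch of the variance inflation factor VIF_j = 1/(1 - R2_j) on simulated data; the helper function name is invented for the example.

```python
import numpy as np

def vif(X, j):
    """Variance inflation factor for column j of X (X includes a constant)."""
    others = np.delete(X, j, axis=1)
    gamma = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    resid = X[:, j] - others @ gamma
    xj = X[:, j]
    sst = np.sum((xj - xj.mean()) ** 2)
    r2 = 1 - np.sum(resid ** 2) / sst        # R2 of xj on the other regressors
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(6)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)           # nearly collinear with x1
x3 = rng.normal(size=n)                      # unrelated to the others
X = np.column_stack([np.ones(n), x1, x2, x3])

vif_x1 = vif(X, 1)   # large: x1 is almost linearly explained by x2
vif_x3 = vif(X, 3)   # close to 1: x3 is not involved in the collinearity
```

Only the variables involved in the near-collinearity get a large VIF, matching the remark that the estimates of other effects may still be precise.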


Multiple Regression
Analysis: Estimation
Variances in misspecified models
The choice of whether to include a particular variable in a regression
can be made by analyzing the tradeoff between bias and variance

True population model

Estimated model 1

Estimated model 2

It might be the case that the likely omitted variable bias in the
misspecified model 2 is overcompensated by a smaller variance

Wooldridge (2013), Chapter 3. 30


Multiple Regression
Analysis: Estimation
Variances in misspecified models (cont.)

Conditional on x1 and x2 , the


variance in model 2 is always
smaller than that in model 1

Case 1: Conclusion: Do not include irrelevant regressors

Case 2: Trade off bias and variance; Caution: bias will not vanish even in large samples

Wooldridge (2013), Chapter 3. 31


Multiple Regression
Analysis: Estimation
Estimating the error variance

An unbiased estimate of the error variance can be obtained by dividing the sum of squared
residuals by the number of observations minus the number of estimated regression
coefficients; this difference is also called the degrees of freedom.
The n estimated squared residuals in the sum are not completely independent but related
through the k+1 equations that define the first order conditions of the minimization problem.

Theorem 3.3 (Unbiased estimator of the error variance)

Wooldridge (2013), Chapter 3. 32
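Theorem 3.3 can be illustrated by simulation (assumed setup): averaging SSR/(n - k - 1) over many replications recovers the true error variance.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 30, 2
sigma2 = 4.0                                 # true error variance
x = rng.normal(size=(n, k))
X = np.column_stack([np.ones(n), x])         # regressors fixed across replications

estimates = []
for _ in range(2000):
    y = 1.0 + x @ np.array([2.0, -1.0]) + np.sqrt(sigma2) * rng.normal(size=n)
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    ssr = np.sum((y - X @ beta) ** 2)
    estimates.append(ssr / (n - k - 1))      # divide by degrees of freedom

mean_estimate = np.mean(estimates)           # should be close to sigma2
```

Dividing by n instead of n - k - 1 would systematically underestimate the error variance in small samples.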


Multiple Regression
Analysis: Estimation
Estimation of the sampling variances of the OLS estimators

The true sampling
variation of the
estimated coefficient

Plug in the estimate for the unknown error variance

The estimated sampling
variation of the
estimated coefficient

Note that these formulas are only valid under assumptions


MLR.1-MLR.5 (in particular, there has to be homoscedasticity)

Wooldridge (2013), Chapter 3. 33


Multiple Regression
Analysis: Estimation
Efficiency of OLS: The Gauss-Markov Theorem
Under assumptions MLR.1 - MLR.5, OLS is unbiased
However, under these assumptions there may be many other
estimators that are unbiased
Which one is the unbiased estimator with the smallest variance?
In order to answer this question one usually limits oneself to linear
estimators, i.e. estimators linear in the dependent variable

May be an arbitrary function of the sample values


of all the explanatory variables; the OLS estimator
can be shown to be of this form

Wooldridge (2013), Chapter 3. 34


Multiple Regression
Analysis: Estimation
Theorem 3.4 (Gauss-Markov Theorem)
Under assumptions MLR.1 - MLR.5, the OLS estimators are the best
linear unbiased estimators (BLUEs) of the regression coefficients, i.e.

i.e. their sampling variance is smallest among all linear estimators that are unbiased.

OLS is only the best estimator if MLR.1 – MLR.5 hold; if there is


heteroscedasticity for example, there are better estimators.

Wooldridge (2013), Chapter 3. 35


Multiple Regression
Analysis: Inference

Chapter 4

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e

Wooldridge (2013), Chapter 4. 1


Multiple Regression
Analysis: Inference
Statistical inference in the regression model
Hypothesis tests about population parameters
Construction of confidence intervals

Sampling distributions of the OLS estimators

The OLS estimators are random variables

We already know their expected values and their variances

However, for hypothesis tests we need to know their distribution

In order to derive their distribution we need additional assumptions

Assumption about distribution of errors: normal distribution

Wooldridge (2013), Chapter 4. 2


Multiple Regression
Analysis: Inference
Assumption MLR.6 (Normality of error terms)

independently of the explanatory variables

It is assumed that the unobserved


factors are normally distributed around
the population regression function.

The form and the variance of the


distribution does not depend on
any of the explanatory variables.

It follows that:

Wooldridge (2013), Chapter 4. 3


Multiple Regression
Analysis: Inference
Discussion of the normality assumption
The error term is the sum of "many" different unobserved factors
Sums of independent factors are normally distributed (CLT)
Problems:
• How many different factors? Number large enough?
• Possibly very heterogenous distributions of individual factors
• How independent are the different factors?
The normality of the error term is an empirical question
At least the error distribution should be "close" to normal
In many cases, normality is questionable or impossible by definition

Wooldridge (2013), Chapter 4. 4


Multiple Regression
Analysis: Inference
Discussion of the normality assumption (cont.)
Examples where normality cannot hold:
• Wages (nonnegative; also: minimum wage)
• Number of arrests (takes on a small number of integer values)
• Unemployment (indicator variable, takes on only 1 or 0)
In some cases, normality can be achieved through transformations
of the dependent variable (e.g. use log(wage) instead of wage)
Under normality, OLS is the best unbiased estimator (even among nonlinear estimators)
Important: For the purposes of statistical inference, the assumption
of normality can be replaced by a large sample size

Wooldridge (2013), Chapter 4. 5


Multiple Regression
Analysis: Inference
Terminology

"Gauss-Markov assumptions" "Classical linear model (CLM) assumptions"

Theorem 4.1 (Normal sampling distributions)

Under assumptions MLR.1 – MLR.6:

The estimators are normally distributed around the true parameters with the
variance that was derived earlier; the standardized estimators follow a
standard normal distribution

Wooldridge (2013), Chapter 4. 6


Multiple Regression
Analysis: Inference
Testing hypotheses about a single population parameter
Theorem 4.2 (t-distribution for standardized estimators)

Under assumptions MLR.1 – MLR.6:

If the standardization is done using the estimated


standard deviation (= standard error), the normal
distribution is replaced by a t-distribution

Note: The t-distribution is close to the standard normal distribution if n-k-1 is large.

Null hypothesis (for more general hypotheses, see below)


The population parameter is equal to zero, i.e. after
controlling for the other independent variables, there
is no effect of xj on y

Wooldridge (2013), Chapter 4. 7


Multiple Regression
Analysis: Inference
t-statistic (or t-ratio)
One should be very careful about statements like this:
"The farther the estimated coefficient is away from zero, the
less likely it is that the null hypothesis holds true." A further
question: what does "far" away from zero mean?

This depends on the variability of the estimated coefficient, i.e. its


standard deviation. The t-statistic measures how many estimated
standard deviations the estimated coefficient is away from zero.

Distribution of the t-statistic if the null hypothesis is true

Goal: Define a rejection rule so that, if H0 is true, it is rejected
only with a small probability (= significance level, e.g. 5%)

Wooldridge (2013), Chapter 4. 8


Multiple Regression
Analysis: Inference
Testing against one-sided alternatives (greater than zero)

Test H0: βj = 0 against H1: βj > 0.

Reject the null hypothesis in favour of the


alternative hypothesis if the estimated
coefficient is "too large" (i.e. larger than
a critical value).

Construct the critical value so that, if the


null hypothesis is true, it is rejected in,
for example, 5% of the cases.

Example with 28 degrees of freedom:

→ Reject if t-statistic greater than 1.701

Wooldridge (2013), Chapter 4. 9


Multiple Regression
Analysis: Inference
Example: Wage equation
Test whether, after controlling for education and tenure, higher work
experience leads to higher hourly wages

Standard errors

Test H0: βexper = 0 against H1: βexper > 0.

One would either expect a positive effect of experience on hourly wage or no effect at all.

Wooldridge (2013), Chapter 4. 10


Multiple Regression
Analysis: Inference
Example: Wage equation (cont.)
t-statistic

Degrees of freedom;
here the standard normal
approximation applies

Critical values for the 5% and the 1% significance level (these


are conventional significance levels).

The null hypothesis is rejected because the t-statistic exceeds


the critical value.

The usual (problematic) parlance:


"The effect of experience on hourly wage is statistically greater than zero at the 5%
(and even at the 1%) significance level."

Wooldridge (2013), Chapter 4. 11


Multiple Regression
Analysis: Inference
Testing against one-sided alternatives (less than zero)

Test H0: βj = 0 against H1: βj < 0.

Reject the null hypothesis in favour of the


alternative hypothesis if the estimated
coefficient is "too small" (i.e. smaller than
a critical value).

Construct the critical value so that, if the


null hypothesis is true, it is rejected in,
for example, 5% of the cases.

Example with 18 degrees of freedom:

→ Reject if t-statistic less than -1.734

Wooldridge (2013), Chapter 4. 12


Multiple Regression
Analysis: Inference
Example: Student performance and school size
Test whether smaller school size leads to better student performance

Variables: percentage of students passing maths test; average annual teacher
compensation; staff per one thousand students; school enrollment (= school size)

Test H0: βenroll = 0 against H1: βenroll < 0.

Do larger schools hamper student performance or is there no such effect?

Wooldridge (2013), Chapter 4. 13


Multiple Regression
Analysis: Inference
Example: Student performance and school size (cont.)
t-statistic

Degrees of freedom;
here the standard normal
approximation applies

Critical values for the 5% and the 15% significance level


(two examples).

The null hypothesis is not rejected because the t-statistic is


not smaller than the critical value.

One cannot reject the hypothesis that there is no effect of school size on student performance
(for a significance level of 15% and of course 5%).

Wooldridge (2013), Chapter 4. 14


Multiple Regression
Analysis: Inference
Testing against two-sided alternatives

Test H0: βj = 0 against H1: βj ≠ 0.

Reject the null hypothesis in favour of the


alternative hypothesis if the absolute value
of the estimated coefficient is too large.

Construct the critical value so that, if the


null hypothesis is true, it is rejected in,
for example, 5% of the cases.

Example with 25 degrees of freedom:

→ Reject if the absolute value of the t-statistic is
greater than 2.06 (i.e. t < -2.06 or t > 2.06)

Wooldridge (2013), Chapter 4. 15


Multiple Regression
Analysis: Inference
Example: Determinants of college GPA Lectures missed per week

For critical values, use standard normal distribution

Once again the usual parlance:

"The effects of hsGPA and skipped are
significantly different from zero at the
1% significance level. The effect of ACT
is not significantly different from zero,
not even at the 10% significance level."

Wooldridge (2013), Chapter 4. 16


Multiple Regression
Analysis: Inference
"Statistically significant“ variables in a regression
If a regression coefficient is different from zero in a two-sided test, the
corresponding variable is often said to be "statistically significant“
If the number of degrees of freedom is large enough so that the nor-
mal approximation applies, the following rules of thumb apply:

"statistically significant at 10 % level“

"statistically significant at 5 % level“

"statistically significant at 1 % level“

Wooldridge (2013), Chapter 4. 17


Multiple Regression
Analysis: Inference
Economic and statistical significance
It is important to discuss the magnitude of the coefficient to get an
idea of its economic or practical importance
The fact that a coefficient is statistically significant does not necessa-
rily mean it is economically or practically significant!
If a variable is statistically and economically important but has the
"wrong“ sign, the regression model might be misspecified

Wooldridge (2013), Chapter 4. 18


Multiple Regression
Analysis: Inference
The Rhetoric of Significance Tests (in Economics and other
disciplines)
The chosen level of a test depends on the problem at hand
Given the problem at hand, the researcher / analyst has to choose
acceptable Type I and Type II errors
The level is chosen before the data is analyzed / the experiment is
conducted
The usual significance levels (10%, 5%, 1%) are mere conventions

Wooldridge (2013), Chapter 4. 19


Multiple Regression
Analysis: Inference
The Rhetoric of Significance Tests (in Economics and other
disciplines)
Statements like "statistically significant at 10 % level" are problematic;
the level of the test is a property of the problem at hand and not a sample
characteristic
Unfortunately, these issues are rarely discussed (and Wooldridge
(2013) is no exception)
Further reading:
• Ioannidis, J. P. A. (2005). Why Most Published Research Findings Are False. PLoS Med 2(8): e124.
doi:10.1371/journal.pmed.0020124
• McCloskey, D. N. (1985). The Loss Function Has Been Mislaid: The Rhetoric of Significance
Tests. The American Economic Review, Vol. 75, No. 2 (Papers and Proceedings)


Wooldridge (2013), Chapter 4. 20
Multiple Regression
Analysis: Inference
Testing more general hypotheses about a regression coefficient
Null hypothesis
Hypothesized value of the coefficient

t-statistic

The test works exactly as before, except that the hypothesized


value is subtracted from the estimate when forming the statistic

Wooldridge (2013), Chapter 4. 21


Multiple Regression
Analysis: Inference
Example: Campus crime and enrollment
An interesting hypothesis is whether crime increases by one percent
if enrollment is increased by one percent

Estimate is different from


one but what about the
precision of this
estimate?

The hypothesis is
rejected at the 5%
level

Wooldridge (2013), Chapter 4. 22


Multiple Regression
Analysis: Inference
Computing p-values for t-tests
If the significance level is made smaller and smaller, there will be a
point where the null hypothesis cannot be rejected anymore
The reason is that, by lowering the significance level, one wants to
avoid more and more to make the error of rejecting a correct H0
The smallest significance level at which the null hypothesis is still
rejected, is called the p-value of the hypothesis test
Once again be careful about statements like these:
"A small p-value is evidence against the null hypothesis because one
would reject the null hypothesis even at small significance levels"
"A large p-value is evidence in favor of the null hypothesis"

Wooldridge (2013), Chapter 4. 23


Multiple Regression
Analysis: Inference
How the p-value is computed (here: two-sided test)

The p-value is the probability of obtaining a


test statistic at least as “extreme” (pointing
towards rejection) as the one that was actually
observed
(In the figure, these would be the critical values for a 5% significance level.)
In the two-sided case, the p-value is thus the
probability that the t-distributed variable takes
on a larger absolute value than the realized
value of the test statistic, e.g.:

Hence, a null hypothesis is rejected if and only


if the corresponding p-value is smaller than
the significance level.
For example, for a significance level of 5% the
t-statistic would not lie in the rejection region.

Wooldridge (2013), Chapter 4. 24
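For large degrees of freedom the slides use the standard normal approximation; a two-sided p-value can then be sketched with only the standard library (the function name is invented for the example).

```python
import math

def two_sided_p_value(t_stat):
    """P(|Z| > |t|) for standard normal Z, via the error function.

    Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
    """
    phi = 0.5 * (1.0 + math.erf(abs(t_stat) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

p_small = two_sided_p_value(2.5)   # rejected at 5%, not at 1%
p_large = two_sided_p_value(0.8)   # not rejected at any usual level
```

A hypothesis is rejected exactly when the p-value falls below the chosen significance level, matching the rule stated on the slide.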


Multiple Regression
Analysis: Inference
Confidence intervals
Critical value of two-sided test

Simple manipulation of the result in Theorem 4.2 implies that

Lower bound of the confidence interval; upper bound of the
confidence interval; confidence level

Interpretation of the confidence interval


The bounds of the interval are random
In repeated samples, the interval that is constructed in the above way
will cover the population regression coefficient in 95% of the cases

Wooldridge (2013), Chapter 4. 25


Multiple Regression
Analysis: Inference
Confidence intervals for typical confidence levels

Use rules of thumb

Relationship between confidence intervals and hypotheses tests

reject in favor of

Wooldridge (2013), Chapter 4. 26


Multiple Regression
Analysis: Inference
Example: Model of firms' R&D expenditures

Spending on R&D Annual sales Profits as percentage of sales

The effect of sales on R&D is relatively precisely estimated, as the interval is
narrow; moreover, "the effect is significantly different from zero" because zero is
outside the interval. The effect of the profit margin is imprecisely estimated, as the
interval is very wide; "it is not statistically significant" because zero lies in the interval.

Wooldridge (2013), Chapter 4. 27


Multiple Regression
Analysis: Inference
Testing hypotheses about a linear combination of parameters
Example: Return to education at 2 year vs. at 4 year colleges
Variables: years of education at 2 year colleges; years of education at 4 year colleges

Test H0: β1 = β2 against H1: β1 < β2.

A possible test statistic would be:


The difference between the estimates is normalized by the estimated
standard deviation of the difference. The null hypothesis would have
to be rejected if the statistic is "too negative" to believe that the true
difference between the parameters is equal to zero.

Wooldridge (2013), Chapter 4. 28


Multiple Regression
Analysis: Inference
Impossible to compute with standard regression output because the covariance of the two estimates is needed

Usually not available in regression output


Alternative method

Define θ = β1 - β2 and test H0: θ = 0 against H1: θ < 0.

Insert into original regression a new regressor (= total years of college)

Wooldridge (2013), Chapter 4. 29


Multiple Regression
Analysis: Inference
Total years of college
Estimation results

Hypothesis is rejected at 10%


level but not at 5% level

This method works always for single linear hypotheses

Wooldridge (2013), Chapter 4. 30
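The substitution trick can be verified numerically (simulated data; names invented): regressing y on x1 and (x1 + x2) yields a coefficient on x1 that equals exactly the difference of the two original slope estimates.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.10 * x1 + 0.07 * x2 + rng.normal(size=n)

# Original regression: y on x1 and x2
X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)

# Reparameterized regression: y on x1 and (x1 + x2).
# Substituting beta1 = theta + beta2 gives
# y = beta0 + theta*x1 + beta2*(x1 + x2) + u
W = np.column_stack([np.ones(n), x1, x1 + x2])
c = np.linalg.solve(W.T @ W, W.T @ y)

theta_hat = c[1]           # equals b[1] - b[2] exactly
```

Because theta_hat is now an ordinary coefficient, its standard error (and hence the t-test on θ) comes directly from regular regression output.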


Multiple Regression
Analysis: Inference
Testing multiple linear restrictions: The F-test
Testing exclusion restrictions
Variables: salary of major league baseball player; years in the league;
average number of games per year; batting average; home runs per year;
runs batted in per year

against

Test whether performance measures have no effect / can be excluded from the regression.

Wooldridge (2013), Chapter 4. 31


Multiple Regression
Analysis: Inference
Estimation of the unrestricted model

None of these variables is statistically significant when tested individually

Idea: How would the model fit be if these variables were dropped from the regression?

Wooldridge (2013), Chapter 4. 32


Multiple Regression
Analysis: Inference
Estimation of the restricted model

The sum of squared residuals necessarily increases, but is the increase „large enough“?

Test statistic Number of restrictions

The relative increase of the sum of


squared residuals when going from
H1 to H0 follows a F-distribution (if
the null hypothesis H0 is correct)

Wooldridge (2013), Chapter 4. 33
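The F statistic can be computed directly from the two sums of squared residuals. A minimal sketch — the SSR values below are illustrative stand-ins for the regression output omitted from these slides:

```python
# F statistic for q exclusion restrictions:
# F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]
ssr_r = 198.311    # restricted model (performance measures dropped); illustrative
ssr_ur = 183.186   # unrestricted model; illustrative
n, k, q = 353, 5, 3   # observations, regressors under H1, restrictions

f_stat = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
print(round(f_stat, 2))   # about 9.55, far above usual F(3, 347) critical values
```

With an F value this large, the relative increase in the SSR is too big to attribute to chance, so the exclusion restrictions are rejected jointly even though each t-test was insignificant.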


Multiple Regression
Analysis: Inference
Rejection rule (Figure 4.7)

An F-distributed variable only takes on positive


values. This corresponds to the fact that the
sum of squared residuals can only increase if
one moves from H1 to H0.

Choose the critical value so that the null hypo-


thesis is rejected in, for example, 5% of the
cases, although it is true.

Wooldridge (2013), Chapter 4. 34


Multiple Regression
Analysis: Inference
Test decision in example Number of restrictions to be tested

Degrees of freedom in
the unrestricted model

The null hypothesis is rejected


(even at very small significance
levels).

Discussion
The three variables are "jointly significant"
They were not significant when tested individually
The likely reason is multicollinearity between them
Wooldridge (2013), Chapter 4. 35
Multiple Regression
Analysis: Inference
Test of overall significance of a regression

The null hypothesis states that the explanatory


variables are not useful at all in explaining the
dependent variable
Restricted model
(regression on constant)

The test of overall significance / predictive power is reported in


most regression packages; the null hypothesis is usually rejected

Wooldridge (2013), Chapter 4. 36


Multiple Regression
Analysis: Inference
Testing general linear restrictions with the F-test
Example: Test whether house price assessments are rational
The assessed housing value Size of lot
Actual house price
(before the house was sold) (in feet)

Square footage Number of bedrooms

In addition, other known factors should


not influence the price once the assessed
value has been controlled for.
If house price assessments are rational, a 1% change in the
assessment should be associated with a 1% change in price.

Wooldridge (2013), Chapter 4. 37


Multiple Regression
Analysis: Inference
Unrestricted regression

Restricted regression

The restricted model is actually a regression of [y − x1] on a constant

Test statistic

cannot be rejected

Wooldridge (2013), Chapter 4. 38


Multiple Regression
Analysis: Inference
Regression output for the unrestricted regression

When tested individually,


there is also no evidence
against the rationality of
house price assessments

The F-test works for general multiple linear hypotheses


For all tests and confidence intervals, validity of assumptions
MLR.1 – MLR.6 has been assumed. Tests may be invalid otherwise.

Wooldridge (2013), Chapter 4. 39


Multiple Regression
Analysis: OLS Asymptotics

Chapter 5

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e
Multiple Regression
Analysis: OLS Asymptotics
So far we focused on properties of OLS that hold for any sample size
Expected values/unbiasedness under MLR.1 – MLR.4
Variance formulas under MLR.1 – MLR.5
Gauss-Markov Theorem under MLR.1 – MLR.5
Exact sampling distributions/tests under MLR.1 – MLR.6

Properties of OLS that hold in large samples
Consistency under MLR.1 – MLR.4
Asymptotic normality/tests under MLR.1 – MLR.5
(without assuming normality of the error term!)

Wooldridge (2013), Chapter 5. 2


Multiple Regression
Analysis: OLS Asymptotics
Consistency

An estimator θ̂n is consistent for a population parameter θ if

P(|θ̂n − θ| > ε) < δ whenever the sample size n is sufficiently large, for arbitrary ε > 0 and δ > 0.

Alternative notation: plim θ̂n = θ
(the estimate converges in probability to the true population value)
Interpretation:
Consistency means that the probability that the estimate is arbitrarily
close to the true population value can be made arbitrarily high by
increasing the sample size
Consistency is a minimum requirement for sensible estimators

Wooldridge (2013), Chapter 5. 3
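Consistency can be illustrated by simulation. A sketch with an arbitrarily chosen simple regression model (true slope 2) and a fixed seed:

```python
import random

# Simulation sketch: the OLS slope in y = 1 + 2x + u approaches the
# true value 2 as the sample size grows. Model and seed are arbitrary.
random.seed(42)

def ols_slope(n):
    xs = [random.gauss(0, 1) for _ in range(n)]
    ys = [1 + 2 * x + random.gauss(0, 1) for x in xs]
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx   # cov(x, y) / var(x)

small, large = ols_slope(50), ols_slope(50_000)
print(small, large)   # the n = 50,000 estimate is very close to 2
```

The sampling variation of the slope shrinks as n grows, so for large n the estimate is almost surely within any given distance of the true parameter — exactly the plim statement above.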


Multiple Regression
Analysis: OLS Asymptotics
Theorem 5.1 (Consistency of OLS)

Special case of simple regression model

One can see that the slope estimate is consistent
if the explanatory variable is exogenous, i.e.
uncorrelated with the error term.

Assumption MLR.4‘

All explanatory variables must be uncorrelated with the


error term. This assumption is weaker than the zero
conditional mean assumption MLR.4.

Wooldridge (2013), Chapter 5. 4


Multiple Regression
Analysis: OLS Asymptotics
For consistency of OLS, only the weaker MLR.4‘ is needed
Asymptotic analog of omitted variable bias

True model

Misspecified
model

Bias

There is no omitted variable bias if the omitted variable is


irrelevant or uncorrelated with the included variable

Wooldridge (2013), Chapter 5. 5


Multiple Regression
Analysis: OLS Asymptotics
Asymptotic normality and large sample inference
In practice, the normality assumption MLR.6 is often questionable
If MLR.6 does not hold, the results of t- or F-tests may be wrong
Fortunately, F- and t-tests still work if the sample size is large enough
Also, OLS estimates are normal in large samples even without MLR.6

Theorem 5.2 (Asymptotic normality of OLS)

Under assumptions MLR.1 – MLR.5:


In large samples, the
standardized estimates also
are normally distributed

Wooldridge (2013), Chapter 5. 6


Multiple Regression
Analysis: OLS Asymptotics
Practical consequences
In large samples, the t-distribution is close to the N(0,1) distribution
As a consequence, t-tests are valid in large samples without MLR.6
The same is true for confidence intervals and F-tests
Important: MLR.1 – MLR.5 are still necessary, esp. homoscedasticity

Asymptotic analysis of the OLS sampling errors

In the variance formula, each component converges: the error variance
estimate converges to the true error variance, the sample variation in xj
converges to the population variance of xj, and Rj² converges to a fixed
number

Wooldridge (2013), Chapter 5. 7


Multiple Regression
Analysis: OLS Asymptotics
Asymptotic analysis of the OLS sampling errors (cont.)

Var(β̂j) shrinks at the rate 1/n

se(β̂j) shrinks at the rate 1/√n

This is why large samples are better


Example: Standard errors in a birth weight equation

Use only the first half of observations

Wooldridge (2013), Chapter 5. 8
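The 1/√n rate can be sketched with the standard error formula, holding the error standard deviation and Var(x) fixed at arbitrary values:

```python
import math

# Back-of-envelope sketch: se(beta_hat) is roughly sigma / sqrt(n * Var(x)),
# so quadrupling n halves the standard error. Values are arbitrary.
sigma, var_x = 1.0, 1.0

def approx_se(n):
    return sigma / math.sqrt(n * var_x)

print(approx_se(700) / approx_se(2800))   # 2.0: quarter the sample -> double the se
```

This matches the birth weight example: using only half of the observations inflates every standard error by roughly a factor of √2 ≈ 1.41.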


Multiple Regression
Analysis: Further Issues

Chapter 6

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e
Multiple Regression Analysis:
Further Issues

Models with quadratics and higher polynomials


Interaction terms
Adjusted R-squared

Wooldridge (2013), Chapter 6. 2


Multiple Regression Analysis:
Further Issues
Using quadratic functional forms
Example: Wage equation Concave experience profile

Marginal effect of experience: the first year of experience increases
the wage by some $.30, the second year by .298 − 2(.0061)(1) = $.29, etc.

Wooldridge (2013), Chapter 6. 3


Multiple Regression Analysis:
Further Issues
Wage maximum with respect to work experience

Does this mean the return to experience


becomes negative after 24.4 years?

Not necessarily. It depends on how many
observations in the sample lie to the right of the
turnaround point.

In the given example, these are about 28%
of the observations. There may be a specification
problem (e.g. omitted variables).

Wooldridge (2013), Chapter 6. 4
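The turnaround point and the marginal effects quoted above follow directly from the two estimated coefficients on exper and exper²:

```python
# Quadratic wage-experience profile: wage_hat = ... + 0.298*exper - 0.0061*exper^2
# (coefficients taken from the estimated equation on the previous slide).
b1, b2 = 0.298, 0.0061

turnaround = b1 / (2 * b2)                 # exper* = b1 / (2*b2)
marginal_effect_year2 = b1 - 2 * b2 * 1    # slope evaluated at exper = 1

print(round(turnaround, 1))             # 24.4 years
print(round(marginal_effect_year2, 4))  # 0.2858, i.e. about $.29
```

Setting the derivative b1 − 2·b2·exper to zero gives the maximum; evaluating it at exper = 1 reproduces the second-year effect quoted on the slide.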


Multiple Regression Analysis:
Further Issues
Nitrogen oxide in air, distance from employment
centers, student/teacher ratio

Example: Effects of pollution on housing prices

Does this mean that, at a low number of rooms,


more rooms are associated with lower prices?

Wooldridge (2013), Chapter 6. 5


Multiple Regression Analysis:
Further Issues
Calculation of the turnaround point

Turnaround point:

This area can be ignored as


it concerns only 1% of the
observations.

Increase rooms from 5 to 6:

Increase rooms from 6 to 7:

Wooldridge (2013), Chapter 6. 6


Multiple Regression Analysis:
Further Issues
Other possibilities: 1) Elasticities

2) Higher polynomials

Wooldridge (2013), Chapter 6. 7


Multiple Regression Analysis:
Further Issues
Models with interaction terms

Interaction term

The effect of the number


of bedrooms depends on
the level of square footage

Interaction effects complicate interpretation of parameters

Effect of number of bedrooms, but for a square footage of zero

Wooldridge (2013), Chapter 6. 8


Multiple Regression Analysis:
Further Issues
Reparametrization of interaction effects Population means; may be
replaced by sample means

Effect of x2 if all variables take on their mean values

Advantages of reparametrization
Easy interpretation of all parameters
Standard errors for partial effects at the mean values available
If necessary, interaction may be centered at other interesting values

Wooldridge (2013), Chapter 6. 9
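The reparametrization logic can be sketched numerically. Coefficients and the mean of x1 below are hypothetical (e.g. x1 = square footage, x2 = bedrooms):

```python
# Partial effect of x2 in y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + u:
#   d y / d x2 = b2 + b3 * x1, so it depends on where x1 is evaluated.
# All values are made up for illustration.
b2, b3 = 0.012, 0.0004
x1_mean = 1800.0   # e.g. average square footage

effect_at_zero = b2                    # raw coefficient: effect at x1 = 0
effect_at_mean = b2 + b3 * x1_mean     # what the centered regression reports

print(effect_at_zero, effect_at_mean)  # the two can differ dramatically
```

Replacing x1 and x2 by their demeaned versions makes the coefficient on x2 equal to effect_at_mean, with a standard error reported automatically.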


Multiple Regression Analysis:
Further Issues
More on goodness-of-fit and selection of regressors
General remarks on R-squared
A high R-squared does not imply that there is a causal interpretation
A low R-squared does not preclude precise estimation of partial effects
Adjusted R-squared
What is the ordinary R-squared supposed to measure?

is an estimate for

Population R-squared

Wooldridge (2013), Chapter 6. 10


Multiple Regression Analysis:
Further Issues
Adjusted R-squared (cont.): correct degrees of freedom
of numerator and denominator

A better estimate taking into account degrees of freedom would be

The adjusted R-squared imposes a penalty for adding new regressors


The adjusted R-squared increases if, and only if, the t-statistic of a
newly added regressor is greater than one in absolute value
Relationship between R-squared and adjusted R-squared

The adjusted R-squared


may even become negative

Wooldridge (2013), Chapter 6. 11
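The adjustment can be sketched in a few lines; the sample sizes and R-squared values below are made up for illustration:

```python
# Adjusted R-squared: R2_adj = 1 - (1 - R2) * (n - 1) / (n - k - 1).
def adj_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With many regressors and a weak fit, the penalty can push it below zero.
print(round(adj_r2(0.30, 30, 5), 3))   # 0.154
print(adj_r2(0.10, 30, 5))             # negative
```

The penalty term (n − 1)/(n − k − 1) grows with k, which is why adding a regressor raises the adjusted R-squared only when its t-statistic exceeds one in absolute value.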


Multiple Regression Analysis:
Further Issues
Using adjusted R-squared to choose between nonnested models
Models are nonnested if neither model is a special case of the other

A comparison between the R-squared of both models would be unfair


to the first model because the first model contains fewer parameters
In the given example, even after adjusting for the difference in
degrees of freedom, the quadratic model is preferred

Wooldridge (2013), Chapter 6. 12


Multiple Regression Analysis:
Further Issues
Comparing models with different dependent variables
R-squared or adjusted R-squared must not be used to compare models
which differ in their definition of the dependent variable
Example: CEO compensation and firm performance

There is much
less variation
in log(salary)
that needs to
be explained
than in salary

Wooldridge (2013), Chapter 6. 13


Multiple Regression Analysis:
Further Issues
Controlling for too many factors in regression analysis
In some cases, certain variables should not be held fixed
In a regression of traffic fatalities on state beer taxes (and other
factors) one should not directly control for beer consumption
In a regression of family health expenditures on pesticide usage
among farmers one should not control for doctor visits
Different regressions may serve different purposes
In a regression of house prices on house characteristics, one would
only include price assessments if the purpose of the regression is to
study their validity; otherwise one would not include them

Wooldridge (2013), Chapter 6. 14


Multiple Regression Analysis:
Further Issues
Adding regressors to reduce the error variance

Adding regressors may exacerbate multicollinearity problems

On the other hand, adding regressors reduces the error variance

Variables that are uncorrelated with other regressors should be added


because they reduce error variance without increasing multicollinearity

However, such uncorrelated variables may be hard to find

Example: Individual beer consumption and beer prices

Including individual characteristics in a regression of beer consumption


on beer prices leads to more precise estimates of the price elasticity

Wooldridge (2013), Chapter 6. 15


Multiple Regression Analysis
with Qualitative Information

Chapter 7

Wooldridge: Introductory Econometrics:


A Modern Approach, 5e

Wooldridge (2013), Chapter 7. 1


Multiple Regression Analysis:
Qualitative Information
Qualitative Information
Examples: gender, race, industry, region, rating grade, …
A way to incorporate qualitative information is to use dummy variables
They may appear as the dependent or as independent variables

A single dummy independent variable

Dummy variable:
=1 if the person is a woman
=0 if the person is a man

Coefficient = the wage gain/loss if the person is a woman
rather than a man (holding other things fixed)
Multiple Regression Analysis:
Qualitative Information
Graphical Illustration

Alternative interpretation of coefficient:

i.e. the difference in mean wage between


men and women with the same level of
education.

Intercept shift
Multiple Regression Analysis:
Qualitative Information
Dummy variable trap
This model cannot be estimated (perfect collinearity)

When using dummy variables, one category always has to be omitted:

The base category are men

The base category are women

Alternatively, one could omit the intercept. Disadvantages:
1) More difficult to test for differences between the parameters
2) R-squared formula only valid if the regression contains an intercept
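The perfect collinearity behind the dummy variable trap is easy to verify: with dummies for both categories included, the intercept column equals their sum in every row. A tiny made-up sample:

```python
# Dummy variable trap sketch: female + male = 1 = intercept column,
# so the three columns of the design matrix are linearly dependent.
female = [1, 0, 0, 1, 1]          # made-up observations
male = [1 - f for f in female]    # the complementary dummy
const = [1] * len(female)         # intercept column

# Exact linear dependence -> OLS cannot separate the three coefficients.
print(all(c == f + m for c, f, m in zip(const, female, male)))   # True
```

Dropping either dummy (or the intercept) breaks the dependence, which is why one category must always serve as the base category.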
Multiple Regression Analysis:
Qualitative Information
Estimated wage equation with intercept shift

Holding education, experience,


and tenure fixed, women earn
1.81$ less per hour than men

Does that mean that women are discriminated against?


Not necessarily. Being female may be correlated with other produc-
tivity characteristics that have not been controlled for.
Multiple Regression Analysis:
Qualitative Information
Comparing means of subpopulations described by dummies

Not holding other factors constant, women


earn 2.51$ per hour less than men, i.e. the
difference between the mean wage of men
and that of women is 2.51$.

Discussion
t-ratios / tests are computed in the same way
The wage difference between men and women is larger if no other
things are controlled for; i.e. part of the difference is due to differences
in education, experience and tenure between men and women
Multiple Regression Analysis:
Qualitative Information
Further example: Effects of training grants on hours of training

Hours training per employee Dummy indicating whether firm received training grant

This is an example of program evaluation


Treatment group (= grant receivers) vs. control group (= no grant)
Is the effect of treatment on the outcome of interest causal?
Multiple Regression Analysis:
Qualitative Information
Using dummy explanatory variables in equations for log(y)

Dummy indicating
whether house is of
colonial style

As the dummy for colonial
style changes from 0 to 1,
the house price increases
by about 5.4%
Multiple Regression Analysis:
Qualitative Information
Using dummy variables for multiple categories
1) Define membership in each category by a dummy variable
2) Leave out one category (which becomes the base category)

Holding other things fixed, married


women earn 19.8% less than single
men (= the base category)
Multiple Regression Analysis:
Qualitative Information
Incorporating ordinal information using dummy variables
Example: City credit ratings and municipal bond interest rates

Municipal bond rate Credit rating from 0-4 (0=worst, 4=best)

This specification would probably not be appropriate as the credit rating only contains
ordinal information. A better way to incorporate this information is to define dummies:

Dummies indicating whether the particular rating applies, e.g. CR1=1 if CR=1 and CR1=0
otherwise. All effects are measured in comparison to the worst rating (= base category).
Multiple Regression Analysis:
Qualitative Information
Interactions involving dummy variables Interaction term
Allowing for different slopes

= intercept men = slope men

= intercept women = slope women

Interesting hypotheses

The return to education is the The whole wage equation is


same for men and women the same for men and women
Multiple Regression Analysis:
Qualitative Information
Graphical illustration

Interacting both the intercept and


the slope with the female dummy
enables one to model completely
independent wage equations for
men and women
Multiple Regression Analysis:
Qualitative Information
Estimated wage equation with interaction term

No evidence against the hypothesis that
the return to education is the same
for men and women

Does this mean that there is no significant evidence of
lower pay for women at the same levels of educ, exper,
and tenure? No: this is only the effect for educ = 0. To
answer the question one has to recenter the interaction
term, e.g. around educ = 12.5 (= average education).
Multiple Regression Analysis:
Qualitative Information
Testing for differences in regression functions across groups
Unrestricted model (contains full set of interactions)

College grade point average Standardized aptitude test score High school rank percentile

Total hours spent


Restricted model (same regression for both groups) in college courses
Multiple Regression Analysis:
Qualitative Information
Null hypothesis All interaction effects are zero, i.e.
the same regression coefficients
apply to men and women

Estimation of the unrestricted model

Tested individually,
the hypothesis that
the interaction effects
are zero cannot be
rejected
Multiple Regression Analysis:
Qualitative Information
Joint test with F-statistic
Null hypothesis is rejected

Alternative way to compute F-statistic in the given case


Run separate regressions for men and for women; the unrestricted
SSR is given by the sum of the SSR of these two regressions
Run regression for the restricted model and store SSR
If the test is computed in this way it is called the Chow-Test
Important: Test assumes a constant error variance across groups
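Computed this way, the Chow statistic is an ordinary F test where the unrestricted SSR is the sum of the two group-specific SSRs. A sketch with hypothetical SSR values:

```python
# Chow test sketch: compare the pooled regression (restricted) against
# separate regressions per group (unrestricted). All SSRs are made up.
ssr_pooled = 85.0            # restricted: one regression for both groups
ssr_g1, ssr_g2 = 30.0, 45.0  # SSRs from the two group-specific regressions
ssr_ur = ssr_g1 + ssr_g2     # unrestricted SSR = sum of group SSRs

n, k = 366, 3                # observations and slope parameters
q = k + 1                    # restrictions: all k slopes plus the intercept

chow_f = ((ssr_pooled - ssr_ur) / q) / (ssr_ur / (n - 2 * (k + 1)))
print(round(chow_f, 2))      # about 11.93
```

The denominator degrees of freedom are n − 2(k + 1) because the unrestricted specification estimates a full parameter set for each group.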
Multiple Regression Analysis:
Qualitative Information
A Binary dependent variable: the linear probability model
Linear regression when the dependent variable is binary

If the dependent variable only


takes on the values 1 and 0

Linear probability
model (LPM)

In the linear probability model, the coefficients


describe the effect of the explanatory variables
on the probability that y=1
Multiple Regression Analysis:
Qualitative Information
Example: Labor force participation of married women

=1 if in labor force, =0 otherwise Non-wife income (in thousand dollars per year)

If the number of kids under six
years increases by one, the
probability that the woman
works falls by 26.2 percentage points

Large standard error (but wait …)


Multiple Regression Analysis:
Qualitative Information
Example: Female labor participation of married women (cont.)

Graph for nwifeinc=50, exper=5,
age=30, kidslt6=1, kidsge6=0

The maximum level of education in
the sample is educ=17. For the given
case, this leads to a predicted
probability to be in the labor force
of about 50%.

Negative predicted probability but


no problem because no woman in
the sample has educ < 5.
Multiple Regression Analysis:
Qualitative Information
Disadvantages of the linear probability model
Predicted probabilities may be larger than one or smaller than zero
Marginal probability effects sometimes logically impossible
The linear probability model is necessarily heteroskedastic

Variance of Bernoulli variable

Heteroskedasticity-consistent standard errors need to be computed

Advantages of the linear probability model


Easy estimation and interpretation
Estimated effects and predictions often reasonably good in practice
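The built-in heteroskedasticity of the LPM follows directly from the Bernoulli variance formula; the fitted probabilities below are hypothetical:

```python
# For a binary y, Var(y|x) = p(x) * (1 - p(x)), where p(x) is the
# response probability. Since p(x) varies with x, so does the variance.
def lpm_variance(p):
    return p * (1 - p)

# Hypothetical fitted probabilities for three different observations:
for p in (0.2, 0.5, 0.8):
    print(p, lpm_variance(p))   # the variance changes with x
```

Because the conditional variance depends on x whenever any slope is nonzero, MLR.5 fails by construction, which is why robust standard errors are needed.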
Multiple Regression Analysis:
Qualitative Information
More on policy analysis and program evaluation
Example: Effect of job training grants on worker productivity

Percentage of defective items =1 if firm received training grant, =0 otherwise

No apparent effect of
grant on productivity

Treatment group: grant receivers, Control group: firms that received no grant

Grants were given on a first-come, first-served basis. This is not the same as giving them out
randomly. It might be the case that firms with less productive workers saw an opportunity to
improve productivity and applied first.
Multiple Regression Analysis:
Qualitative Information
Self-selection into treatment as a source for endogeneity
In the given and in related examples, the treatment status is probably
related to other characteristics that also influence the outcome
The reason is that subjects self-select themselves into treatment
depending on their individual characteristics and prospects
Experimental evaluation
In experiments, assignment to treatment is random
In this case, causal effects can be inferred using a simple regression

The dummy indicating whether or not there was


treatment is unrelated to other factors affecting
the outcome.
Multiple Regression Analysis:
Qualitative Information
Further example of an endogenous dummy regressor
Are nonwhite customers discriminated against?

Dummy indicating whether Race dummy


loan was approved Credit rating

It is important to control for other characteristics that may be


important for loan approval (e.g. profession, unemployment)
Omitting important characteristics that are correlated with the
nonwhite dummy will produce spurious evidence for discrimination
