Metrics Topic1 Introduction

The document discusses the fundamentals of econometrics, focusing on how to determine causal relationships and the importance of data types in economic analysis. It uses Anna's decision between a job and graduate school to illustrate the complexities of income comparisons and the need for controlled experiments to isolate treatment effects. Additionally, it covers various data types, including experimental, observational, cross-sectional, panel, and time series data, emphasizing their roles in identifying causal effects and making forecasts.

Uploaded by

David NICE

Econ 3334: Introduction to Econometrics

What is econometrics about?

• Anna is in the final year of her undergraduate studies.
• Anna receives two offers:
– economic analyst at JP Morgan,
– master’s program in economics at NYU.
• Anna has to accept one offer and give up the other.
• Assume Anna bases her decision on prospective income.

What is econometrics about?
• Anna types “compare average income undergraduates
masters” into Google and finds a lot of information.
• Example 1. Differential between average starting salaries for Class of 2020
graduates earning bachelor’s and master’s degrees. Source: Summer 2021 Salary
Survey, National Association of Colleges and Employers.

MAJOR                                AVERAGE BACHELOR’S SALARY   AVERAGE MASTER’S SALARY   % DIFFERENTIAL
Biology                              $37,182                     $69,353                   86.5%
Communication disorders sciences     $34,517                     $58,890                   70.6%
Business administration/management   $54,392                     $82,372                   51.4%
Computer/information technology      $59,032                     $86,695                   46.9%
Communication and media studies      $42,345                     $62,166                   46.8%
Systems engineering                  $73,559                     $106,640                  45.0%
Registered nursing                   $58,626                     $84,673                   44.4%
Psychology                           $37,006                     $52,786                   42.6%
English language and literature      $38,597                     $54,102                   40.2%
Social work                          $35,622                     $48,711                   36.7%
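The “% differential” column appears to be the master’s average expressed as a percentage increase over the bachelor’s average. A minimal sketch of that arithmetic, using three rows of the table (the function name `pct_differential` is an illustrative choice, not from the survey):

```python
def pct_differential(bachelors: float, masters: float) -> float:
    """Percentage by which the master's average exceeds the bachelor's average."""
    return (masters - bachelors) / bachelors * 100


# Three rows copied from the salary table: (bachelor's, master's) averages in USD.
rows = {
    "Biology": (37_182, 69_353),
    "Psychology": (37_006, 52_786),
    "Social work": (35_622, 48_711),
}

for major, (bachelors, masters) in rows.items():
    print(f"{major}: {pct_differential(bachelors, masters):.1f}%")
```

The computed values round to the percentages shown in the table. Note that this is a comparison of group averages only; it says nothing yet about what causes the gap.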

What is econometrics about?
• Should Anna be convinced that “going to grad school
causes one to have a higher income”?
• Other potential factors causing the income difference:
– Ability
– Work experience
– Parental education and income…
• Other issues to consider:
– Is the difference economically meaningful?
– What is the variance/variation within each group?
– Bill Gates, Mark Zuckerberg, Elon Musk… None of them
has a master’s degree.

What is econometrics about?
• To answer Anna, we want to compare the incomes of two
persons:
– Anna with a bachelor’s degree versus the same Anna with a
master’s degree.
• In reality, we only observe one choice made by Anna, say, Anna
with a bachelor’s degree. So the income of the “Anna with a
master’s degree” is not observed.
• The income of the “Anna with a master’s degree” is called a
“potential outcome” for Anna, which is unobserved in reality.
• In reality, we observe the income of Jack with a master’s degree.
• However, the comparison between “Anna with a bachelor’s
degree” and “Jack with a master’s degree” is not fair:
– They differ in work experience, ability, family income…

What is econometrics about?
• A fair comparison requires “ceteris paribus”: other
things equal.
• A possible theoretical solution:
– For a large number of UG students, randomly determine
who goes to graduate school and who goes to work right after
graduation.
– Compare the two groups’ average income 10 years later.
– The only “systematic” difference between the two groups is
schooling. All other personal traits are similar on average.
• This is guaranteed by (1) a large number of people and (2) random
assignment of group membership.
– So the observed income difference can only be due to the
difference in schooling.
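The role of random assignment can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the course material: “ability” is a hypothetical standard-normal trait, and the self-selection rule (higher-ability students are more likely to choose graduate school) is invented purely for illustration.

```python
import random
from statistics import mean


def ability_gap(ability, in_grad_school):
    """Mean ability of the grad-school group minus that of the work group."""
    treated = [a for a, g in zip(ability, in_grad_school) if g]
    control = [a for a, g in zip(ability, in_grad_school) if not g]
    return mean(treated) - mean(control)


def simulate(n=50_000, seed=0):
    rng = random.Random(seed)
    # Hypothetical trait that the researcher does not control.
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # Random assignment: a fair coin decides who goes to graduate school.
    coin_flip = [rng.random() < 0.5 for _ in range(n)]
    # Assumed self-selection: higher ability makes graduate school more likely.
    self_selected = [a + rng.gauss(0, 1) > 0 for a in ability]
    return ability_gap(ability, coin_flip), ability_gap(ability, self_selected)


random_gap, selected_gap = simulate()
print(f"ability gap under random assignment: {random_gap:.3f}")
print(f"ability gap under self-selection:    {selected_gap:.3f}")
```

With many students and a coin flip, the two groups have nearly the same average ability, so ability cannot explain an income gap between them. Under the assumed self-selection the groups differ in ability before schooling has any effect.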

How to define a causal effect?
• Definition 1: the difference in outcomes from a
“ceteris paribus” comparison: other things equal.
• Definition 2: the difference in potential outcomes for
the same individual. (We will formalize this idea
later.)
• Definition 3: the difference in outcomes due to an
intervention.
– Such an intervention is also called a treatment.
– The causal effect is also called the treatment effect.

Examples of causal effects
• Does a minimum wage increase cause higher unemployment? The intervention is
the policy increase in the minimum wage. The outcome is the unemployment
rate.
• What are the effects of job training on workers’ productivity? The intervention is
the job training program. The outcome is workers’ productivity.
• Does a higher diploma lead to higher income? The intervention is obtaining a
higher diploma. The outcome is income.
• If the Fed increases its policy rate by 50 basis points, how will inflation and
unemployment respond during the next three months? The intervention is the
policy increase in the federal funds rate. The outcomes are a sequence of
macroeconomic variables, such as inflation after two months and unemployment
after three months.
• Does smoking during pregnancy cause unhealthy babies? The intervention is the
action of smoking during pregnancy. The outcome is the health of babies.
• Does vaccination provide protection against Covid-19? The intervention is taking
the vaccine. The outcome can be the infection rate.

Hypothetical intervention
• An intervention is not always feasible in practice. To
understand the causal effect in such cases, we will
conduct a “thought experiment” by using a
“hypothetical intervention”.
• For example, in the recent legal case against Harvard
about possible racial discrimination, we may think of
race as the intervention and of admission rates or
evaluation scores as the outcomes.
• Of course, we cannot switch an applicant’s race on and
off. However, we may still imagine what would
happen if the applicant were of a different race.

Randomized controlled experiments*
• A treatment is randomly assigned to a large number
of individuals.
– We flip a coin to decide whether one goes to graduate school.
• Those who receive the treatment belong to the
treatment group. The others are in the control group.
– The two groups are similar on average in all respects, such as
ability and preferences, except for the treatment.
• The observed difference in outcome is caused by the
treatment.
• However, it is usually either too costly or infeasible to
implement such “experiments” in economics.

* Also called randomized controlled trials (RCTs)

Experimental data in medical trials
• Pfizer’s experimental data on Paxlovid, submitted to
the FDA for emergency approval.
• N = 1219.
• Treatment group (Paxlovid), control group (placebo)
• Double-blind experiment: neither the patients nor the
researchers know the group membership of the patients
until the experiment is over.
• Pills were given to patients within 5 days of Covid
infection; results observed within 28 days of medication.
               Treatment (n=607)   Control (n=612)
Hospitalized   6 (1.0%)            41 (6.7%)
Deaths         0                   10 (1.6%)

Effectiveness against hospitalization is defined as (6.7 - 1.0)/6.7 = 85%.
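The effectiveness figure can be reproduced from the reported counts. A minimal sketch, where the function name `effectiveness` is an illustrative choice: it computes the relative reduction in the hospitalization rate, once from the raw counts and once from the rounded percentages used on the slide.

```python
def effectiveness(treated_rate: float, control_rate: float) -> float:
    """Relative reduction in the event rate of the treatment group vs the control group."""
    return (control_rate - treated_rate) / control_rate


# Hospitalization counts and group sizes from the trial table.
treated_rate = 6 / 607   # about 1.0%
control_rate = 41 / 612  # about 6.7%

print(f"from raw counts:    {effectiveness(treated_rate, control_rate):.0%}")
print(f"from rounded rates: {effectiveness(1.0, 6.7):.0%}")
```

Both versions round to 85%, so the rounding of the rates on the slide does not change the headline number.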
Observational data in economics

The observational data
• In experimental data, the treatment is randomly
assigned.
• In observational data, the treatment is not
randomly assigned to individuals.
– Those who attend graduate school might prefer an academic
job to working in the private sector.
– Those who attend graduate school might have private
information about the job market outcomes of graduate
students.
– Those who choose not to attend graduate school might be
more ambitious about private sector jobs and have private
information about the value of work experience.

The observational data
• In observational data, the control group and the
treatment group differ in many factors other than the
treatment.
• Such factors, if they also affect the outcome, are called
“confounding factors”.
– Longer work experience causes higher income.
– Academic jobs tend to pay less than industry jobs.
– Graduate training in big data helps you land a
high-paying job.
• Econ 3334 aims to separate the treatment effect from the
effects of such confounding factors, using
observational data.
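How a confounding factor distorts a simple comparison can be sketched with simulated data. Everything numeric here is a labeled assumption: a binary “ability” confounder, a true treatment effect of 10 (in arbitrary income units), and selection probabilities of 0.7 and 0.3. Comparing within ability groups is used as one simple way to hold the confounder fixed; it is not presented as the course’s method.

```python
import random
from statistics import mean

TRUE_EFFECT = 10.0  # assumed causal effect of graduate school on income


def simulate(n=50_000, seed=1):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        ability = rng.choice([0, 1])             # hypothetical confounder
        p_grad = 0.7 if ability else 0.3          # ability affects the treatment...
        grad = rng.random() < p_grad
        income = 50 + 20 * ability + TRUE_EFFECT * grad + rng.gauss(0, 5)  # ...and the outcome
        data.append((ability, grad, income))
    return data


def diff_in_means(data):
    """Average income of the treated minus average income of the untreated."""
    treated = [y for _, g, y in data if g]
    control = [y for _, g, y in data if not g]
    return mean(treated) - mean(control)


data = simulate()
naive = diff_in_means(data)
within = [diff_in_means([row for row in data if row[0] == a]) for a in (0, 1)]

print(f"naive difference:        {naive:.1f}")
print(f"within low-ability group:  {within[0]:.1f}")
print(f"within high-ability group: {within[1]:.1f}")
```

The naive difference mixes the treatment effect with the income advantage of higher ability, while the within-group differences are close to the assumed true effect.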

Two themes of econometrics

• Identification of causes and effects in economics.
– Randomized controlled experiments/trials (RCT)
– Regression
• Prediction/forecast of economic outcomes.
– Pattern search using regression
– Pattern search using machine learning

Causality vs forecast
• To “forecast” whether it is raining or not, just look
outside your window and see if pedestrians are using
umbrellas.
– Using an umbrella does not cause it to rain.
• In 1958, William Phillips noted that a low
unemployment rate came hand in hand with high
inflation. Thus it seems we may forecast inflation
to be high when we observe a low unemployment rate,
or the other way around.
– The Phillips curve (1958), Samuelson and Solow (1960)
– New evidence in recent decades.

The Phillips curve: time series plot

The Phillips curve: 1948Q1-1960Q4 (scatterplot)
[Figure: unemployment (vertical axis) versus inflation (horizontal axis), with least squares fit Y = 4.86 - 0.411X.]

The Phillips curve: 1961Q1-1980Q4
[Figure: unemployment (vertical axis) versus inflation (horizontal axis), with least squares fit Y = 5.10 + 0.359X.]

The Phillips curve: 1981Q1-2020Q4
[Figure: actual and fitted unemployment (vertical axis) versus inflation (horizontal axis).]

The scatterplot

• A scatterplot is a diagram of the data on two
variables.
– A scatterplot provides a visual display of the relation
between two variables.
– A scatterplot shows only “correlation”, not a causal
link.
• A causal link has a direction.
• Correlation has no direction.
• The horizontal and vertical axes of the scatterplot
are interchangeable.
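The statement that correlation has no direction can be checked numerically: swapping the two variables leaves the correlation unchanged. A minimal sketch with made-up numbers (the five inflation and unemployment values are hypothetical, not the Phillips curve data):

```python
from statistics import mean


def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation coefficient."""
    return covariance(x, y) / (covariance(x, x) * covariance(y, y)) ** 0.5


def ols_slope(x, y):
    """Least squares slope from fitting y on x."""
    return covariance(x, y) / covariance(x, x)


# Hypothetical data for illustration only.
inflation = [0.5, 1.0, 1.5, 2.0, 3.0]
unemployment = [6.0, 5.5, 5.2, 4.1, 3.9]

print(correlation(inflation, unemployment))   # same value...
print(correlation(unemployment, inflation))   # ...with the axes swapped
print(ols_slope(inflation, unemployment))     # fit of unemployment on inflation
print(ols_slope(unemployment, inflation))     # fit of inflation on unemployment
```

The two least squares slopes differ because a fitted line treats one variable as the one being explained, yet neither fit says which variable causes the other. Their product equals the squared correlation.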

Phillips Curve: explanations
• 1948Q2-1960Q4: demand shocks caused unemployment and
inflation to move in opposite directions.
– A surge in demand → firms hire more workers → products
are more expensive due to higher demand.
• 1961Q1-1980Q4: supply shocks (oil market
disruptions) caused unemployment and inflation to
move in the same direction.
– A surge in energy prices → firms reduce production and hire
fewer workers to reduce cost → products are more expensive
due to higher cost.
• 1981Q1-2020Q4: the central bank became more determined to
keep inflation stable, even at the cost of high unemployment
rates. Inflation became less responsive to economic shocks.

Causality and forecast
• That X forecasts Y does not imply that X causes Y.
• Understanding the causal link between X and Y is
crucial for making “out-of-sample” forecasts.
– In the case of inflation and unemployment, it is usually a
common factor Z that drives X and Y together.
– Understanding the causal links among Z, X, and Y helps us make
forecasts about X and Y.
– Knowing only the correlation between X and Y helps with
“in-sample” forecasts, but could be misleading for “out-of-sample”
forecasts.
• “Out-of-sample” means either outside the time frame of the data or
the predictor’s value being outside the range covered by the data.
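The danger of relying on correlation alone can be sketched with two simulated regimes. All numbers are assumptions chosen for illustration: a common factor Z moves X and Y in opposite directions in the first regime (like a demand shock) and in the same direction in the second (like a supply shock). A line fitted in the first regime is then used to forecast in the second.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least squares intercept and slope for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope


def mse(x, y, intercept, slope):
    """Mean squared forecast error of the fitted line on the data (x, y)."""
    return mean((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


def regime(n, sign, rng):
    """X and Y driven by a common factor Z; `sign` sets how Z moves Y."""
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        xs.append(2 + z + rng.gauss(0, 0.3))
        ys.append(5 + sign * z + rng.gauss(0, 0.3))
    return xs, ys


rng = random.Random(42)
x_demand, y_demand = regime(500, -1, rng)   # Z moves X and Y in opposite directions
x_supply, y_supply = regime(500, +1, rng)   # Z moves X and Y in the same direction

intercept, slope = fit_line(x_demand, y_demand)
in_sample = mse(x_demand, y_demand, intercept, slope)
out_of_sample = mse(x_supply, y_supply, intercept, slope)

print(f"fitted slope:      {slope:.2f}")
print(f"in-sample MSE:     {in_sample:.2f}")
print(f"out-of-sample MSE: {out_of_sample:.2f}")
```

The fitted line forecasts well inside the regime it was estimated on and poorly once the underlying shock changes, even though the fit itself never signals the problem.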

Data sources
• Experimental data: data from a randomized controlled
experiment.
• Observational data (non-experimental data): data
obtained by observing actual behavior outside an
experimental setting, such as surveys and historical
records.
• We will use the following terms when referring to a
data set:
– Random variable: a variable that can take multiple possible
values.
– Observation: the observed value of a random variable.

An example of a data set
• For example, we have 5 observations for two random
variables, “wage” ($/hour) and “experience” (years), in
2012:
[Table not recovered from the extracted text.]
• In this example, each observation denotes a person in a given
period. For each person, we observe his/her wage and experience in
2012.

Data types/formats
• Cross-sectional data: a sample of individuals within a
specific time period; timing differences can be ignored.
– The individual can be a person, a firm, a country, a product, etc.
• Repeated cross-sectional or pooled cross-sectional data
– Combine samples from different periods. Each period may
include different individuals.
– The number of observations can be different across periods.

Data types
• Panel data: (also called longitudinal data)
– Observations for the same set of individuals over multiple periods.
– The following data consist of three variables for 48 states observed
over 11 years. There are 48*11 = 528 observations in total.

Alabama in 1985 and Alabama in 1986 are regarded as two different observations.
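The relation between panel, cross-sectional, and time series data can be sketched by building the index of a panel like the one described. The state labels are placeholders, and the year range 1985 to 1995 is an assumption; the source only mentions 48 states, 11 years, and the years 1985 and 1986.

```python
from itertools import product

# Placeholder labels for the 48 states (actual names are not needed here).
states = [f"state_{i:02d}" for i in range(1, 49)]
# Assumed 11-year window starting in 1985.
years = range(1985, 1996)

# Each (state, year) pair is one observation in the panel.
panel_index = list(product(states, years))
print(len(panel_index))  # 48 * 11 observations

# Fixing one period gives a cross-section of all states.
cross_section_1985 = [(s, y) for s, y in panel_index if y == 1985]

# Fixing one state gives a time series for that state.
series_state_01 = [(s, y) for s, y in panel_index if s == "state_01"]

print(len(cross_section_1985), len(series_state_01))
```

In this layout the same state in two different years counts as two separate observations, which is why the total is the product of the number of states and the number of years.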
Data types
• Time series data
– Observations on one or more variables over time.
– Usually come with a given frequency, such as annually,
quarterly, monthly, weekly, daily, etc.
– Example: annual GDP, unemployment rate, stock prices in Hong
Kong between 2010 and 2021.

Summary
Causality, forecast, experiment,
scatterplot, data sources and types
