Introduction to Econometrics
I.1 What Is Econometrics?
Econometrics is the application of statistical methods to economic
data in order to test hypotheses and estimate relationships between
economic variables.
I.2 Why a Separate Discipline?
Econometrics stands apart from both economic theory and mathematical
statistics because it combines economic theory, mathematical modeling,
and statistical inference to analyze economic data, which are typically
non-experimental.
I.3 Methodology of Econometrics
The econometric methodology involves the following steps:
1. Statement of Theory or Hypothesis: formulate a theoretical
framework or hypothesis to be tested.
2. Specification of the Mathematical Model: translate the theoretical
framework into a mathematical model.
3. Specification of the Econometric Model: specify the econometric
model, including the variables to be included and the functional
form of the model.
4. Obtaining Data: collect and prepare the data for analysis.
5. Estimation of the Econometric Model: estimate the parameters of
the econometric model using statistical methods.
6. Hypothesis Testing: test hypotheses about the parameters of the
model.
7. Forecasting or Prediction: use the estimated model to forecast or
predict future values of the dependent variable.
8. Use of the Model for Control or Policy Purposes: use the
estimated model to inform policy decisions or control variables.
I.4 Types of Econometrics
There are several types of econometrics, including:
Theoretical Econometrics: focuses on the development of
econometric theory and models.
Applied Econometrics: focuses on the application of econometric
methods to real-world data.
Time Series Econometrics: focuses on the analysis of time series
data.
I.5 Mathematical and Statistical Prerequisites
Econometrics requires a strong foundation in mathematics and
statistics, including:
Calculus: used to optimize functions and derive mathematical
models.
Linear Algebra: used to manipulate matrices and vectors.
Probability Theory: used to understand random variables and
probability distributions.
Statistical Inference: used to make inferences about populations
based on sample data.
I.6 The Role of the Computer
Computers play a crucial role in econometrics, as they are used to:
Estimate econometric models: using statistical software packages.
Analyze and manipulate data: using spreadsheet software and
statistical software packages.
Simulate economic models: using specialized software packages.
Chapter 1: The Nature of Regression Analysis
1.1 Historical Origin of the Term Regression
The term "regression" was first coined by Sir Francis Galton in the late
19th century. Galton observed that the height of offspring tended to
"regress" or move closer to the average height of the population,
rather than exceeding the height of their parents.
1.2 The Modern Interpretation of Regression
In modern statistics, regression analysis refers to the process of
establishing a mathematical relationship between two or more
variables. This relationship is often used to predict the value of one
variable based on the value of another variable.
Examples
The relationship between the amount of rainfall and the yield of
crops.
The relationship between the price of a house and its size.
The relationship between the dosage of a drug and its effect on
blood pressure.
1.3 Statistical versus Deterministic Relationships
A statistical relationship is one that is based on probability, whereas a
deterministic relationship is one that is exact and predictable.
Regression analysis deals with statistical relationships.
1.4 Regression versus Causation
Regression analysis can identify relationships between variables, but
it cannot establish causation. In other words, just because two
variables are related, it does not mean that one variable causes the
other.
1.5 Regression versus Correlation
Regression analysis is concerned with estimating the average value of the
dependent variable for given values of the independent variable(s),
whereas correlation analysis is concerned only with measuring the
strength and direction of the linear association between two variables,
treating both variables symmetrically.
1.6 Terminology and Notation
Dependent variable: the variable being predicted or explained.
Independent variable: the variable used to predict or explain the
dependent variable.
Regression equation: the mathematical equation that describes the
relationship between the dependent and independent variables.
1.7 The Nature and Sources of Data for Economic Analysis
Data is the raw material of statistical analysis. In economic analysis,
data can come from a variety of sources, including government
agencies, surveys, and experiments.
Types of Data
Time series data: data collected over time.
Cross-sectional data: data collected at a single point in time.
Panel data: data collected over time for multiple individuals or units.
The Sources of Data
Government agencies: such as the Bureau of Labor Statistics or the
Census Bureau.
Surveys: such as the Current Population Survey or the Consumer
Expenditure Survey.
Experiments: such as randomized controlled trials.
The Accuracy of Data
Data accuracy is crucial in statistical analysis. Errors in data can lead
to incorrect conclusions.
A Note on the Measurement Scales of Variables
Variables can be measured on different scales, including:
Nominal scale: a scale that categorizes variables without implying
any sort of order.
Ordinal scale: a scale that categorizes variables in a way that
implies a certain order or ranking.
Interval scale: a scale that measures variables in a way that implies
both order and exact differences between values.
Ratio scale: a scale that measures variables in a way that implies
order, exact differences, and a true zero point.
Chapter 2: Two-Variable Regression Analysis
2.1 A Hypothetical Example
Consider a simple example of how the price of a house (Y) is related to
its size (X). We can collect data on these two variables and analyze
their relationship.
2.2 The Concept of Population Regression Function (PRF)
The PRF is a mathematical function that describes the relationship
between two variables, X and Y, for the entire population.
Example of Population Regression Function (PRF)
Suppose we want to study the relationship between the amount of
money spent on advertising (X) and the sales of a product (Y) for all
companies in a particular industry.
The population regression function (PRF) for this relationship might
be:
Y = β₁ + β₂X + ε
where:
Y = sales of the product
X = amount of money spent on advertising
β₁ = intercept or constant term
β₂ = slope coefficient
ε = stochastic disturbance term
For example, suppose the PRF for this industry is:
Y = 100 + 0.05X + ε
This PRF indicates that for every additional dollar spent on
advertising, sales increase by 5 cents, on average. The intercept term
(100) represents the average sales when no money is spent on
advertising.
Note that the PRF describes the relationship between X and Y for the
entire population of companies in the industry.
2.3 The Meaning of the Term Linear
In regression analysis, "linear" refers to two types of linearity:
Linearity in the Variables: the relationship between X and Y is linear.
Linearity in the Parameters: the parameters of the regression
equation are linear.
2.4 Stochastic Specification of PRF
The PRF can be specified as:
Y = β₁ + β₂X + ε
where ε is a stochastic disturbance term.
2.5 The Significance of the Stochastic Disturbance Term
The stochastic disturbance term (ε) represents the random factors that
affect the relationship between X and Y.
2.6 The Sample Regression Function (SRF)
The Sample Regression Function (SRF) is an estimate of the
Population Regression Function (PRF) based on a sample of data.
Formula
The SRF can be written as:
Ŷ = b₀ + b₁X
where:
Ŷ = predicted value of Y
b₀ = estimated intercept term
b₁ = estimated slope coefficient
X = independent variable
Estimation of SRF
The SRF is estimated using the method of ordinary least squares
(OLS), which minimizes the sum of the squared errors between the
observed values of Y and the predicted values of Ŷ.
Example
Suppose we have a sample of data on the relationship between the
amount of money spent on advertising (X) and the sales of a product
(Y). The SRF might be:
Ŷ = 120 + 0.03X
This SRF indicates that for every additional dollar spent on
advertising, sales increase by 3 cents, on average. The intercept term
(120) represents the average sales when no money is spent on
advertising.
Note that the SRF is an estimate of the PRF, and the estimated
coefficients (b₀ and b₁) may not be exactly equal to the true population
parameters (β₁ and β₂).
2.7 Illustrative Examples
Consider the following examples:
The relationship between the amount of rainfall (X) and the yield of
crops (Y).
The relationship between the price of a house (X) and its size (Y).
Example 1: Rainfall and Crop Yield
Suppose we want to study the relationship between the amount of rainfall
(X) and the yield of crops (Y). We collect data on these two variables for a
sample of farms.
The data might look like this:
X (Rainfall) Y (Crop Yield)
10 50
20 70
30 90
40 110
50 130
The sample regression function (SRF) fitted to these data is:
Ŷ = 30 + 2X
This SRF indicates that for every additional inch of rainfall, the crop yield
increases by 2 units, on average.
Example 2: House Price and Size
Suppose we want to study the relationship between the price of a house
(X) and its size (Y). We collect data on these two variables for a sample
of houses.
The data might look like this:
X (House Price) Y (House Size)
100,000 1000
150,000 1200
200,000 1500
250,000 1800
300,000 2000
The sample regression function (SRF) fitted to these data is:
Ŷ = 460 + 0.0052X
This SRF indicates that, on average, the size of a house is larger by 0.0052
square feet for every additional dollar of price (about 5.2 square feet per
additional $1,000).
2.1 A Hypothetical Example (worked in detail)
Suppose we want to study the relationship between the amount of
money spent on advertising (X) and the sales of a product (Y). We
collect the following data for a sample of five companies:
Company Advertising (X) Sales (Y)
A 100 1000
B 200 1200
C 300 1500
D 400 1800
E 500 2000
We can see that as the amount of money spent on advertising
increases, the sales of the product also tend to increase.
Following the PRF specification of Section 2.2,
Y = β₁ + β₂X + ε
we can estimate the parameters β₁ and β₂ from this sample using the
method of ordinary least squares (OLS). Here are the steps:
Step 1: Calculate the means of X and Y
X̄ = (100 + 200 + 300 + 400 + 500) / 5 = 300
Ȳ = (1000 + 1200 + 1500 + 1800 + 2000) / 5 = 1500
Step 2: Calculate the deviations from the means
Company X Y X - X̄ Y - Ȳ
A 100 1000 -200 -500
B 200 1200 -100 -300
C 300 1500 0 0
D 400 1800 100 300
E 500 2000 200 500
Step 3: Calculate the slope coefficient β̂₂
β̂₂ = Σ[(Xᵢ - X̄)(Yᵢ - Ȳ)] / Σ(Xᵢ - X̄)²
= [(-200)(-500) + (-100)(-300) + (0)(0) + (100)(300) + (200)(500)] / [(-200)²
+ (-100)² + 0² + 100² + 200²]
= 260,000 / 100,000
= 2.6
Step 4: Calculate the intercept term β̂₁
β̂₁ = Ȳ - β̂₂X̄
= 1500 - 2.6(300)
= 1500 - 780
= 720
Therefore, the estimated parameters are:
β̂₁ = 720
β̂₂ = 2.6
and the equation of the sample regression line is:
Ŷ = 720 + 2.6X
Each additional dollar of advertising is associated with about 2.6 additional
units of sales, on average.
Example: Relationship between Years of Experience and Salary
Suppose we want to study the relationship between the years of
experience (X) and the salary (Y) of a group of employees. We collect
data on these two variables for a sample of 5 employees.
Step 1: Calculate the means of X and Y
X̄ = (2 + 4 + 6 + 8 + 10) / 5 = 6
Ȳ = (40000 + 50000 + 60000 + 70000 + 80000) / 5 = 60000
Step 2: Calculate the deviations from the means
Employee X Y X - X̄ Y - Ȳ
A 2 40000 -4 -20000
B 4 50000 -2 -10000
C 6 60000 0 0
D 8 70000 2 10000
E 10 80000 4 20000
Step 3: Calculate the slope coefficient (b₁)
b₁ = Σ[(Xᵢ - X̄)(Yᵢ - Ȳ)] / Σ(Xᵢ - X̄)²
= [(-4)(-20000) + (-2)(-10000) + (0)(0) + (2)(10000) + (4)(20000)] / [(-4)² +
(-2)² + 0² + 2² + 4²]
= 200,000 / 40
= 5,000
Step 4: Calculate the intercept coefficient (b₀)
b₀ = Ȳ - b₁X̄
= 60000 - 5000(6)
= 60000 - 30000
= 30000
Step 5: Write the equation of the sample regression line
Ŷ = b₀ + b₁X
= 30000 + 5000X
Chapter 3: Two-Variable Regression Model: The Problem of Estimation
3.1 The Method of Ordinary Least Squares (OLS)
The OLS method is used to estimate the parameters of a linear
regression model. The goal is to minimize the sum of the squared
errors between the observed values of Y and the predicted values of Ŷ.
3.2 The Classical Linear Regression Model: Assumptions
The classical linear regression model assumes:
1. Linearity: The relationship between X and Y is linear.
2. Constant variance: The variance of the error term is constant.
3. Independence: Each observation is independent of the others.
4. Normality: The error term is normally distributed.
5. No multicollinearity: The independent variables are not highly
correlated.
3.3 Precision or Standard Errors of Least-Squares Estimates
The standard error of an estimate measures its precision. A smaller
standard error indicates a more precise estimate.
3.4 Properties of Least-Squares Estimators: The Gauss-Markov
Theorem
The Gauss-Markov theorem states that the OLS estimator is the best
linear unbiased estimator (BLUE) of the true parameter value.
3.5 The Coefficient of Determination (r²): A Measure of "Goodness of
Fit"
The r² measures the proportion of the variation in Y that is explained
by X. A higher r² indicates a better fit.
The Gauss-Markov Theorem
The Gauss-Markov theorem is a fundamental concept in statistics that
establishes the properties of the ordinary least squares (OLS)
estimator. The theorem states that the OLS estimator is the best linear
unbiased estimator (BLUE) of the true parameter value.
Properties of the OLS Estimator
The Gauss-Markov theorem establishes the following properties of the
OLS estimator:
1. Unbiasedness: The OLS estimator is unbiased, meaning that its
expected value is equal to the true parameter value.
2. Linearity: The OLS estimator is a linear function of the dependent
variable.
3. Minimum Variance: The OLS estimator has the minimum variance
among all unbiased linear estimators.
The Coefficient of Determination (r²)
The coefficient of determination, denoted by r², is a measure of the
goodness of fit of a regression model. It represents the proportion of
the variation in the dependent variable that is explained by the
independent variable(s).
Interpretation of r²
The value of r² ranges from 0 to 1, where:
r² = 0 indicates that the regression model does not explain any of
the variation in the dependent variable.
r² = 1 indicates that the regression model explains all of the
variation in the dependent variable.
0 < r² < 1 indicates that the regression model explains some, but
not all, of the variation in the dependent variable.
Example
Suppose we have a regression model that predicts the price of a
house based on its size. The r² value is 0.8. This means that 80% of the
variation in the price of the house is explained by its size.
Calculating r²
The r² value can be calculated using the following formula:
r² = 1 - (SSE / SST)
where:
SSE is the sum of the squared errors
SST is the total sum of squares
3.6 Numerical Examples
Suppose we have the following data:
X Y
1 2
2 4
3 6
4 8
5 10
Using OLS, we estimate the regression line as:
Ŷ = 0 + 2X
The r² is 1, since these data lie exactly on the line Y = 2X.
Another example:
X Y
2 3
4 5
6 7
8 9
10 11
Using OLS, we estimate the regression line as:
Ŷ = 1 + X
The r² is 1, since these data lie exactly on the line Y = 1 + X.
3.7 Illustrative Examples
The relationship between the amount of rainfall and the yield of
crops.
The relationship between the price of a house and its size.
The relationship between the number of hours studied and the
exam score.
The relationship between the amount of exercise and the weight
loss.
3.8 A Note on Monte Carlo Experiments
Monte Carlo experiments are a statistical technique used to study the
properties of estimators by simulating data. The basic idea is to
generate artificial data that mimics the characteristics of real data, and
then use this data to estimate the parameters of interest.
Steps involved in a Monte Carlo experiment:
1. Specify the model: Define the statistical model that you want to
study, including the parameters of interest.
2. Generate artificial data: Use a random number generator to
generate artificial data that mimics the characteristics of real data.
3. Estimate the parameters: Use the artificial data to estimate the
parameters of interest.
4. Repeat the process: Repeat steps 2-3 many times (e.g. 1000
times) to generate a large number of estimates.
5. Analyze the results: Analyze the distribution of the estimates to
study the properties of the estimator, such as its bias, variance,
and mean squared error.
Advantages of Monte Carlo experiments:
1. Flexibility: Monte Carlo experiments can be used to study a wide
range of statistical models and estimators.
2. Control: By generating artificial data, you have complete control
over the characteristics of the data.
3. Repeatability: Monte Carlo experiments can be repeated many
times to generate a large number of estimates.
Common applications of Monte Carlo experiments:
1. Evaluating estimator performance: Monte Carlo experiments can
be used to evaluate the performance of different estimators, such
as their bias, variance, and mean squared error.
2. Studying the effects of outliers: Monte Carlo experiments can be
used to study the effects of outliers on estimator performance.
3. Investigating the properties of statistical tests: Monte Carlo
experiments can be used to investigate the properties of
statistical tests, such as their power and size.
3A.1 Derivation of Least-Squares Estimates
The least-squares estimates are derived by minimizing the sum of the
squared errors (SSE) between the observed values of Y and the
predicted values of Ŷ.
Step 1: Define the Sum of Squared Errors (SSE)
SSE = Σ(Yᵢ - Ŷᵢ)²
where:
Yᵢ is the observed value of Y
Ŷᵢ is the predicted value of Y
Step 2: Define the Predicted Value of Y
Ŷᵢ = β₀ + β₁Xᵢ
where:
β₀ is the intercept term
β₁ is the slope coefficient
Xᵢ is the value of the independent variable
Step 3: Substitute the Predicted Value of Y into the SSE Equation
SSE = Σ(Yᵢ - (β₀ + β₁Xᵢ))²
Step 4: Expand the SSE Equation
SSE = Σ(Yᵢ² - 2β₀Yᵢ - 2β₁XᵢYᵢ + β₀² + 2β₀β₁Xᵢ + β₁²Xᵢ²)
Step 5: Minimize the SSE Equation with Respect to β₀ and β₁
To minimize the SSE equation, we take the partial derivatives of SSE
with respect to β₀ and β₁, and set them equal to zero:
∂SSE/∂β₀ = -2Σ(Yᵢ - β₀ - β₁Xᵢ) = 0
∂SSE/∂β₁ = -2Σ(Xᵢ(Yᵢ - β₀ - β₁Xᵢ)) = 0
Step 6: Solve for β₀ and β₁
Solving the above equations simultaneously, we get:
β̂₁ = Σ[(Xᵢ - X̄)(Yᵢ - Ȳ)] / Σ(Xᵢ - X̄)², equivalently Σ(XᵢYᵢ - X̄Ȳ) / Σ(Xᵢ² - X̄²)
β̂₀ = Ȳ - β̂₁X̄
where:
X̄ is the mean of X
Ȳ is the mean of Y
These are the least-squares estimates of β₀ and β₁.
3A.2 Linearity and Unbiasedness Properties of Least-Squares
Estimators
The least-squares estimators of β₀ and β₁ have two important
properties:
1. Linearity: The least-squares estimators are linear functions of the
dependent variable Y.
2. Unbiasedness: The least-squares estimators are unbiased,
meaning that their expected values are equal to the true
parameter values.
Linearity Property
The least-squares estimator of β₁ can be written as:
β̂₁ = Σ(XᵢYᵢ - X̄ Ȳ) / Σ(Xᵢ² - X̄ ²)
This estimator is a linear function of Y, since it involves a linear
combination of the products of Xᵢ and Yᵢ.
Similarly, the least-squares estimator of β₀ can be written as:
β̂₀ = Ȳ - β̂₁X̄
This estimator is also a linear function of Y, since it involves a linear
combination of Ȳ and β̂₁X̄ .
Unbiasedness Property
To show that the least-squares estimators are unbiased, we need to
show that their expected values are equal to the true parameter values.
E(β̂₁) = E[Σ(XᵢYᵢ - X̄Ȳ) / Σ(Xᵢ² - X̄²)]
= Σ[XᵢE(Yᵢ) - X̄E(Ȳ)] / Σ(Xᵢ² - X̄²)   (treating the Xᵢ as fixed)
= Σ[Xᵢ(β₀ + β₁Xᵢ) - X̄(β₀ + β₁X̄)] / Σ(Xᵢ² - X̄²)   (using E(Yᵢ) = β₀ + β₁Xᵢ)
= [β₀Σ(Xᵢ - X̄) + β₁Σ(Xᵢ² - X̄²)] / Σ(Xᵢ² - X̄²)
= β₁   (since Σ(Xᵢ - X̄) = 0)
Similarly, we can show that E(β̂₀) = β₀.
Therefore, the least-squares estimators of β₀ and β₁ are unbiased.
3A.3 Variances and Standard Errors of Least-Squares Estimators
The variances and standard errors of the least-squares estimators of
β₀ and β₁ are derived using the properties of the error term.
Assumptions
We assume that the error term εᵢ has the following properties:
1. Zero mean: E(εᵢ) = 0
2. Constant variance: Var(εᵢ) = σ²
3. Independence: εᵢ and εⱼ are independent for i ≠ j
4. Normality: εᵢ ~ N(0, σ²)
Variance of β̂₁
The variance of β̂₁ is given by:
Var(β̂₁) = σ² / Σ(Xᵢ² - X̄ ²)
where:
σ² is the variance of the error term
Xᵢ is the value of the independent variable
X̄ is the mean of the independent variable
Variance of β̂₀
The variance of β̂₀ is given by:
Var(β̂₀) = σ² * (1 / n + X̄ ² / Σ(Xᵢ² - X̄ ²))
where:
n is the sample size
X̄ is the mean of the independent variable
Standard Errors of β̂₁ and β̂₀
The standard errors of β̂₁ and β̂₀ are given by:
SE(β̂₁) = √Var(β̂₁) = σ / √Σ(Xᵢ² - X̄ ²)
SE(β̂₀) = √Var(β̂₀) = σ * √(1 / n + X̄ ² / Σ(Xᵢ² - X̄ ²))
where:
σ is the standard deviation of the error term
Note that the standard errors of β̂₁ and β̂₀ are used to construct
confidence intervals and perform hypothesis tests.
Practice Problem 1:
Data:
X Y
2 4
4 6
6 8
8 10
10 12
Estimate the Regression Line using OLS:
First, calculate the means of X and Y:
X̄ = (2 + 4 + 6 + 8 + 10) / 5 = 6
Ȳ = (4 + 6 + 8 + 10 + 12) / 5 = 8
Next, calculate the slope coefficient (β₁) and the intercept term (β₀):
β₁ = Σ(XᵢYᵢ - X̄ Ȳ) / Σ(Xᵢ² - X̄ ²) = 1
β₀ = Ȳ - β₁X̄ = 2
The estimated regression line is:
Ŷ = 2 + 1X
Calculate the r²:
First, calculate the total sum of squares (SST):
SST = Σ(Yᵢ - Ȳ)² = 40
Next, calculate the sum of squared errors (SSE):
SSE = Σ(Yᵢ - Ŷᵢ)² = 0
Finally, calculate the r²:
r² = 1 - (SSE / SST) = 1 - (0 / 40) = 1
The r² value of 1 indicates a perfect fit.
Practice Problem 2:
Here's a hypothetical dataset:
X Y
1 3
2 5
3 7
4 9
5 11
6 13
7 15
8 17
9 19
10 21
Using OLS, we can estimate the regression line as:
Ŷ = 1 + 2X
The r² value is:
r² = 1
This indicates that all of the variation in Y is explained by X: the data lie
exactly on the fitted line.
Here's a breakdown of the calculations:
Estimate the Regression Line:
First, calculate the means of X and Y:
X̄ = (1 + 2 + ... + 10) / 10 = 5.5
Ȳ = (3 + 5 + ... + 21) / 10 = 12
Next, calculate the slope coefficient (β₁) and the intercept term (β₀):
β₁ = Σ(XᵢYᵢ - X̄Ȳ) / Σ(Xᵢ² - X̄²) = (825 - 660) / (385 - 302.5) = 2
β₀ = Ȳ - β₁X̄ = 12 - 2(5.5) = 1
The estimated regression line is:
Ŷ = 1 + 2X
Calculate the r²:
First, calculate the total sum of squares (SST):
SST = Σ(Yᵢ - Ȳ)² = 330
Next, calculate the sum of squared errors (SSE):
SSE = Σ(Yᵢ - Ŷᵢ)² = 0
Finally, calculate the r²:
r² = 1 - (SSE / SST) = 1 - (0 / 330) = 1
Every observation lies exactly on the fitted line, so all of the variation in Y
is explained by X.
3A.4 Covariance Between β̂₁ and β̂₂
In the two-variable model, where β₁ is the intercept and β₂ is the slope, the
covariance between the two least-squares estimators is:
Cov(β̂₁, β̂₂) = -X̄ · Var(β̂₂) = -X̄σ² / Σ(Xᵢ - X̄)²
where:
σ² is the variance of the error term
X̄ is the mean of the independent variable
Σ denotes the summation over all observations
Because the covariance has the opposite sign of X̄, samples that overestimate
the slope tend to underestimate the intercept (and vice versa) when X̄ > 0.
3A.5 The Least-Squares Estimator of σ²
The least-squares estimator of σ² is given by:
σ̂² = SSE / (n - k)
where:
SSE is the sum of squared errors (the residual sum of squares)
n is the sample size
k is the number of estimated parameters, including the intercept; in the
two-variable model k = 2, so σ̂² = SSE / (n - 2)
3A.6 Minimum-Variance Property of Least-Squares Estimators
The minimum-variance property of least-squares estimators states
that among all unbiased linear estimators, the least-squares
estimators have the smallest variance.
Definition of Minimum-Variance Property
An estimator β̂ is said to have the minimum-variance property if:
1. β̂ is an unbiased estimator of β
2. Var(β̂) ≤ Var(β̃) for any other unbiased linear estimator β̃
Proof of Minimum-Variance Property
To prove that the least-squares estimator β̂ has the minimum-variance
property, we can use the following steps:
1. Show that β̂ is an unbiased estimator of β
2. Show that Var(β̂) ≤ Var(β̃) for any other unbiased linear estimator
β̃
Gauss-Markov Theorem
The Gauss-Markov theorem provides a formal proof of the minimum-
variance property of least-squares estimators. The theorem states that
the least-squares estimator β̂ is the best linear unbiased estimator
(BLUE) of β, meaning that it has the smallest variance among all
unbiased linear estimators.
Implications of Minimum-Variance Property
The minimum-variance property of least-squares estimators has
several important implications:
1. Efficiency: Least-squares estimators are efficient, meaning that
they make the most use of the available data.
2. Reliability: Least-squares estimators are reliable, meaning that
they provide consistent estimates of the true parameter values.
3. Optimality: Least-squares estimators are optimal, meaning that
they are the best possible estimators among all unbiased linear
estimators.
3A.7 Consistency of Least-Squares Estimators
The consistency property of least-squares estimators states that as
the sample size n increases, the least-squares estimators β̂₁ and β̂₂
converge in probability to the true parameter values β₁ and β₂,
respectively.
Definition of Consistency
An estimator β̂ is said to be consistent if:
1. β̂ converges in probability to the true parameter value β as the
sample size n increases.
2. The probability of β̂ deviating from β by more than a small amount
ε tends to zero as n increases.
Proof of Consistency
To prove that the least-squares estimators are consistent, we can use
the following steps:
1. Show that the least-squares estimators are unbiased.
2. Show that the variance of the least-squares estimators tends to
zero as the sample size n increases.
Implications of Consistency
The consistency property of least-squares estimators has several
important implications:
1. Reliability: Least-squares estimators are reliable, meaning that
they provide consistent estimates of the true parameter values.
2. Accuracy: Least-squares estimators are accurate, meaning that
they converge to the true parameter values as the sample size
increases.
3. Large-Sample Properties: The consistency property ensures that
the least-squares estimators have desirable large-sample
properties, such as asymptotic normality and efficiency.
Asymptotic Properties
The consistency property of least-squares estimators also implies that
they have desirable asymptotic properties, such as:
1. Asymptotic Normality: The least-squares estimators are
asymptotically normally distributed.
2. Asymptotic Efficiency: The least-squares estimators are
asymptotically efficient, meaning that they achieve the lowest
possible variance among all unbiased estimators.
Chapter Solutions
Step 1 of 17
The Consumer Price Index (CPI) measures the weighted average of the prices of consumer goods and
services purchased in an economy. Table 1.1 gives data on the CPI of 7 countries over the period
1980-2005, with the 1982-84 average set equal to 100 as the base of the index.
Step 2 of 17
a.
The inflation rate measures the rate of increase of the price level in an economy over a period of time.
To find the inflation rate for the current year, subtract the previous year's CPI from the current year's
CPI, divide the difference by the previous year's CPI, and multiply the result by 100:
Inflation rate in year t = [(CPIₜ - CPIₜ₋₁) / CPIₜ₋₁] × 100
The CPI of country U is 82.4 in 1980 and 90.9 in 1981, so the inflation rate of country U in 1981 is
[(90.9 - 82.4) / 82.4] × 100 = 10.32%.
Similarly, the CPI of country G is 131.1 in 1994 and 133.3 in 1995, so the inflation rate of country G in
1995 is
[(133.3 - 131.1) / 131.1] × 100 = 1.68%.
The inflation rates of all 7 countries for each year, calculated in the same way, are reported in Table 1.2.
Step 3 of 17
(Table 1.2, the computed inflation rates for the 7 countries, is not reproduced here.)
Step 4 of 17
b.
Plot the inflation rates of the 7 countries for each year, using Table 1.2, with the inflation rate on the
vertical axis and time on the horizontal axis. (The resulting graph, Graph 1.1, is not reproduced here.)
Step 5 of 17
c.
Graph 1.1, which shows the inflation rates of the 7 countries, can be divided into 4 periods, and a
separate conclusion can be drawn for each period.
From 1981 to 1986, inflation rates are generally declining. From 1987 to 1990, they are generally
rising. From 1991 to 1994, they are declining again. And from 1995 to 2005, they are roughly
constant.
Step 6 of 17
d.
The standard deviation can be used to measure the variability of each country's inflation rate over
time. It measures how far the observations in a data set lie from the mean of that data set.
The formula for the (sample) standard deviation σ is
σ = √[ Σ(Xᵢ - X̄)² / (n - 1) ]
where n is the number of observations and X̄ is the mean of the data set.
Calculate the standard deviation of the inflation rate of country U using this formula.
Step 7 of 17
The standard deviation of the inflation rate of country U is 1.77.
Step 8 of 17
Similarly, the standard deviation of the inflation rate of country C is 2.79.
Step 9 of 17
Calculate the standard deviation of the inflation rate of country J.
Step 10 of 17
The standard deviation of the inflation rate of country J is 1.45.
Step 11 of 17
The standard deviation of the inflation rate of country F is 3.34.
Step 12 of 17
Calculate the standard deviation of the inflation rate of country G.
Step 13 of 17
The standard deviation of the inflation rate of country G is 1.57.
Step 14 of 17
The standard deviation of the inflation rate of country I is 4.48.
Step 15 of 17
Calculate the standard deviation of the inflation rate of country B.
Step 16 of 17
The standard deviation of the inflation rate of country B is 2.60.
Step 17 of 17
The standard deviation is highest for country I. Therefore, the inflation rate of country I is the most
variable of the 7 countries.