
Introduction to Regression: The Classical Linear Regression Model (CLRM)

Why Do We Do Regressions?

Regression analysis is a fundamental econometric method used to:

 Reduce uncertainty by quantifying relationships between variables.
 Support planning and decision-making, especially in economics, finance, and business.

However, building a good model requires careful effort:

 Including too many variables may introduce irrelevant information (overfitting, unnecessary complexity).
 Including too few variables can lead to misspecification by omitting important influences or using the wrong functional form.

The Classical Linear Regression Model (CLRM)

CLRM is used to understand the relationship between two or more variables, typically
involving:

 A dependent variable (Y) – the one we want to explain or predict.
 An independent variable (X) – the one used to explain changes in Y.

In its simplest form (with just one X), the model assumes a linear relationship between X and
Y:

E(Yₜ) = a + βXₜ

 E(Yₜ): Expected value of Y at time t.
 a: Intercept (value of Y when X = 0).
 β: Slope (change in Y due to a one-unit change in X).
 Xₜ: Value of the independent variable at time t.

However, real-world data seldom follows the expected relationship exactly. So we add a
disturbance term (uₜ) to capture the difference between actual and expected values:

Yₜ = a + βXₜ + uₜ
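To make the two equations concrete, here is a minimal sketch (not from the chapter) that simulates data satisfying the model in Python; the values a = 2.0, β = 0.5 and the error standard deviation are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100
a, beta = 2.0, 0.5                 # hypothetical "true" population parameters
X = rng.uniform(0, 10, size=n)     # values of the independent variable
u = rng.normal(0, 1, size=n)       # disturbance term: what the line cannot explain

expected_Y = a + beta * X          # E(Y_t) = a + beta * X_t
Y = expected_Y + u                 # actual values: Y_t = a + beta * X_t + u_t
```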

Why Does the Disturbance Term uₜ Exist?


Several reasons:

1. Omitted Variables: Not all relevant factors affecting Y may be included.
2. Aggregation: Simplifying many variables into one may leave residual variation.
3. Model Misspecification: The model structure may be incorrect (e.g., using Xₜ instead of Xₜ₋₁).
4. Functional Misspecification: The true relationship might be non-linear.
5. Measurement Error: Mistakes in data collection for Y or X.

Can We Estimate the Population Regression Function?

 The true (population) regression is unknown and unobservable.
 But we can estimate it using a sample of data.
 The first step is often to create a scatter plot of Y vs X.

Fitting a Line to the Data:

Several naïve methods to fit a line:

1. Drawing a line by eye.
2. Connecting the first and last data points.
3. Connecting the averages of early and late observations.

These are subjective and imprecise.

The Proper Method: Ordinary Least Squares (OLS)

OLS is the standard and statistically justified method for estimating the regression line. It:

 Minimizes the sum of squared residuals (the differences between actual and predicted
Y values).
 Provides estimators for a and β with desirable properties under the CLRM assumptions.

OLS is the focus of the next part of the discussion or chapter.

Ordinary Least Squares (OLS) Method of Estimation (No Derivations)
1. Purpose of OLS

OLS is used to estimate the relationship between a dependent variable Y and an explanatory
(independent) variable X using sample data. The goal is to find the line that best fits the data,
represented as:

Yₜ = a + βXₜ + uₜ

Since the population parameters a and β are unknown, we use sample data to estimate them:

Ŷₜ = â + β̂Xₜ

2. Why Use the OLS Method?

OLS works by minimizing the sum of the squared residuals — the differences between the
actual values and the predicted values. It’s the most popular estimation method because it has
desirable statistical properties:

 Unbiased: On average, it gives the correct parameter values.
 Efficient: It provides the most precise estimates under the classical assumptions.
 Consistent: Estimates improve as sample size increases.

3. Why Minimize Squared Residuals?

Minimizing the squared residuals has useful properties:

1. Eliminates sign issues: Squaring avoids positive and negative errors canceling each
other out.
2. Penalizes large errors more heavily: Squared terms give more weight to larger
deviations.
3. Produces efficient, unbiased estimates under the CLRM assumptions.

4. OLS Estimators

OLS provides estimates for:

 The slope (β̂): tells us how much Y changes when X increases by one unit.
 The intercept (â): the predicted value of Y when X = 0.
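As an illustration (the chapter deliberately skips derivations), the standard closed-form OLS solutions for the one-regressor case can be computed directly. This sketch assumes X and Y are NumPy arrays such as the simulated ones above.

```python
import numpy as np

def ols_simple(X, Y):
    """Return (a_hat, beta_hat) for the fitted line Y_hat = a_hat + beta_hat * X."""
    x_bar, y_bar = X.mean(), Y.mean()
    # Slope: cross-deviations of X and Y divided by the squared deviations of X
    beta_hat = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)
    # Intercept: the OLS line passes through the point of sample means
    a_hat = y_bar - beta_hat * x_bar
    return a_hat, beta_hat

a_hat, beta_hat = ols_simple(X, Y)
```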

Once estimated, the regression line can be used to:


 Interpret relationships
 Predict outcomes
 Assess model accuracy

5. Practical Use

To apply OLS:

1. Collect sample data for X and Y.
2. Use software (or formulas) to estimate â and β̂, as in the sketch below.
3. Use the equation Ŷ = â + β̂X to make predictions or analyze the relationship.
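The chapter mentions packages such as EViews and Stata; one freely available alternative is Python's statsmodels. The sketch below applies the three steps to arrays X and Y like those simulated earlier; the prediction point X = 7.5 is arbitrary.

```python
import statsmodels.api as sm

# Step 2: estimate a_hat and beta_hat
X_design = sm.add_constant(X)          # adds the column of ones needed for the intercept
results = sm.OLS(Y, X_design).fit()
a_hat, beta_hat = results.params       # estimated intercept and slope
print(results.summary())               # coefficients, standard errors, t-statistics, R², etc.

# Step 3: use the fitted equation to predict Y at a chosen X value
y_pred = a_hat + beta_hat * 7.5
```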

The Assumptions of the Classical Linear Regression Model (CLRM)

To ensure that OLS estimators are unbiased, consistent, and efficient, the following
assumptions must hold:

1. Linearity

The model is linear in parameters:

Yₜ = α + βXₜ + uₜ

The relationship between Y and X is assumed to be linear in form.

2. Variation in Xt

There must be variability in the independent variable X. If all X values are the same, we
cannot estimate a relationship.

3. Non-Stochastic Xt

The values of X are fixed in repeated samples and not random. This means:

 Xₜ is not influenced by the random error uₜ
 Xₜ and uₜ are uncorrelated

4. Zero Mean of the Disturbance Term

The expected value of the error term is zero:

E(uₜ) = 0

This ensures that the regression line reflects the average relationship between X and Y.

5. Homoskedasticity

The error terms have constant variance:

Var(uₜ) = σ²

This means the spread of the errors is the same across all values of X.

6. No Serial Correlation

The error terms are not correlated with each other:

Cov(uₜ, uₛ) = 0 for t ≠ s

This assumption is particularly important for time series data to avoid biased or inefficient
estimates.

7. Normality of the Error Terms

The disturbances are normally distributed:

uₜ ~ N(0, σ²)

This assumption is especially important for conducting hypothesis tests and constructing
confidence intervals.

8. Sufficient Sample Size and No Perfect Multicollinearity

 The number of observations must exceed the number of estimated parameters.
 There must be no exact linear relationships among the explanatory variables (in multiple regression contexts).

Summary of the CLRM assumptions:

Assumption | Mathematical Expression | Possible Violation | Implication | Discussed in Chapter
1. Linearity | Yₜ = α + βXₜ + uₜ | Wrong regressors, non-linearity, changing parameters | Model misspecification, biased or inconsistent estimates | Chapter 8
2. X is variable | Var(X) ≠ 0 | Errors in variables, little or no variation in X | Inability to estimate slope reliably | Chapter 8
3. X is non-stochastic and fixed in repeated samples | Cov(Xₛ, uₜ) = 0 | Endogeneity, simultaneity, or autoregression | Biased and inconsistent OLS estimates | Chapter 10
4. Mean of disturbance is zero | E(uₜ) = 0 | Systematic error in model | Biased intercept estimate | —
5. Homoskedasticity | Var(uₜ) = σ² | Unequal error variance (heteroskedasticity) | Inefficient estimates, invalid standard errors | Chapter 6
6. No serial correlation | Cov(uₜ, uₛ) = 0 for t ≠ s | Errors are correlated over time (autocorrelation) | Inefficient estimates, misleading inference | Chapter 7
7. Normality of residuals | uₜ ~ N(0, σ²) | Outliers, skewness, or kurtosis | Invalid statistical tests and confidence intervals | Chapter 8
8. No perfect multicollinearity | No exact linear relationships among independent variables | Redundant or linearly dependent regressors | Inability to estimate model uniquely | Chapter 5

Properties of the OLS Estimators

BLUE Property (Best Linear Unbiased Estimator)

Under the assumptions of the Classical Linear Regression Model (CLRM), the Ordinary Least
Squares (OLS) estimators are the Best Linear Unbiased Estimators (BLUE). This means that
among all linear and unbiased estimators, the OLS estimators have the smallest possible
variance.
To establish this, we break down the OLS estimators into two components: a non-random
component, which reflects the true parameter values, and a random component, which reflects
sampling variability. This randomness originates from the error term in the regression model.

Linearity

OLS estimators are linear functions of the observed dependent variable values. This means they
can be expressed as weighted averages of the dependent variable. Since the explanatory variables
(X values) are treated as fixed (non-stochastic), this confirms that the OLS estimators are linear.

Unbiasedness

An estimator is unbiased if its expected value equals the true parameter it estimates. Under the
CLRM assumptions, especially the assumption that the error terms have zero mean and are
uncorrelated with the regressors, both OLS estimators—β̂ (slope) and â (intercept)—are
unbiased. This implies that, on average, OLS will correctly estimate the true population
parameters.

Efficiency (Minimum Variance)

In addition to being linear and unbiased, OLS estimators are also efficient—they have the lowest
possible variance among all linear and unbiased estimators. This is proven by comparing the
OLS estimator to a general linear unbiased estimator and showing that the OLS estimator
satisfies the conditions for minimum variance.

Consistency

An estimator is consistent if, as the sample size increases indefinitely, the estimator converges to
the true parameter value. Even when the assumption that X is fixed is relaxed, the OLS
estimators remain consistent, provided that the regressors and the error term are uncorrelated.
This means that with a large enough sample, OLS will still produce values close to the true
population parameters.
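A quick way to see unbiasedness and consistency at work is a small Monte Carlo simulation (not part of the chapter): re-estimate the slope over many artificial samples and watch the average stay at the true value while the spread shrinks as the sample size grows. The parameter values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
a, beta = 2.0, 0.5                     # hypothetical true parameters

def estimate_beta(n):
    X = rng.uniform(0, 10, size=n)
    Y = a + beta * X + rng.normal(0, 1, size=n)
    x_bar, y_bar = X.mean(), Y.mean()
    return np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)

for n in (20, 200, 2000):
    betas = np.array([estimate_beta(n) for _ in range(2000)])
    # Mean stays near 0.5 at every n (unbiasedness); the spread falls as n grows (consistency)
    print(n, round(betas.mean(), 3), round(betas.std(), 3))
```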

Overall Goodness of Fit

To evaluate how well the regression model fits the data, we decompose each actual value of the
dependent variable into two parts: the predicted value from the regression equation and the
residual (or error). This decomposition allows us to assess how much of the total variation in the
dependent variable is explained by the model.

The total variation is called the Total Sum of Squares (TSS). It can be broken down into:

 Explained Sum of Squares (ESS): The part of the variation explained by the regression
model.
 Residual Sum of Squares (RSS): The part of the variation not explained by the model.

The key measure that arises from this decomposition is the coefficient of determination (R²),
which is calculated as:

R² = ESS / TSS

R² indicates the proportion of the variation in the dependent variable that is explained by the
model:

 R² = 0: the model explains none of the variation.
 R² = 1: the model explains all the variation.
 R² between 0 and 1: the model explains some, but not all, of the variation.

An R² of 0.4, for example, means that 40% of the variation in the dependent variable is explained
by the regression model. It does not mean that the model is twice as good as one with R² = 0.2.
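As a sketch of this decomposition, the three sums of squares and R² can be computed directly from the fitted values; this assumes a fitted statsmodels model called results, as in the earlier example, but any vector of predictions would do.

```python
import numpy as np

Y_hat = results.fittedvalues            # predicted values from the regression
residuals = Y - Y_hat

TSS = np.sum((Y - Y.mean()) ** 2)       # total variation in Y
ESS = np.sum((Y_hat - Y.mean()) ** 2)   # variation explained by the model
RSS = np.sum(residuals ** 2)            # variation left unexplained

R2 = ESS / TSS                          # equivalently 1 - RSS / TSS when an intercept is included
print(round(R2, 3), round(results.rsquared, 3))   # the two numbers should agree
```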


Problems Associated with R² in Regression Analysis

There are several serious issues with using R² to evaluate single regression equations or to
compare different equations:

1. Spurious Regression: High R² values can appear even when variables are unrelated,
especially if they exhibit similar trends. This can mislead researchers into thinking a
relationship exists when it doesn’t.
2. Omitted Variable Bias: If an omitted variable (Zₜ) that actually determines the
dependent variable (Yₜ) is highly correlated with the included independent variable (Xₜ),
the R² may falsely indicate Xₜ is important.
3. Correlation ≠ Causation: A high R² only indicates correlation between observed and
predicted values, not causality. Determining causal relationships should rely on theory,
previous studies, and intuition.
4. Time Series vs. Cross-Sectional Data: Time series models often produce high R²
values, even if badly specified, due to trend components. Cross-sectional data usually
yield lower R² values because of more noise. Thus, R² comparisons across these data
types are invalid.
5. Low R² Doesn’t Imply a Poor Model: A low R² could result from the wrong functional
form, incorrect time period, or missing lagged variables — not necessarily from choosing
the wrong independent variable.
6. Incomparable R² from Different Models: R² values from models using different
transformations of Y (e.g., Yₜ vs. ln(Yₜ)) are not comparable, as R² reflects the proportion
of explained variance of the specific dependent variable used.

Hypothesis Testing and Confidence Intervals in OLS

Under the assumptions of the Classical Linear Regression Model (CLRM):

 OLS estimators (intercept and slope) follow a normal distribution.
 When standard errors are estimated, the relevant test statistics follow a Student's t-distribution with n − 2 degrees of freedom.
 The t-distribution is similar to the normal distribution but has fatter tails, especially with small samples.

Testing the Significance of OLS Coefficients

Steps in Hypothesis Testing:

1. Set Hypotheses: Choose between two-tailed (e.g., β = 0 vs. β ≠ 0) or one-tailed tests (e.g., β = 0 vs. β > 0), depending on prior knowledge.
2. Calculate the t-statistic: Often provided by software like EViews or Stata.
3. Find the Critical t-Value: Based on degrees of freedom (n − 2) and significance level.
4. Decision Rule: Reject the null if the absolute value of the t-statistic exceeds the critical value.

If testing hypotheses other than β = 0 (e.g., β = 1), the null must be manually specified, and the t-
statistic calculated accordingly.
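As a sketch of steps 2 to 4 (and of the p-value approach described below), the test statistic, critical value, and decision can be computed with scipy from the statsmodels fit used earlier; the 5% level and the null value of zero are illustrative choices.

```python
from scipy import stats

beta_hat = results.params[1]           # estimated slope
se_beta = results.bse[1]               # its estimated standard error
df = int(results.nobs) - 2             # degrees of freedom: n - 2 for one regressor plus an intercept

beta_null = 0                          # value of beta under H0 (use 1 instead to test beta = 1)
t_stat = (beta_hat - beta_null) / se_beta

t_crit = stats.t.ppf(1 - 0.05 / 2, df)            # two-tailed critical value at the 5% level
p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df))  # exact probability under the null

reject_null = abs(t_stat) > t_crit     # equivalently: p_value < 0.05
print(t_stat, t_crit, p_value, reject_null)
```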

Rules of Thumb for Large Samples

 For a 5% significance level:
   o Two-tailed test: critical t ≈ ±2.
   o One-tailed test: critical t ≈ 1.65 in absolute value (the sign depends on the direction of the test).
 These approximations are valid when the degrees of freedom exceed 30.
 For smaller samples, exact t-table values should be used.

The p-Value Approach


 p-values give the exact probability of observing the test statistic under the null
hypothesis.
 A smaller p-value indicates stronger evidence against the null.
 If the p-value ≤ significance level (e.g., 0.05), the coefficient is statistically significant.
 More informative than just comparing t-values, especially when the choice of
significance level (1%, 5%, 10%) is arbitrary.

Confidence Intervals

 Confidence intervals indicate the range of values within which the true coefficient likely
falls, given a certain confidence level (e.g., 95%).
 They are constructed using the estimated coefficient, its standard error, and the
appropriate critical t-value.
 The same logic applies to both slope and intercept estimates.
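Continuing the same sketch, a 95% confidence interval for the slope combines the estimate, its standard error, and the critical t-value; statsmodels also reports the interval directly.

```python
from scipy import stats

beta_hat = results.params[1]
se_beta = results.bse[1]
df = int(results.nobs) - 2
t_crit = stats.t.ppf(1 - 0.05 / 2, df)   # 95% confidence leaves 5% split across the two tails

ci_lower = beta_hat - t_crit * se_beta
ci_upper = beta_hat + t_crit * se_beta
print(round(ci_lower, 3), round(ci_upper, 3))

# statsmodels reports the same interval; rows correspond to [intercept, slope]
print(results.conf_int(alpha=0.05))
```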

🧪 How to Test If Your Model’s Numbers Matter

When you build a model, you want to know: Is this number actually important, or just
random? Here's how you check:

1. Set Up the Test
You make a guess — like “this number is zero” — and test it. If it’s not zero, that means
the variable matters.
2. Use a t-Statistic
It’s a number that helps you figure out how far your result is from your guess (usually
zero). Software like EViews will give it to you.
3. Find the “Critical” Value
It’s the cutoff. If your t-statistic is bigger than this number, you can say your variable is
important.
4. Make a Decision
If the t-stat is bigger than the critical value, you say, “Yep, this variable matters!”

🔍 What About p-Values?

A p-value tells you how likely it is that you got your result just by chance.

 Small p-value (like 0.01 or 0.04) = It’s probably not random → the variable is
important.
 Big p-value (like 0.3) = It’s probably just noise → the variable might not matter.

Rule: If the p-value is smaller than 0.05 (or whatever limit you choose), you say the variable is
statistically significant.

📏 Confidence Intervals (CIs)

A confidence interval tells you: “We’re pretty sure the real number is somewhere in this range.”

For example:
If you say the slope is 3, and your 95% CI is [1.5, 4.5], that means you’re 95% confident the real
slope is between 1.5 and 4.5.
