Advanced Research Methods and Software Application
Baro Beyene
OSU
2022
Content
Chapter 8
Stata Software Application on Some Econometric Models
1. Linear Regression Analysis
This section describes the use of Stata to do regression
analysis. Regression analysis involves estimating an
equation that best describes the data.
One variable is considered the dependent variable,
while the others are considered independent (or
explanatory) variables.
Stata is capable of many types of regression analysis and the associated statistical tests. In this section, we touch on only a few of the more common commands and procedures: regress, kdensity, pnorm, qnorm, sktest, mvtest normal, hettest, imtest, vif, linktest, and ovtest.
Consider the dataset food_security to estimate a linear model of the determinants of (log) daily calorie intake (lncalav). Suppose the factors influencing daily calorie intake per capita of households are farming system (farmsy), gender (femal), family size (famsz), land allocated to staples (landst), access to irrigation (irrig), fertilizer quantity used for crop production (frtqt), oxen used for draught power (oxen), (log of) annual gross income in ETB (lninom), and access to off-farm activities (ofarm).
Based on this information, work out the following problems:
a. Estimate the OLS model for determinants of (log) daily calorie
intake per adult equivalent of farm households in the study area.
b. Which variables are positively/adversely and significantly
affecting daily calorie intake?
c. How do you interpret the fitness of this OLS model to identify
determinants of calorie intake?
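A minimal sketch of the estimation for part (a), assuming the dataset is saved as food_security.dta and the variables are named as listed above:

use food_security, clear
* OLS regression of (log) daily calorie intake on its hypothesized determinants
regress lncalav farmsy femal famsz landst irrig frtqt oxen lninom ofarm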
[OLS regression output for the food_security model omitted]
• According to the OLS model outputs, five variables (famsz, irrig, frtqt, lninom, and ofarm) significantly affect the daily calorie intake of households.
• Family size, access to irrigation, and access to off-farm activities adversely affect daily calorie intake, while the remaining two variables enhance the calorie intake of households.
• About 19.3% of the variation in daily calorie intake is explained by this OLS model.
• However, interpretation of the OLS model outputs is valid if and only if the basic assumptions of the classical linear regression model are satisfied.
• There are many post-estimation tests used to check whether the basic assumptions of the multiple linear regression model are satisfied.
• Tests for heteroscedasticity, omitted variables, and multicollinearity are the most important post-estimation tests and must be reported with the OLS model outputs.
Post-estimation tests of OLS regression (diagnostic tests):
- Multicollinearity
- Heteroscedasticity
- Autocorrelation (time series data)
- Normality
- Model misspecification
Diagnostic Tests/Post-Estimation Tests
1. Tests for Normality of Residuals
– kdensity -- produces a kernel density plot with a normal distribution overlaid.
– pnorm -- graphs a standardized normal probability (P-P) plot.
– qnorm -- plots the quantiles of a variable against the quantiles of a normal distribution.
– mvtest normal residual -- performs the Doornik-Hansen test for multivariate normality.
– sktest -- performs the skewness-kurtosis test (analogous to the Jarque-Bera test).
H0: the residual distribution is normal
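A sketch of this normality battery in Stata, run immediately after the regress command above (the residual variable name is illustrative):

predict residual, residuals    // save the OLS residuals
kdensity residual, normal      // kernel density plot with normal overlay
pnorm residual                 // standardized normal probability (P-P) plot
qnorm residual                 // quantiles of the residuals vs. the normal
sktest residual                // skewness-kurtosis test of normality
mvtest normal residual         // Doornik-Hansen test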
2. Tests for Heteroscedasticity
– hettest -- performs the Breusch-Pagan / Cook-Weisberg test for heteroscedasticity.
– H0: No heteroscedasticity
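For example, run immediately after the regress command (estat hettest is the post-estimation form; the older hettest shorthand also works):

estat hettest    // Breusch-Pagan / Cook-Weisberg test; H0: constant variance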
– imtest -- computes White's general test for heteroscedasticity.
– H0: No heteroscedasticity
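A sketch using the post-estimation form (estat imtest with the white option computes White's test):

estat imtest, white    // White's general test; H0: homoskedasticity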
3. Tests for Multicollinearity
– vif -- calculates the variance inflation factor for the independent variables in the linear model.
This test is based on regressing each explanatory variable on the other explanatory variables; if the auxiliary R2 is greater than 0.9, there is a problem of multicollinearity between the explanatory variables.
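For example, run immediately after the regress command (estat vif is the post-estimation form; an auxiliary R2 above 0.9 corresponds to a VIF above 10):

estat vif    // rule of thumb: VIF > 10 signals serious multicollinearity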
4. Tests for Model Specification
– linktest -- performs a link test for model specification.
– H0: No specification problem
– For a correctly specified model, the squared prediction (_hatsq) should not contribute significantly to the test regression.
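A sketch, run directly after the fitted model:

linktest    // _hat should be significant; _hatsq should be insignificant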
– ovtest -- performs the Ramsey regression specification error test (RESET) for omitted variables.
– H0: No omitted variable
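A sketch, run directly after the fitted model (estat ovtest is the post-estimation form; the older ovtest shorthand also works):

estat ovtest    // Ramsey RESET; H0: model has no omitted variables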
2. Linear Probability Model
Linear Probability Model (LPM)
[Figure: the LPM regression line for a binary outcome (Y = 0 or 1) plotted against income (X), with reference points X1 and X2 on the horizontal axis]
Because the fitted line is linear in X, the LPM can produce predicted probabilities below 0 or above 1, which motivates the nonlinear alternatives below.
These nonlinear regression models include:
A. The Logit Model (binary choice)
B. The Probit Model (binary choice)
C. Multinomial Logit and Probit Models (MNL & MNP)
D. Ordered Logit and Probit Models
Numerical Example on LPM
Using the LPM data from your Stata training folder, regress poverty on family size and migration, and test for heteroskedasticity, normality, and multicollinearity.
. reg poverty fs migration

      Source |       SS           df       MS      Number of obs   =        20
-------------+----------------------------------   F(2, 17)        =     13.35
       Model |  2.93291409         2  1.46645705   Prob > F        =    0.0003
    Residual |  1.86708591        17  .109828583   R-squared       =    0.6110
-------------+----------------------------------   Adj R-squared   =    0.5653
       Total |         4.8        19  .252631579   Root MSE        =     .3314

------------------------------------------------------------------------------
     poverty |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          fs |   .0794074    .034043     2.33   0.032      .007583    .1512318
   migration |  -.3628224    .203684    -1.78   0.093     -.792558    .0669132
       _cons |   .2660414   .2456304     1.08   0.294    -.2521936    .7842763
------------------------------------------------------------------------------
There are negative predicted probabilities for some observations, and predicted probabilities greater than one.
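A sketch of how these out-of-range predictions can be inspected (the variable name phat is illustrative):

predict phat                       // fitted values = predicted probabilities in the LPM
summarize phat                     // a minimum below 0 or a maximum above 1 reveals the problem
list phat if phat < 0 | phat > 1   // show the offending observations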
Test of normality: mvtest normal e

. mvtest normal e

Test for multivariate normality
Doornik-Hansen    chi2(2) = 9.916    Prob>chi2 = 0.0070

Test of heteroskedasticity: hettest

. hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of poverty
chi2(1) = 2.80
Prob > chi2 = 0.0942

The Doornik-Hansen test rejects normality of the residuals (p = 0.0070), while the Breusch-Pagan test fails to reject homoskedasticity at the 5% level (p = 0.0942).
3. The Logit and Probit Models
Overcoming the shortcomings of the LPM requires a nonlinear functional form for the probability. This is possible if we assume that the dependent variable or the error term (Ui) follows some cumulative distribution function.
The two important nonlinear functions proposed for this are the logistic CDF and the normal CDF.
Pr(Yi = 1 | Xi) = Pi = G(β0 + β1Xi) = G(Zi)
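In LaTeX notation, the two choices of G are (a standard formulation, stated here for reference rather than taken from the slides):

% Logit: G is the logistic CDF
P_i = \Lambda(Z_i) = \frac{e^{Z_i}}{1 + e^{Z_i}}

% Probit: G is the standard normal CDF
P_i = \Phi(Z_i) = \int_{-\infty}^{Z_i} \frac{1}{\sqrt{2\pi}} e^{-t^2/2}\,dt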
[Figure: the cumulative normal distribution function and the logistic distribution function, S-shaped curves of Pi bounded between Pi = 0 and Pi = 1]
Numerical Example on Logit and Probit
Suppose that we want to examine the factors affecting students' academic performance in a particular course.
Assume that academic performance is measured by the grades (A, B, C, D, F) scored by the students, and let the dependent variable grade equal 1 if a student scored an A and 0 otherwise.
Further assume that data on three independent variables, namely previous CGPA (gpa), PC ownership (pc), and average score in exercises (ase), were collected from 32 students.
A. Log-Odds (Logit) Interpretation of the Logit Model
logit grade gpa ase pc

Logistic regression                             Number of obs   =         32
                                                LR chi2(3)      =      15.40
                                                Prob > chi2     =     0.0015
Log likelihood = -12.889633                     Pseudo R2       =     0.3740

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   2.826113   1.262941     2.24   0.025     .3507938    5.301432
         ase |   .0951577   .1415542     0.67   0.501    -.1822835    .3725988
          pc |   2.378688   1.064564     2.23   0.025       .29218    4.465195
       _cons |  -13.02135   4.931325    -2.64   0.008    -22.68657    -3.35613
------------------------------------------------------------------------------
B. Odds Ratio Interpretation of Logit Model
logit grade gpa ase pc, or

Logistic regression                             Number of obs   =         32
                                                LR chi2(3)      =      15.40
                                                Prob > chi2     =     0.0015
Log likelihood = -12.889633                     Pseudo R2       =     0.3740

------------------------------------------------------------------------------
       grade | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |   16.87972   21.31809     2.24   0.025     1.420194    200.6239
         ase |   1.099832   .1556859     0.67   0.501     .8333651    1.451502
          pc |   10.79073   11.48743     2.23   0.025     1.339344    86.93802
       _cons |   2.21e-06   .0000109    -2.64   0.008     1.40e-10      .03487
------------------------------------------------------------------------------
C. Probability Interpretation of the Logit Model

. mfx

Marginal effects after logit
      y  = Pr(grade) (predict)
         =  .25282025

------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z     P>|z|  [    95% C.I.    ]      X
---------+--------------------------------------------------------------------
     gpa |   .5338589      .23704    2.25   0.024    .069273  .998445   3.11719
     ase |   .0179755      .02624    0.69   0.493   -.033448  .069399   21.9375
      pc*|   .4564984      .18105    2.52   0.012     .10164  .811357     .4375
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
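As a side note, in newer versions of Stata the margins command can reproduce these mfx results; a sketch (atmeans evaluates the effects at the means of the regressors, as mfx does):

logit grade gpa ase pc
margins, dydx(*) atmeans    // marginal effects at the means, matching mfx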
D. Probit Estimation
probit grade gpa ase pc

Probit regression                               Number of obs   =         32
                                                LR chi2(3)      =      15.55
                                                Prob > chi2     =     0.0014
Log likelihood = -12.818803                     Pseudo R2       =     0.3775

------------------------------------------------------------------------------
       grade |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gpa |    1.62581   .6938825     2.34   0.019     .2658255    2.985795
         ase |   .0517289   .0838903     0.62   0.537    -.1126929    .2161508
          pc |   1.426332   .5950379     2.40   0.017     .2600795    2.592585
       _cons |   -7.45232   2.542472    -2.93   0.003    -12.43547   -2.469166
------------------------------------------------------------------------------
E. Probability Interpretation of Probit Model
. mfx

Marginal effects after probit
      y  = Pr(grade) (predict)
         =  .26580809

------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z     P>|z|  [    95% C.I.    ]      X
---------+--------------------------------------------------------------------
     gpa |   .5333471      .23246    2.29   0.022    .077726  .988968   3.11719
     ase |   .0169697      .02712    0.63   0.531   -.036184  .070123   21.9375
      pc*|    .464426      .17028    2.73   0.006    .130682   .79817     .4375
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
Logit: as GPA increases by one point, the log of the odds of scoring an A increases by about 2.83, and the effect is statistically significant.
Odds ratio: as GPA increases by one point, the odds of getting an A (relative to the other grades B, C, D, F) are multiplied by about 16.88.
Marginal effect: both logit and probit give similar results. As GPA increases by one point, the probability of getting grade A increases by about 53 percentage points (evaluated at the means of the regressors).
Thank You!