Econometrics Project

GDP data for Pakistan were taken from Kaggle; multiple linear regression is applied and its assumptions are checked using Python.

Uploaded by

tahreemasif18


Econometrics Project

Submitted by: Menahil

Roll no. 39-20

BS Statistics Regular 2020-2024

Submitted to: Miss Wajeeha Batool

COLLEGE OF STATISTICAL SCIENCES,

PUNJAB UNIVERSITY LAHORE

1 Data Description:

The data set is an annual secondary time series of Pakistan's GDP from 2000 to 2021,
broken down into sectors such as Agriculture, Industrial, Services, and other
components. These sectors play an essential role in the Pakistani economy. GDP is the
dependent variable, and its values are in billions of US dollars. 'Per Capita' is described as
annual growth in rupees relative to the U.S. dollar, and 'Growth rate' is in percent.
The sectors that contribute to the GDP of Pakistan are as follows:

Variables                                                    Units        Sources

Crops                                                        %            State Bank of Pakistan
Livestock                                                    %            State Bank of Pakistan
Forestry                                                     %            State Bank of Pakistan
Fishing                                                      %            State Bank of Pakistan
Total Agricultural sectors                                   %            State Bank of Pakistan
Mining and Quarrying                                         %            State Bank of Pakistan
Manufacturing                                                %            State Bank of Pakistan
Large Scale                                                  %            State Bank of Pakistan
Small Scale                                                  %            State Bank of Pakistan
Slaughtering                                                 %            State Bank of Pakistan
Electricity generation & distribution and Gas distribution   %            State Bank of Pakistan
Construction                                                 %            State Bank of Pakistan
total Industrial Sectors                                     %            State Bank of Pakistan
Wholesale & Retail trade                                     %            State Bank of Pakistan
Transport, Storage & Communication                           %            State Bank of Pakistan
Finance & Insurance                                          %            State Bank of Pakistan
Housing Services                                             %            State Bank of Pakistan
General Government Services                                  %            State Bank of Pakistan
Other Services                                               %            State Bank of Pakistan
total Services Sector                                        %            State Bank of Pakistan
Gross Domestic Product                                       $ (billion)  State Bank of Pakistan
Per Capita                                                   $            The World Bank
Growth rate                                                  %            The World Bank

Reference:

This data is sourced from Kaggle.

Hanzlanawaz, H. (n.d.). Contribution of various sectors to Pakistan's GDP. Retrieved from https://www.kaggle.com/datasets/hanzlanawaz/contribution-of-various-sectors-to-pakistans-gdp

2 Check Normality

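2.1 RESULTS OF SHAPIRO-WILK TEST:

Shapiro-Wilk Statistic: 0.3453448414802551
p-value: 1.299554386725585e-39
The p-value is far below 0.05, so the null hypothesis of normality is rejected: the data do not
appear to be normally distributed.

2.2 RESULTS OF SHAPIRO-WILK TEST AFTER BOX-COX TRANSFORMATION:

Shapiro-Wilk Statistic: 0.9175485968589783
p-value: 0.06764359027147293
The p-value exceeds 0.05, so normality is not rejected at the 5% level: the transformed data are
consistent with a normal distribution.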
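
The two-step check above (Shapiro-Wilk on the raw values, then again after a Box-Cox transformation) can be sketched as follows. This is a minimal illustration on a synthetic right-skewed sample, not the report's own code or data; the sample `x` and the seed are assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical right-skewed sample; Box-Cox requires strictly positive values.
x = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# Shapiro-Wilk on the raw values.
w_raw, p_raw = stats.shapiro(x)

# Box-Cox transformation; lambda is chosen by maximum likelihood.
x_bc, lam = stats.boxcox(x)

# Shapiro-Wilk on the transformed values.
w_bc, p_bc = stats.shapiro(x_bc)

print(f"raw:     W={w_raw:.3f}, p={p_raw:.3g}")
print(f"Box-Cox: W={w_bc:.3f}, p={p_bc:.3g} (lambda={lam:.2f})")
```

A W statistic closer to 1 after the transformation indicates that the transformed values are closer to normal.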

3 Results of fitting regression model on the data:

3.1 VALUES OF COEFFICIENTS

Variables                                                    Coefficients

Crops                                                        -8.25
Livestock                                                    -27.03
Forestry                                                     -57.14
Fishing                                                      8.46
Total Agricultural sectors                                   84.48
Mining and Quarrying                                         51.27
Manufacturing                                                -20.42
Large Scale                                                  67.09
Small Scale                                                  -80.03
Slaughtering                                                 181.98
Electricity generation & distribution and Gas distribution   -57.08
Construction                                                 -3.67
total Industrial Sectors                                     -0.98
Wholesale & Retail trade                                     29.28
Transport, Storage & Communication                           -7.64
Finance & Insurance                                          -21.60
Housing Services                                             1.41
General Government Services                                  -10.22
Other Services                                               0.73
total Services Sector                                        -28.04
Gross Domestic Product                                       25.71
Per Capita                                                   0.21
Growth rate                                                  -0.93

Interpretation:

Each coefficient is the estimated change in the dependent variable for a one-unit change in that
independent variable, holding all other independent variables constant.

Mean Squared Error 913.77
R-squared 0.85

Interpretation:

The MSE of 913.77 corresponds to a root mean squared error of about 30.2 in the units of the
dependent variable, which suggests that the model's predictions are not very accurate. The R-squared
value of 0.85 indicates that approximately 85% of the variation in the dependent variable is
explained by the independent variables in the model.

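
As a rough illustration of how the coefficients, MSE, and R-squared relate, the sketch below fits a least-squares model with NumPy on synthetic data. The data, the true coefficients, and the variable names are assumptions made for the example and are unrelated to the report's data set.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
# Hypothetical data: two regressors and a known linear relationship plus noise.
X = rng.normal(size=(n, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

A = np.column_stack([np.ones(n), X])           # add the intercept column
beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
y_hat = A @ beta

mse = np.mean((y - y_hat) ** 2)                # mean squared error
rmse = np.sqrt(mse)                            # same units as y
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print("coefficients:", np.round(beta, 2))
print(f"MSE={mse:.3f}, RMSE={rmse:.3f}, R-squared={r2:.3f}")
```

The RMSE is usually easier to judge than the MSE because it is on the same scale as the dependent variable.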
4 Residuals to check Heteroscedasticity

4.1 GRAPHICAL REPRESENTATION TO CHECK HETEROSCEDASTICITY

Interpretation:
From the provided plot, the residuals appear to be randomly scattered around the horizontal axis,
suggesting that there is no clear evidence of heteroscedasticity. The variance of the residuals
seems fairly constant across the different levels of fitted values.

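4.2 RESULT OF BREUSCH-PAGAN TEST

'LM Statistic': 22.0

'LM-Test p-value': 0.5202517804007958

Interpretation:
The p-value is well above the usual significance level of 0.05, so we fail to reject the null
hypothesis of homoscedasticity (constant variance of residuals). The test therefore gives no
evidence of heteroscedasticity in this model. Note, however, that the LM statistic of exactly 22.0
equals the number of observations (22). In the n times R-squared form of the statistic, that is the
value produced when the auxiliary regression fits perfectly, which can happen when there are about
as many regressors as observations, so this result should be read with caution.
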
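
The logic of the Breusch-Pagan test can be sketched by hand: regress the squared residuals on the regressors and compare n times the auxiliary R-squared with a chi-square distribution. The helper names `ols_resid_r2` and `breusch_pagan_lm` and the synthetic data are assumptions for this illustration; this is the studentized (Koenker) form and is not necessarily the exact routine used in the report.

```python
import numpy as np
from scipy import stats

def ols_resid_r2(A, y):
    """Least-squares fit of y on A; returns residuals and R-squared."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return resid, r2

def breusch_pagan_lm(X, y):
    """Studentized Breusch-Pagan LM test; X holds regressors without intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    resid, _ = ols_resid_r2(A, y)
    # Auxiliary regression: squared residuals on the same regressors.
    _, r2_aux = ols_resid_r2(A, resid ** 2)
    lm = n * r2_aux
    df = A.shape[1] - 1
    return lm, stats.chi2.sf(lm, df)

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(1.0, 5.0, size=n)
y_het = 1.0 + 2.0 * x + rng.normal(scale=x, size=n)    # error spread grows with x
y_hom = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n)  # constant error spread

lm_het, p_het = breusch_pagan_lm(x[:, None], y_het)
lm_hom, p_hom = breusch_pagan_lm(x[:, None], y_hom)
print(f"heteroscedastic: LM={lm_het:.2f}, p={p_het:.3g}")
print(f"homoscedastic:   LM={lm_hom:.2f}, p={p_hom:.3g}")
```

Because the statistic is n times an R-squared, it can never exceed n; a value equal to n means the auxiliary regression explains the squared residuals perfectly.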
4.3 RESULTS OF WHITE'S TEST

'Test Statistic': 22.0

'Test Statistic p-value': 0.39950988556124917

Interpretation:
White's test, with a test statistic of 22.0 and a p-value of 0.3995, gives no significant
evidence of heteroscedasticity in the residuals of the regression model, so the assumption of
homoscedasticity is not rejected. The statistic equals the sample size (22), which suggests that
the auxiliary regression is saturated, so this result should also be treated with caution.

5 Results of fitting MLR model

5.1 OLS REGRESSION RESULTS

OLS Regression Results

==============================================================================
Dep. Variable:                    GDP   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                    nan
Method:                 Least Squares   F-statistic:                       nan
Date:                Wed, 19 Jun 2024   Prob (F-statistic):                nan
Time:                        14:31:03   Log-Likelihood:                 457.67
No. Observations:                  22   AIC:                            -871.3
Df Residuals:                       0   BIC:                            -847.3
Df Model:                          21
Covariance Type:            nonrobust

1. Log-Likelihood: 457.67

The log-likelihood measures how well the model fits the data, with higher values indicating a
better fit. Here the value of 457.67 is very large because the residuals are essentially zero, so it
reflects a saturated fit rather than genuine explanatory power.

2. AIC: -871.3

The Akaike Information Criterion (AIC) measures goodness of fit while penalizing the number of
parameters in the model. A lower AIC value generally indicates a better model when models fitted
to the same data are compared.

3. BIC: -847.3

The Bayesian Information Criterion (BIC) is a similar measure with a stronger penalty for the
number of parameters; a lower value indicates a better model.

Interpretation

The output reports 22 observations, 21 model degrees of freedom, and 0 residual degrees of freedom,
so the model has as many parameters as observations. It therefore reproduces the data exactly
(R-squared = 1.000), and the adjusted R-squared and F-statistic cannot be computed (nan). The high
log-likelihood and very low AIC and BIC are consequences of this saturated fit and do not show that
the multiple linear regression model is appropriate for this dataset; fewer regressors or more
observations are needed before the fit can be assessed.
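
What zero residual degrees of freedom imply can be seen in a small experiment: with 22 observations and 22 parameters, even regressors that are pure noise reproduce the response exactly. The data below are random and purely illustrative; only the sample size of 22 is taken from the output above.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 22                                   # same sample size as in the output above
X = rng.normal(size=(n, n - 1))          # 21 arbitrary regressors, unrelated to y
y = rng.normal(size=n)                   # response that is pure noise

A = np.column_stack([np.ones(n), X])     # intercept + 21 regressors = 22 parameters
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

df_resid = n - np.linalg.matrix_rank(A)  # residual degrees of freedom
r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

print("Df Residuals:", df_resid)
print("R-squared:", round(r2, 6))
print("largest |residual|:", np.max(np.abs(resid)))
```

An R-squared of 1 in this setting says nothing about the relationship between the regressors and the response.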

5.2 CHECK ASSUMPTIONS OF MLR

5.2.1 Linearity

5.2.2 Normality of residuals

Shapiro-Wilk Statistic: 0.95
p-value: 0.36
The p-value exceeds 0.05, so normality of the residuals is not rejected.

5.2.3 Heteroscedasticity
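Levene Statistic: 59.94
p-value: 0.00
The p-value (reported as 0.00 after rounding) leads to rejecting the null hypothesis of equal
variances: the residuals do not appear to have constant variance (heteroscedasticity). This
conflicts with the Breusch-Pagan and White's test results in Section 4.
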
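
Levene's test compares the variance of two or more groups, so applying it to residuals requires a grouping rule. The report does not state which grouping was used; the sketch below assumes, purely for illustration, a split of the residuals at the median fitted value, on synthetic data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 200
# Hypothetical fitted values and residuals whose spread grows with the fitted value.
fitted = rng.uniform(0.0, 10.0, size=n)
resid = rng.normal(scale=0.5 + 0.3 * fitted, size=n)

# Assumed grouping: residuals below vs. above the median fitted value.
cut = np.median(fitted)
low = resid[fitted <= cut]
high = resid[fitted > cut]

# center="median" gives the Brown-Forsythe variant, which is robust to non-normality.
stat, p = stats.levene(low, high, center="median")
print(f"Levene statistic={stat:.2f}, p={p:.3g}")
```

A small p-value indicates that the two groups of residuals have different variances.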
5.2.4 Multicollinearity
Feature                                                      VIF

Crops                                                        3.115103e+09
Livestock                                                    1.325875e+08
Forestry                                                     2.608219e+08
Fishing                                                      7.858197e+05
Total Agricultural sectors                                   4.351817e+05
Mining and Quarrying                                         1.203756e+09
Manufacturing                                                2.189421e+07
Large Scale                                                  5.469523e+08
Small Scale                                                  1.031750e+09
Slaughtering                                                 2.039360e+07
Electricity generation & distribution and Gas distribution   1.220299e+07
Construction                                                 6.906859e+06
total Industrial Sectors                                     1.183032e+07
Wholesale & Retail trade                                     5.821126e+08
Transport, Storage & Communication                           3.408510e+07
Finance & Insurance                                          1.822213e+07
Housing Services                                             1.342569e+06
General Government Services                                  3.625816e+06
Other Services                                               3.870344e+06
total Services Sector                                        1.002342e+07
Per Capita                                                   1.286946e+09
Growth rate                                                  9.746339e+02
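
The variance inflation factor of a regressor is 1 / (1 - R-squared), where R-squared comes from regressing that regressor on all the others. The sketch below implements this textbook definition with NumPy and shows how including a near-exact total of other columns inflates the values. The helper `vif` and the synthetic columns `a`, `b`, and `total` are assumptions for the illustration, not the report's code or data.

```python
import numpy as np

def vif(X):
    """VIF of each column of X (X holds regressors without an intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        # Regress column j on an intercept and all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(5)
a = rng.normal(size=200)
b = rng.normal(size=200)
total = a + b + rng.normal(scale=0.01, size=200)  # almost exactly the sum of a and b

vif_indep = vif(np.column_stack([a, b]))
vif_with_total = vif(np.column_stack([a, b, total]))
print("independent columns:", np.round(vif_indep, 2))
print("with a total column:", np.round(vif_with_total, 0))
```

A common rule of thumb treats values above about 10 as a sign of problematic multicollinearity; the values in the table above are many orders of magnitude larger.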

5.2.5 Independence of errors

Interpretation:

The residuals plot shows a clear pattern, indicating potential autocorrelation and suggesting that
the errors are not independent. This violates the assumption of independence of errors in MLR as
the residuals should ideally be randomly scattered around the horizontal axis without any
discernible pattern.
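
A visual impression of autocorrelation can be supplemented with the Durbin-Watson statistic, which is close to 2 for uncorrelated residuals and falls toward 0 under positive first-order autocorrelation. The report does not include this statistic; the sketch below is an illustrative addition on synthetic series, with the helper `durbin_watson` defined here from the standard formula.

```python
import numpy as np

def durbin_watson(resid):
    """Sum of squared successive differences divided by the sum of squares."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(6)
n = 500
white = rng.normal(size=n)               # uncorrelated errors

# Hypothetical AR(1) errors with coefficient 0.8 (positive autocorrelation).
ar1 = np.empty(n)
ar1[0] = white[0]
for t in range(1, n):
    ar1[t] = 0.8 * ar1[t - 1] + white[t]

dw_white = durbin_watson(white)
dw_ar1 = durbin_watson(ar1)
print(f"uncorrelated: DW={dw_white:.2f}")
print(f"AR(1), 0.8:   DW={dw_ar1:.2f}")
```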

6 Result of goodness of fit test:

A chi-square goodness of fit test is used:

Chi-square statistic 9.818181818181817
P-value 0.3654040928300495

Fail to reject the null hypothesis: the observed distribution is not significantly different from the
expected distribution.

Model Fit: Because we fail to reject the null hypothesis at the 5% level, the observed data do not
deviate significantly from the expected distribution. The test therefore gives no evidence against the
fit of the model, although failing to reject the null hypothesis does not by itself prove that the fit is
good.
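
The mechanics of a chi-square goodness of fit test can be sketched with `scipy.stats.chisquare`. The observed and expected counts below are made up for the illustration; the report does not state which counts or categories were compared.

```python
import numpy as np
from scipy import stats

# Hypothetical observed counts in five categories.
observed = np.array([18, 22, 20, 25, 15])
# Expected counts under the null hypothesis: equal frequencies (20 each).
expected = np.full(observed.size, observed.sum() / observed.size)

# Statistic: sum of (observed - expected)^2 / expected, with k - 1 degrees of freedom.
stat, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square={stat:.3f}, p={p:.3f}")
```

The chi-square approximation is usually considered reliable only when the expected counts are not too small; a common guideline is at least 5 per category.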
