Multiple Regression for Educators

Multiple regression analysis allows predicting a dependent variable (travel time) based on two or more independent variables (miles traveled, number of deliveries). The document provides an example using data from 10 delivery trips to predict travel time based on miles traveled and number of deliveries. It outlines the assumptions and process for multiple regression, including generating variables, collecting data, checking relationships between variables, and using the best fitting model to make predictions.

Uploaded by

Rafael Berte


Multiple Regression Analysis

A Report Presentation in EDUC 303 Data Management and Statistical Analysis

Dr. Giena L. Odicta

Course Facilitator

Mary Vincentia P. Olilang-Beldia

Presenter

Introduction
The world as we know and experience it is a complex place, so when we want to predict the value of a
variable, we can often get better predictions by using more than one other variable. That leads us to
multiple regression, a natural next step for anyone already familiar with simple linear regression.

Please consider the following example.

John is a small business owner who runs ABC Delivery Service, Inc. (ABC DS), which offers same-day
delivery for letters, packages, and other small cargo. John uses Google Maps to group individual deliveries
into one trip to reduce time and fuel costs, so some trips include more than one delivery.
John would like to estimate how long a delivery trip will take based on two factors: (1) the total
distance of the trip in miles and (2) the number of deliveries that must be made during the trip.
To conduct the analysis, he takes a random sample of 10 past trips and records three pieces of information
for each trip: (1) total miles traveled, (2) number of deliveries, and (3) total travel time in hours.

milesTraveled (x1)   numDeliveries (x2)   travelTime in hrs (y)
 89                  4                    7
 66                  1                    5.4
 78                  3                    6.6
111                  6                    7.4
 44                  1                    4.8
 77                  3                    6.4
 80                  3                    7
 66                  2                    5.6
109                  5                    7.3
 76                  3                    6.4

In this case, remember you would like to be able to predict the total travel time using both the miles traveled
and number of deliveries on each trip.
In what way does travel time DEPEND on the first two measures?
Travel time is the dependent variable and miles traveled and number of deliveries are independent variables.
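
As a rough illustration of what "predicting travel time from both variables" means in practice, the sketch below fits a two-predictor least-squares model to the 10 trips above using NumPy. The variable names and the helper predict_travel_time are assumptions made for this example, not part of the original report, and the later sections question whether both predictors should be kept.

```python
import numpy as np

# The 10 sampled trips from the table above
miles = np.array([89, 66, 78, 111, 44, 77, 80, 66, 109, 76], dtype=float)
deliveries = np.array([4, 1, 3, 6, 1, 3, 3, 2, 5, 3], dtype=float)
travel_time = np.array([7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4])

# Design matrix: a column of ones (intercept) plus the two predictors
X = np.column_stack([np.ones_like(miles), miles, deliveries])

# Ordinary least squares estimates b0, b1, b2
coef, *_ = np.linalg.lstsq(X, travel_time, rcond=None)
b0, b1, b2 = coef


def predict_travel_time(miles_traveled, num_deliveries):
    """Hypothetical helper: estimated travel time in hours."""
    return b0 + b1 * miles_traveled + b2 * num_deliveries


print(f"travelTime = {b0:.3f} + {b1:.4f}*miles + {b2:.4f}*deliveries")
print(f"Estimate for an 84-mile trip with 4 deliveries: {predict_travel_time(84, 4):.2f} hrs")
```

The 84-mile, 4-delivery trip is an invented input used only to show how a prediction would be read off the fitted equation.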

A. Multiple Regression
• Multiple regression is an extension of simple linear regression. It is used to predict the value of a
variable based on the values of two or more other variables.
• It investigates the relationship between two or more independent variables and a single dependent
variable.

Dependent variable - the variable to predict (outcome, target, or criterion variable)

Independent variables - the predictor, explanatory, or regressor variables
• It allows you to determine the overall fit of the model and the relative contribution of each of the
predictors to the total variance explained.
B. Relationships between the Variables

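Simple Linear Regression:

IV → DV (one independent variable, one dependent variable)

C. Multiple Linear Regression:

IV, IV, ... (two or more) → DV (several independent variables, one dependent variable)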

D. Assumptions
1. The dependent variable should be measured on a continuous scale (interval or ratio variable).
2. There are two or more independent variables, measured on a continuous or categorical scale.
3. Observations are independent, which can be checked using the Durbin-Watson statistic (a test for
autocorrelation in the residuals).
4. There must be a linear relationship between (a) the dependent variable and each of the independent
variables and (b) the dependent variable and the independent variables collectively.
5. The data show homoscedasticity, which is where the variances along the line of best fit remain similar as
you move along the line.
6. There should be no significant outliers, high leverage points, or highly influential points.
7. The residuals (errors) are approximately normally distributed.
8. Adding more independent variables to a multiple regression procedure does not mean the regression
will be better or offer better predictions; in fact, it can make things worse. This is called overfitting.
9. The addition of more independent variables creates more relationships among them. So not only are
the independent variables potentially related to the dependent variable, they are also potentially related
to each other. When this happens, it is called multicollinearity.
10. The ideal is for all the independent variables to be correlated with the dependent variable but not with
each other.
Because of multicollinearity and overfitting, there is a fair amount of prep work to do before
conducting multiple regression analysis if one is to do it properly:
o Correlation
o Scatter Plots
o Simple Regression
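
Assumption 3 mentions the Durbin-Watson statistic. A minimal sketch of how it could be computed by hand with NumPy is shown below, using the two-predictor delivery data from the introduction. Treating the table's row order as the observation order is an assumption made for this example; the statistic is only meaningful when the rows have a natural sequence such as time.

```python
import numpy as np

miles = np.array([89, 66, 78, 111, 44, 77, 80, 66, 109, 76], dtype=float)
deliveries = np.array([4, 1, 3, 6, 1, 3, 3, 2, 5, 3], dtype=float)
travel_time = np.array([7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4])

# Fit the regression and obtain the residuals (observed minus fitted)
X = np.column_stack([np.ones_like(miles), miles, deliveries])
coef, *_ = np.linalg.lstsq(X, travel_time, rcond=None)
residuals = travel_time - X @ coef

# Durbin-Watson: sum of squared successive residual differences
# divided by the sum of squared residuals. It always lies between 0 and 4.
durbin_watson = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

print(f"Durbin-Watson statistic: {durbin_watson:.3f}")
```

Values near 2 suggest little first-order autocorrelation in the residuals, while values toward 0 or 4 suggest positive or negative autocorrelation respectively.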

E. In the previously mentioned example about ABC Delivery Service, Inc.,

milesTraveled (x1) and numDeliveries (x2) → travelTime (y)
Multiple regression: many-to-one
F. Multiple Regression Model
Multiple Regression Model: y = β0 + β1x1 + β2x2 + ... + βpxp + ε
(linear in the parameters β0, β1, ..., βp; ε is the random error term)

Multiple Regression Equation: E(y) = β0 + β1x1 + β2x2 + ... + βpxp

Estimated Multiple Regression Equation: ŷ = b0 + b1x1 + b2x2 + ... + bpxp

where b0, b1, b2, ..., bp are the estimates of β0, β1, β2, ..., βp

ŷ = predicted value of the dependent variable

G. Estimated Multiple Regression Equation

Estimated Multiple Regression Equation: ŷ = b0 + b1x1 + b2x2 + b3x3
Example: ŷ = 6.211 + 0.014x1 + 0.383x2 - 0.607x3
where b0, b1, b2, b3 are the estimates of β0, β1, β2, β3
ŷ = predicted value of the dependent variable
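
To make the example equation concrete, the sketch below simply plugs numbers into it. The function name predict and the input values (x1 = 89, x2 = 4, x3 = 3.84) are assumptions chosen for illustration; only the four coefficients come from the example above.

```python
def predict(x1, x2, x3):
    """Evaluate the example equation: y_hat = 6.211 + 0.014*x1 + 0.383*x2 - 0.607*x3."""
    return 6.211 + 0.014 * x1 + 0.383 * x2 - 0.607 * x3


# Illustrative inputs (assumed for this sketch)
y_hat = predict(89, 4, 3.84)
print(f"Predicted value of the dependent variable: {y_hat:.3f}")
```

Each coefficient is read as the change in ŷ for a one-unit increase in that predictor while the other predictors are held fixed.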

H. Example:

John is a small business owner who runs ABC Delivery Service, Inc., which offers same-day delivery for
letters, packages, and other small cargo. He uses Google Maps to group individual deliveries into one trip
to reduce time and fuel costs, so some trips include more than one delivery. John would like to
estimate how long a delivery trip will take based on three factors: (1) the total distance of the trip in miles,
(2) the number of deliveries that must be made during the trip, and (3) the daily price of gas/petrol in US
dollars.

Research question:
Are the three factors: (1) the total distance of the trip in miles, (2) the number of deliveries that must be
made during the trip, and (3) the daily price of gas/petrol in US dollars predictive of how long a delivery
will take?

I. Steps to consider before running the regression:

1. Generate a list of potential variables: the independent variable(s) and the dependent variable.
2. Collect data on the variables.
3. Check the relationship between each independent variable and the dependent variable using scatter
plots and correlations.
4. Check the relationships among the independent variables using scatter plots and correlations.
5. (Optional) Conduct simple linear regression for each IV/DV pair.
6. Use the non-redundant independent variables in the analysis to find the best fitting model.
7. Use the best fitting model to make predictions about the dependent variable.

To conduct the analysis, take a random sample of 10 past trips and record four pieces of information for
each trip: (1) total miles traveled, (2) number of deliveries, (3) daily price of gas, and (4) total time
traveled in hours.

milesTraveled (x1)   numDeliveries (x2)   gasPrice (x3)   travelTime in hrs (y)
 89                  4                    3.84            7
 66                  1                    3.19            5.4
 78                  3                    3.78            6.6
111                  6                    3.89            7.4
 44                  1                    3.57            4.8
 77                  3                    3.57            6.4
 80                  3                    3.03            7
 66                  2                    3.51            5.6
109                  5                    3.54            7.3
 76                  3                    3.25            6.4
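
Step 5 of the checklist (a simple linear regression for each IV/DV pair) can be sketched with NumPy as follows. The dictionary layout and variable names are assumptions made for this example; the numbers are the 10 trips in the table above.

```python
import numpy as np

predictors = {
    "milesTraveled": np.array([89, 66, 78, 111, 44, 77, 80, 66, 109, 76], dtype=float),
    "numDeliveries": np.array([4, 1, 3, 6, 1, 3, 3, 2, 5, 3], dtype=float),
    "gasPrice": np.array([3.84, 3.19, 3.78, 3.89, 3.57, 3.57, 3.03, 3.51, 3.54, 3.25]),
}
travel_time = np.array([7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4])

results = {}
for name, x in predictors.items():
    # Degree-1 polynomial fit returns [slope, intercept]
    slope, intercept = np.polyfit(x, travel_time, 1)
    # In simple regression, R^2 equals the squared Pearson correlation
    r_squared = np.corrcoef(x, travel_time)[0, 1] ** 2
    results[name] = (slope, intercept, r_squared)
    print(f"{name}: travelTime = {intercept:.3f} + {slope:.4f}*x, R^2 = {r_squared:.3f}")
```

A predictor whose one-variable model explains very little of the variation in travelTime is a candidate for exclusion before the multiple regression is run.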

J. Sketching out relationships:

milesTraveled (x1), numDeliveries (x2), gasPrice (x3) → travelTime (y)
Multiple regression: many-to-one

6 relationships to analyze: three IV-to-DV pairs and three IV-to-IV pairs
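
The count of six follows from pairing up the four variables. A small sketch, with the variable names taken from the data table and everything else assumed for illustration:

```python
from itertools import combinations

variables = ["milesTraveled", "numDeliveries", "gasPrice", "travelTime"]
dependent = "travelTime"

# Every unordered pair of the four variables: C(4, 2) = 6
pairs = list(combinations(variables, 2))

# Pairs involving the dependent variable are relevancy checks (IV to DV);
# the rest are multicollinearity checks (IV to IV).
iv_to_dv = [p for p in pairs if dependent in p]
iv_to_iv = [p for p in pairs if dependent not in p]

print("IV to DV:", iv_to_dv)
print("IV to IV:", iv_to_iv)
```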

K. IV to DV Scatterplots for Relevancy Check

[Figure: scatterplot of travelTime(y) vs milesTraveled(x1)]
[Figure: scatterplot of travelTime(y) vs numDeliveries(x2)]
[Figure: scatterplot of travelTime(y) vs gasPrice(x3)]

L. Scatterplot Summary

Dependent variable vs. independent variables

• travelTime(y) appears highly correlated with milesTraveled(x1)
• travelTime(y) appears highly correlated with numDeliveries(x2)
• travelTime(y) DOES NOT appear highly correlated with gasPrice(x3)
Since gasPrice(x3) DOES NOT APPEAR CORRELATED with the dependent variable, we would NOT
use that variable in the multiple regression.

M. IV to IV Scatterplots for Multicollinearity Check

[Figure: scatterplot of numDeliveries(x2) vs milesTraveled(x1): strong correlation]
[Figure: scatterplot of gasPrice(x3) vs milesTraveled(x1): no correlation]
[Figure: scatterplot of gasPrice(x3) vs numDeliveries(x2): no correlation]

N. IV Scatterplot Summary

Independent variable vs. independent variable

• numDeliveries(x2) APPEARS highly correlated with milesTraveled(x1): this is
multicollinearity
• milesTraveled(x1) does not appear highly correlated with gasPrice(x3)
• gasPrice(x3) does not appear correlated with numDeliveries(x2)

Since numDeliveries(x2) is HIGHLY CORRELATED with milesTraveled(x1), we would NOT use BOTH
in the multiple regression; they are redundant.
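
One common numerical companion to the scatterplot check is the variance inflation factor (VIF), which the source does not mention; it is swapped in here only as an illustration. The sketch below regresses each predictor on the others and reports 1 / (1 - R^2). The function name and the rule of thumb that a VIF above roughly 10 signals trouble are assumptions, not statements from the report.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (predictors only, no intercept column)."""
    n, p = X.shape
    vifs = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on the remaining predictors plus an intercept
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r_squared = 1 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs.append(1 / (1 - r_squared))
    return vifs


miles = [89, 66, 78, 111, 44, 77, 80, 66, 109, 76]
deliveries = [4, 1, 3, 6, 1, 3, 3, 2, 5, 3]
gas_price = [3.84, 3.19, 3.78, 3.89, 3.57, 3.57, 3.03, 3.51, 3.54, 3.25]

X = np.column_stack([miles, deliveries, gas_price]).astype(float)
vifs = variance_inflation_factors(X)

for name, vif in zip(["milesTraveled", "numDeliveries", "gasPrice"], vifs):
    print(f"{name}: VIF = {vif:.2f}")
```

Large values for milesTraveled and numDeliveries alongside a small value for gasPrice would agree with the visual conclusion that the first two predictors are redundant.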
O. Correlations

                     milesTraveled(x1)       numDeliveries(x2)       gasPrice(x3)
numDeliveries(x2)    r = 0.956, p = 0.000
gasPrice(x3)         r = 0.356, p = 0.313    r = 0.498, p = 0.143
travelTime(y)        r = 0.928, p = 0.000    r = 0.916, p = 0.000    r = 0.267, p = 0.455

numDeliveries(x2) vs milesTraveled(x1): strong correlation
gasPrice(x3) vs milesTraveled(x1), numDeliveries(x2), and travelTime(y): no significant correlation
A correlation is statistically significant when p < 0.05.
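
The correlation coefficients in this table can be reproduced from the 10 trips with NumPy, as sketched below. Instead of computing exact p values, the sketch compares each pair's t statistic with 2.306, the two-tailed 5% critical value of the t distribution with n - 2 = 8 degrees of freedom; that shortcut and all variable names are assumptions made for this example.

```python
import numpy as np

names = ["milesTraveled", "numDeliveries", "gasPrice", "travelTime"]
data = np.array([
    [89, 4, 3.84, 7.0],
    [66, 1, 3.19, 5.4],
    [78, 3, 3.78, 6.6],
    [111, 6, 3.89, 7.4],
    [44, 1, 3.57, 4.8],
    [77, 3, 3.57, 6.4],
    [80, 3, 3.03, 7.0],
    [66, 2, 3.51, 5.6],
    [109, 5, 3.54, 7.3],
    [76, 3, 3.25, 6.4],
])

# Pearson correlation matrix; each column of `data` is one variable
corr = np.corrcoef(data, rowvar=False)

n = data.shape[0]
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = n - 2 = 8

significant = set()
for i in range(len(names)):
    for j in range(i):
        r = corr[i, j]
        # t statistic for testing whether the population correlation is zero
        t_stat = r * np.sqrt((n - 2) / (1 - r ** 2))
        is_significant = abs(t_stat) > T_CRITICAL
        if is_significant:
            significant.add((names[j], names[i]))
        print(f"{names[i]} vs {names[j]}: r = {r:.3f}, t = {t_stat:.2f}, "
              f"significant at 0.05: {is_significant}")
```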

P. Correlation Summary

Correlation analysis confirms the conclusions reached by visual examination of the scatterplots.
Redundant multicollinear variables
milesTraveled and numDeliveries are highly correlated with each other and are therefore
redundant; only one should be used in the multiple regression analysis.
Non-contributing variables
gasPrice is NOT correlated with the dependent variable and should be excluded.
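
One way to see the effect of a redundant predictor numerically is to compare adjusted R^2 between a model with milesTraveled alone and a model with both milesTraveled and numDeliveries. Adjusted R^2 is not used in the source; it is introduced here only as an illustrative yardstick, and the helper adjusted_r2 is a hypothetical name.

```python
import numpy as np

miles = np.array([89, 66, 78, 111, 44, 77, 80, 66, 109, 76], dtype=float)
deliveries = np.array([4, 1, 3, 6, 1, 3, 3, 2, 5, 3], dtype=float)
travel_time = np.array([7, 5.4, 6.6, 7.4, 4.8, 6.4, 7, 5.6, 7.3, 6.4])


def adjusted_r2(columns, y):
    """Fit y on the given predictor columns (plus intercept) and return adjusted R^2."""
    n = len(y)
    p = len(columns)
    X = np.column_stack([np.ones(n)] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r_squared = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    # Penalize R^2 for the number of predictors p
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)


adj_miles = adjusted_r2([miles], travel_time)
adj_both = adjusted_r2([miles, deliveries], travel_time)

print(f"Adjusted R^2, milesTraveled only:            {adj_miles:.3f}")
print(f"Adjusted R^2, milesTraveled + numDeliveries: {adj_both:.3f}")
```

If the second figure is no higher than the first, the extra predictor is adding little beyond what milesTraveled already explains, which is the redundancy described above.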

In conclusion:

In multiple regression, a lot of preparation work must be done.

Techniques used: scatterplots, correlation analysis, individual/group regressions
