FORECASTING ENGINEERING
Chapter 7:
Multiple Regression
INSTRUCTOR:
HCMUT-Vietnam
Department of Industrial Systems Engineering
Simple linear regression vs. multiple linear regression
• Simple linear regression: models the relationship between a single independent variable (the predictor) and a dependent variable (the response).
• Multiple linear regression: models the relationship between more than one independent variable and a dependent variable in order to predict its future values.
– It predicts the dependent variable more accurately.
– It better expresses real-life forecasting situations, in which one factor may be influenced by multiple factors.
Multiple Regression Analysis
Statistical model between the response Y and the independent variables X1, X2, ..., Xk:

Yi = β0 + β1Xi1 + β2Xi2 + ... + βkXik + εi

For the ith observation we have the set of values Xi1, Xi2, ..., Xik and Yi.
ε: the deviations of the observed responses from the fitted regression relationship (~ the residuals).
Multiple Regression Analysis
• The least squares criterion is used to develop this
equation.
• Because determining b1, b2, etc. is very tedious, a
software package such as Excel or MINITAB is
recommended.
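A minimal sketch of such a least-squares fit in Python with statsmodels (the data are made up for illustration):

```python
import numpy as np
import statsmodels.api as sm

# Illustrative data: response y and two predictors x1, x2
rng = np.random.default_rng(0)
x1 = rng.uniform(0, 10, 30)
x2 = rng.uniform(0, 5, 30)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(0, 1, 30)

X = sm.add_constant(np.column_stack([x1, x2]))  # adds the intercept column
model = sm.OLS(y, X).fit()                      # least-squares estimates b0, b1, b2
print(model.params)     # the fitted coefficients
print(model.summary())  # full regression table (R2, F, t-tests, ...)
```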
Regression Plane for a 2-Independent
Variable Linear Regression Equation
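The plane itself is not reproducible here; as a standard result (not spelled out on the slide), the fitted equation for two independent variables describes a plane:

\hat{Y} = b_0 + b_1 X_1 + b_2 X_2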
R²

R² = (Explained Variation in Y) / (Total Variation in Y)
Coefficient of Determination
For multiple regression:
• R² = 1: all of the variability in Y is explained when the Xs are known; the sample data points all lie on the fitted regression surface.
• R² = 0: none of the variability in Y is explained by the Xs.
Independent Variables
• Qualitative variables
– Categorical: categorical variables are also called qualitative variables, attribute variables, or nominal-scale variables, such as:
• gender (male and female).
– Ordinal: an ordinal variable is similar to a categorical variable, but its levels have a natural order, such as:
• economic status (low, medium, and high).
• educational experience (1, 2, 3, and 4 ~ elementary school, high school, some college, and graduate).
• Likert scale ("strongly agree", "agree", "neutral", "disagree", and "strongly disagree"). Ref: https://www.extension.iastate.edu/Documents/ANR/LikertScaleExamplesforSurveys.pdf
– To use a qualitative variable in regression analysis, we use a scheme of dummy variables, in which one of the two possible conditions is coded 0 and the other 1, as sketched below.
• Quantitative variables:
– The values of a quantitative variable are numbers that usually represent a count or a measurement.
• Both categorical data and quantitative data can be used for exploring a single subject. Categorical variables are often used to group or subset the data in graphs or analyses.
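A minimal sketch of dummy coding in Python (the data frame and column names are illustrative, not from the lecture):

```python
import pandas as pd

# Illustrative data: a qualitative predictor "gender" and a response "salary"
df = pd.DataFrame({
    "gender": ["male", "female", "female", "male"],
    "salary": [15, 12, 13, 16],
})

# drop_first=True keeps a single 0/1 dummy per qualitative variable:
# one condition is coded 0 (the reference) and the other 1
dummies = pd.get_dummies(df["gender"], drop_first=True, dtype=int)
df = pd.concat([df, dummies], axis=1)
print(df)  # the new "male" column can now enter the regression as a predictor
```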
Examples of categorical variables:
• Numeric: Gender (1=Female, 2=Male); Survey results (1=Agree, 2=Neutral, 3=Disagree)
• Text: Payment method (Cash or Credit); Machine settings (Low, Medium, High); Product types (Wood, Plastic, Metal)
• Date/time: Days of the week (Monday, Tuesday, Wednesday); Months of the year (January, February, March)

Examples of quantitative variables:
• Numeric: Number of customer complaints; Proportion of customers eligible for a rebate; Fill weight of a cereal box
• Date/time: Date and time payment is received; Date and time of technical support incident
Coding for categorical variables
• To recode the categorical predictors, the following methods compare the levels of the predictor to the overall mean or to the mean of a reference level (see the sketch below):
– (−1, 0, +1): choose this to estimate the difference between each level mean and the overall mean.
– (1, 0): choose this to estimate the difference between each level mean and the reference level's mean. If you choose the (1, 0) coding scheme, the reference level table becomes active in the dialog box.
• The coding scheme does not change the test of the overall effect of the predictor.
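A minimal sketch of the two coding schemes in Python with statsmodels formulas (the data are made up; patsy's Treatment contrast corresponds to the (1, 0) scheme and its Sum contrast to the (−1, 0, +1) scheme):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: a three-level categorical predictor and a numeric response
df = pd.DataFrame({
    "y":     [10, 12, 9, 14, 11, 13, 8, 15],
    "level": ["A", "B", "A", "C", "B", "C", "A", "C"],
})

# (1, 0) coding: each level mean compared to the reference level's mean ("A")
m_ref = smf.ols("y ~ C(level, Treatment(reference='A'))", data=df).fit()

# (-1, 0, +1) coding: each level mean compared to the overall mean
m_sum = smf.ols("y ~ C(level, Sum)", data=df).fit()

print(m_ref.params)
print(m_sum.params)
```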
[Minitab dialog: the first column of the coding table shows the names of all the categorical predictors in the model; this column does not take any input.]
• For predictors with (1, 0) coding, by default, Minitab sets
the following reference levels based on the data type:
– For numeric categorical predictors, the reference level is the level with
the least numeric value.
– For date/time categorical predictors, the reference level is the level with
the earliest date/time.
– For text categorical predictors, the reference level is the level that is first
in value order, which is alphabetical order, by default.
• For predictors with (−1, 0, +1) coding, by default, Minitab
sets the following reference levels based on the data
type:
– For numeric categorical predictors, the reference level is the level with the largest
numeric value.
– For date/time categorical predictors, the reference level is the level with the latest
date/time.
– For text categorical predictors, the reference level is the level that is last in
alphabetical order.
Multicollinearity
• The relation between X and Z (or c) in the previous example is called multicollinearity.
• Multicollinearity is the situation in which independent
variables in a multiple regression equation are highly
intercorrelated. That is, a linear relation exists between two or
more independent variables.
• Correlated independent variables make it difficult to make
inferences about the individual regression coefficients (slopes)
and their individual effects on the dependent variable (Y).
• However, correlated independent variables do not affect a
multiple regression equation’s ability to predict the dependent
variable (Y).
Multicollinearity
• "How much multicollinearity is in a regression analysis" is measured by the Variance Inflation Factor (VIF):

VIFj = 1 / (1 − R²j)

where R²j is the coefficient of determination obtained by regressing the jth IV on the remaining (k − 1) IVs.
• VIFj near 1 ~ R²j = 0:
– the jth IV is NOT related to the remaining IV(s);
– the coefficient of the jth IV does not change when other IV(s) are added to or removed from the model.
• VIFj > 1 ~ 0 < R²j < 1:
– the jth IV IS related to the remaining IV(s);
– a large VIF indicates redundant information among the predictor variables;
– it is difficult to interpret the effects of the IV(s) on the response;
– solutions? (See Page 287.) A sketch of computing VIFs follows below.
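A minimal sketch of computing VIFs in Python with statsmodels (the data are made up; x2 is deliberately constructed to be nearly collinear with x1):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, 50)
x2 = 2 * x1 + rng.normal(0, 0.5, 50)   # nearly collinear with x1 -> large VIF
x3 = rng.uniform(0, 10, 50)            # unrelated predictor -> VIF near 1

X = sm.add_constant(np.column_stack([x1, x2, x3]))
# VIF for each predictor (index 0 is the intercept, so start at 1)
for j in range(1, X.shape[1]):
    print(f"VIF for x{j}: {variance_inflation_factor(X, j):.2f}")
```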
• A VIF value greater than 5 suggests that the
regression coefficient is poorly estimated due to
severe multicollinearity.
• VIF and the status of the predictor:
– VIF = 1: not correlated
– 1 < VIF < 5: moderately correlated
– VIF > 5: highly correlated
Collinearity vs. Interaction
• Interaction terms can be added to the model to investigate whether two (or more) independent variables have a combined effect on the response Y.
– Interactions are commonly used when categorical factors are present.
• Example: the rate of response of Y (income) to a second factor X2 (years of education) according to the categorical X1 (gender, 0 = male, 1 = female).
• With Y = β0 + β1X1 + β2X2, the model only accounts for females earning a fixed amount more or less than males, with a separate term accounting for educational differences regardless of gender.
• If we add a third interaction variable X1X2, so that Y = β0 + β1X1 + β2X2 + β3(X1X2), this third term will be zero for males but non-zero for females, thus representing the specific variable "female years of education" and allowing the model to account separately for its effect on income (β2 becomes the slope of income with respect to education for males, and β3 is an adjustment to that slope for females).
• This is an interaction, but it is not collinear.
Collinearity vs. Interaction
• Example: the rate of response of Y (income) to a second factor X2 (years of education) according to the categorical X1 (gender, 0 = male, 1 = female).

Y (millions)  X1  X2
15            0   4
12            1   2

• Without interaction, Y = β0 + β1X1 + β2X2:
– Male 1: Y1 = β0 + β1(0) + β2(4) = β0 + β2(4)
– Female 1: Y2 = β0 + β1(1) + β2(2)
• With interaction, Y = β0 + β1X1 + β2X2 + β3(X1X2):
– Male 1: Y1 = β0 + β2(4)
– Female 1: Y2 = β0 + β1(1) + β2(2) + β3(2)
• This is an interaction, but it is not collinear. A fitted sketch follows below.
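A minimal sketch of fitting the interaction model in Python (the income data are made up; in the statsmodels formula, gender * education expands to gender + education + gender:education):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Illustrative data: income (Y), gender (X1: 0=male, 1=female), years of education (X2)
df = pd.DataFrame({
    "income":    [15, 12, 18, 14, 20, 16],
    "gender":    [0, 1, 0, 1, 0, 1],
    "education": [4, 2, 6, 4, 8, 6],
})

# Fits Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2)
model = smf.ols("income ~ gender * education", data=df).fit()
print(model.params)  # b3 adjusts the education slope for females
```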
Variance remedial measures
• If variances are not equal, there can be serious consequences for the results of the ANOVA.
• If the violation is mild, it will not be that important.
• If it is significant, it can cause the ANOVA results to be wrong or uninterpretable.
• If variances are not equal:
– Check for outliers.
– Transform the data (y), as sketched below:
• When the variance is proportional to the mean, use sqrt(y).
• When the standard deviation is proportional to the mean, use log(y).
• When the standard deviation is proportional to mean², use 1/y.
• Read https://onlinecourses.science.psu.edu/stat501/node/318/ for more details.
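A minimal sketch of these variance-stabilizing transformations in Python (y is made-up positive data; which transform applies depends on the variance pattern listed above):

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.gamma(shape=2.0, scale=3.0, size=100)  # illustrative positive response

y_sqrt  = np.sqrt(y)   # variance proportional to the mean
y_log   = np.log(y)    # standard deviation proportional to the mean
y_recip = 1.0 / y      # standard deviation proportional to mean^2
```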
• See Example 7.1
• See Example Salary vs (X1, X2, X3, X4, X5, X6)
What does this mean?
R² = SSR / SST = 1 − SSE / SST

R²(adj) = 1 − (SSE / df_e) / (SST / df_t) = 1 − (SSE · df_t) / (SST · df_e) = 1 − MSE / MST
What does this mean?
The annotated ANOVA table of the regression output decomposes SST = SSR + SSE, with:
• Regression DF = number of predictors (k)
• Error DF = the difference, n − 1 − k
• Total DF = number of observations in the sample − 1
• MSR = SSR / Regression DF
• MSE = SSE / Error DF
• F = MSR / MSE
• s = √MSE
Systematic methods
• These p-values suggest that the corresponding coefficients in this model might be zero, which makes one question whether those predictors should be in the model or not.
• DO NOT discard predictors based on their p-values alone. The models are very sensitive to the inclusion or exclusion of predictors. Use a systematic method to determine which predictors to use.
• Best subsets: compares all combinations of the n predictors and outputs the best.
• Stepwise: takes the best predictor first, then adds (and subtracts) others (see Excel file).
Systematic methods
• Forward selection: takes the best predictor, then adds more one at a time (see Excel file); a minimal sketch follows below.
• Backward elimination: starts with all predictors, then subtracts them one at a time.
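A minimal sketch of forward selection in Python, using adjusted R² as the entry criterion (the data and column names are made up; real implementations often use p-values or AIC instead):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
df = pd.DataFrame(rng.normal(size=(60, 4)), columns=["x1", "x2", "x3", "x4"])
y = 1.0 + 2.0 * df["x1"] - 1.5 * df["x3"] + rng.normal(0, 1, 60)

selected, remaining = [], list(df.columns)
best_adj_r2 = -np.inf
while remaining:
    # Try adding each remaining predictor; keep the one that helps most
    scores = {c: sm.OLS(y, sm.add_constant(df[selected + [c]])).fit().rsquared_adj
              for c in remaining}
    best = max(scores, key=scores.get)
    if scores[best] <= best_adj_r2:
        break  # no candidate improves adjusted R2 -> stop
    best_adj_r2 = scores[best]
    selected.append(best)
    remaining.remove(best)

print("Selected predictors:", selected)
```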
• What is the best model?
• It depends!
– You typically have a purpose in creating a regression model. Usually the regression is meant to capture some cause-and-effect relationship. Because of that, you want the model that accurately reflects which x variables cause a change in the y variable, even if that means larger residuals.
– Sometimes you want only the best predictors, sometimes you want all the good predictors, sometimes you want a certain number of predictors ... and so on.
Systematic methods
• Stepwise regression procedures are controversial.
• Most of the time, they will give you similar answers.
• Forward selection was once preferred because it was the simplest to perform, which was a consideration in the 1960s and 70s due to a lack of computing power.
• Best subsets can be computationally intensive for large
models (ones with lots of predictors), but generally gives
the most defensible answer.
• When in doubt, use best subsets. If it takes too long to
run, use stepwise.
Best subsets regression procedure
• Step #1. Identify all of the possible regression models.
• Step #2. Determine the k-predictor models that do the "best" at meeting some well-defined criteria (k = 1, 2, 3, ...):
– Mallows' Cp statistic (a measure of model bias): identify subsets of predictors for which the Cp value is near p (if possible) or smallest (see the formula below).
– If more than one model has a small Cp value near p, in general choose the simpler model.
– Large R²(adj), or large predicted R², and small s.
• Step #3. Further evaluate and refine the handful of candidate models.
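The slides do not show the statistic itself; the standard definition of Mallows' Cp for a subset model with p parameters (including the intercept), where SSE_p is that model's error sum of squares, MSE_full is the mean squared error of the model with all candidate predictors, and n is the number of observations, is:

C_p = \frac{SSE_p}{MSE_{full}} - (n - 2p)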
The Assumptions of Multiple Regression
1. There is a linear relationship. That is, there is a straight-line
relationship between the dependent variable and the set of
independent variables.
2. The variation in the residuals is the same for both large and small values of the estimated Y. To put it another way, the size of the residual is unrelated to whether the estimated Y is large or small.
3. The residuals follow the normal probability distribution.
4. The independent variables should not be correlated. That is,
we would like to select a set of independent variables that are
not themselves correlated.
5. The residuals are independent. This means that successive
observations of the dependent variable are not correlated. This
assumption is often violated when time is involved with the
sampled observations.
Four Quick Checks (Multiple Regression)
1) Does the model make sense (i.e., check slope terms, F test)?
2) Is there a statistically significant relationship between the dependent and
independent variables (t-test)?
3) What percentage of the variation in the dependent variable does the
regression model explain (R-Square)?
4) Do the residuals violate the assumptions (Analysis of Residuals)? A sketch of these checks follows below.
1. zero mean
2. normally distributed
3. homoscedastic (constant variance)
4. mutually independent (non-autocorrelated): is there a problem of serial correlation among the error terms in the model (Durbin-Watson)?
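A minimal sketch of the four residual checks in Python (the fitted model here is a made-up stand-in for any earlier statsmodels OLS fit):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

# Illustrative fit (stands in for whatever model is being diagnosed)
rng = np.random.default_rng(4)
X = sm.add_constant(rng.normal(size=(50, 2)))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(0, 1, 50)
model = sm.OLS(y, X).fit()
resid = model.resid

print("1) mean of residuals:", resid.mean())                        # should be ~0
print("2) Shapiro-Wilk normality p-value:", stats.shapiro(resid).pvalue)
print("3) Breusch-Pagan p-value:", het_breuschpagan(resid, model.model.exog)[1])
print("4) Durbin-Watson statistic:", durbin_watson(resid))          # ~2 = no autocorrelation
```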
[Figure: Minitab four-in-one residual plots for salary (×1,000,000) — Normal Probability Plot, Residuals Versus Fits, Histogram of the Residuals, and Residuals Versus Observation Order.]
• The correlation between two variables can be visualized by creating a scatterplot of the data: Y against X.
– If a non-linear pattern is detected, a transformation X′ (of the variable X), Y′ (of the variable Y), or both can often significantly improve the correlation.
• A residual plot can reveal whether a data set follows a random pattern.
– If a non-random pattern is detected, transforming the raw data to make it more linear can significantly improve the fit between X and Y.
[Figure: residual plot — the individual errors plotted against the predicted values.]
If the scatterplot of the raw data (X, Y) looks like the one shown in Figure (a), transform (X, Y) to (X′, Y′) so that the scatterplot looks more like the one displayed in Figure (b).
Figure (a) Raw Data, (b) Transformed Data
• Exponential model: apply a logarithmic transformation to the dependent variable y, as shown in Figure (b).
• Quadratic model: if the trend in the data follows the pattern shown in Figure (a), we can take the square root of y to get y′ = √y.
• Reciprocal model: a trend in the raw data as shown in Figure (a) suggests a reciprocal transformation, i.e. y′ = 1/y.
• Logarithmic model: if the raw data follow a trend as shown in Figure (a), a logarithmic transformation can be applied to the independent variable x: x′ = log(x); alternatively, a logarithmic transformation can be applied to the dependent variable y: y′ = log(y), when the roles of x and y in Figure (a) are reversed. A sketch of these transformations follows below.
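A minimal sketch of applying a linearizing transformation and refitting in Python (x and y are made-up data with a roughly exponential trend; the other transformations from the preceding slides are shown alongside):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = np.linspace(1, 10, 40)
y = 2.0 * np.exp(0.3 * x) * rng.lognormal(0, 0.1, 40)  # exponential-looking data

# Exponential model: regress y' = log(y) on x
fit_exp = sm.OLS(np.log(y), sm.add_constant(x)).fit()
print("R2 of the log(y) ~ x fit:", fit_exp.rsquared)

# The other transformations, applied the same way before refitting:
y_sqrt  = np.sqrt(y)   # quadratic model
y_recip = 1.0 / y      # reciprocal model
x_log   = np.log(x)    # logarithmic model
```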
• The Box-Cox transformation constitutes another particularly useful family of transformations, which is applied to the independent variable in most cases:
– T(X) = (X^λ − 1) / λ, if λ ≠ 0
– T(X) = log(X), if λ = 0
– where X is the variable being transformed and λ is referred to as the transformation parameter.
• The optimal value of λ is the value of λ corresponding to the maximum correlation.
• The Box-Cox transformation can also be applied to the Y variable. A sketch follows below.
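A minimal sketch with scipy, which picks the λ that maximizes the log-likelihood (the data are made up; the variable being transformed must be positive):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.lognormal(mean=1.0, sigma=0.6, size=200)  # skewed, positive data

# boxcox returns the transformed values and the optimal lambda
x_transformed, lam = stats.boxcox(x)
print("optimal lambda:", lam)
```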
Letting Minitab calculate the optimal lambda should produce the best-fitting results.
Durbin-Watson tests
• The Durbin-Watson test checks for autocorrelation in the residuals from a regression analysis.
• Method 1:
– The test statistic ranges from 0 to 4.
– A value of 2 indicates that there is no autocorrelation. A value nearing 0 (i.e., below 2) indicates positive autocorrelation, and a value towards 4 (i.e., over 2) indicates negative autocorrelation.
– The inconclusive region of the test can be narrowed by increasing the sample size.
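The slides do not show the statistic itself; the standard Durbin-Watson statistic, computed from the residuals e_t, is:

d = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}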
• The Durbin-Watson test is based on the assumption that the errors in the regression model are generated by a first-order autoregressive process observed at equally spaced time periods, that is,
εt = ρεt−1 + at
where εt is the error term in the model at time period t, at is an NID(0, σ²a) random variable, and ρ (|ρ| < 1) is the autocorrelation parameter.
• A simple linear regression model with first-order autoregressive errors:
yt = β0 + β1xt + εt
εt = ρεt−1 + at
• Most regression problems involving time series data exhibit positive autocorrelation, so the hypotheses usually considered in the Durbin-Watson test are
H0: ρ = 0
H1: ρ > 0
– If d < dL, reject H0: ρ = 0.
– If d > dU, do not reject H0: ρ = 0.
– If dL < d < dU, the test is inconclusive.
A sketch of applying this decision rule follows below.
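A minimal sketch of computing d and applying the decision rule in Python (the residuals are made up, and the critical values dL and dU are hypothetical placeholders; in practice they are looked up in a Durbin-Watson table for the given n, number of predictors, and significance level):

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(7)
resid = rng.normal(size=30)   # illustrative residuals from some fitted model

d = durbin_watson(resid)

# Hypothetical table values (look up dL, dU for your n and number of predictors)
dL, dU = 1.35, 1.49

if d < dL:
    print(f"d = {d:.2f} < dL: reject H0 -> positive autocorrelation")
elif d > dU:
    print(f"d = {d:.2f} > dU: do not reject H0")
else:
    print(f"dL <= d = {d:.2f} <= dU: the test is inconclusive")
```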
Assignment