Excel Regression

The document discusses spreadsheet problem solving techniques including fitting linear, multilinear, polynomial, and nonlinear regression models to data. It provides examples of using Excel's Data Analysis Regression tool and Trendline feature to analyze straight-line, polynomial, and multi-linear regression models. It also discusses using the Solver tool to perform nonlinear regression to fit parameters in the van der Waals equation of state to pressure-volume data.

Uploaded by

Steve Wan


Spreadsheet Problem Solving

fitting models to data

straight-line regression
multilinear regression
nonlinear regression
model building and selection
using
Data Analysis Regression tool
Trendline
Solver

Review of Straight-line Linear Regression

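[ from Class #6 ]

Model: y = ax + b

[Figure: data points and the straight-line model on x-y axes, with the error e_i marked as the vertical distance between a data point y_i and the model line at x_i]

For each data point, there is an error between that point and the model line. Fitting the model has to do with minimizing these errors.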

Finding the model parameters that give the best fit

For the straight-line model, the model parameters are
the slope (a) and the intercept (b).
The problem is then to find the values of a and b that
give the best fit. What is meant by the best fit?
The standard measure of goodness of fit is the sum
of squares of the errors:
$$\mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2, \qquad \hat{y}_i = a x_i + b$$

So, the problem reduces to finding the minimum of SSE by adjusting a and b.
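As a minimal sketch of this goodness-of-fit measure, the function below computes SSE for a candidate slope and intercept. The function name `sse` and the six data points are assumptions made up for illustration; they are not the data used in these slides.

```python
def sse(a, b, xs, ys):
    """Sum of squared errors between measured ys and the line a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for illustration (not the slide data).
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

# Two candidate lines: the one with the smaller SSE fits better.
print(sse(2.0, 0.0, xs, ys))
print(sse(1.5, 1.0, xs, ys))
```

Fitting the model then amounts to searching for the pair (a, b) that makes this number as small as possible.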

Fitting a straight-line model to data

The minimization of SSE can be solved by calculus
to give formulas for the best values of a and b:

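$$a = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}, \qquad b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}$$

and Excel solves problems like this with either formulas or built-in tools (Data Analysis Regression & Trendline).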
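The calculus step can be spelled out as a short derivation. This is the standard least-squares argument, written with the same symbols as the SSE definition (slope a, intercept b, data points x_i, y_i); it is offered as a sketch of the reasoning, not as the exact derivation used in class. Setting both partial derivatives of SSE to zero gives two linear equations in a and b:

```latex
\begin{aligned}
\frac{\partial\,\mathrm{SSE}}{\partial a} &= -2 \sum_{i=1}^{n} x_i \left( y_i - a x_i - b \right) = 0
  &&\Longrightarrow\quad a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i \\
\frac{\partial\,\mathrm{SSE}}{\partial b} &= -2 \sum_{i=1}^{n} \left( y_i - a x_i - b \right) = 0
  &&\Longrightarrow\quad a \sum_{i=1}^{n} x_i + n\,b = \sum_{i=1}^{n} y_i
\end{aligned}
```

Solving the second equation for b and substituting into the first yields the closed-form expressions for the slope and intercept.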

Example: straight-line fit

Transfer the data to an Excel spreadsheet and create a graph

[Figure: CO2 Emissions for the US; y-axis: CO2 Emissions (MMT C), 1320 to 1520; x-axis: Year, 1989 to 2000]

Calculating the slope and intercept using Excel formulas

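$$a = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}, \qquad b = \frac{\sum_{i=1}^{n} y_i - a \sum_{i=1}^{n} x_i}{n}$$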
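The same sums can be sketched outside the spreadsheet. The snippet below is an illustrative translation of the slope and intercept formulas into Python; the function name `fit_line` and the data are assumptions invented for this example, since the CO2 values themselves are not reproduced in the text.

```python
def fit_line(xs, ys):
    """Closed-form least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    b = (sum_y - a * sum_x) / n
    return a, b

# Hypothetical data, invented for illustration (not the CO2 data).
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

a, b = fit_line(xs, ys)
print(f"slope = {a:.4f}, intercept = {b:.4f}")
```

In a worksheet, each of the four sums would typically occupy its own cell, with the slope and intercept cells referring to them.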

The formulas behind the numbers

Using the model straight-line equation to compute the predictions, and copy these to the graph, displaying as a straight line

[Figure: CO2 Emissions for the US, with the fitted line y = 21.32x - 41090; y-axis: CO2 Emissions (MMT C), 1300 to 1550; x-axis: Year, 1989 to 2000]

Using an alternate, shortcut approach

Trendline

Start with a simple graph of the data

Select the data series by clicking on it
Right-click on a data point to get context-sensitive menu
Select Add Trendline option

[Figure: CO2 Emissions for the US; y-axis: CO2 Emissions (MMT C), 1320 to 1520; x-axis: Year, 1989 to 2000]

The Add Trendline dialog box

Linear selected by default: OK for this problem
Click on Options tab

Options tab

Set for Display equation on chart
Click OK
Fix up equation display

Initial form of graph with straight-line added

[Figure: CO2 Emissions for the US, with the trendline and its equation, y = 21.315x - 41090, displayed on the chart; y-axis: CO2 Emissions (MMT C), 1300 to 1550; x-axis: Year, 1989 to 2000]

Looks just like before, but we got there quicker.

But neither of these approaches gives us much information about the model, how good it is, etc.

A 2nd alternate approach

Tools > Data Analysis

Data Analysis Regression tool

Recall that, if Data Analysis does not appear on the Tools menu, you will need to check Analysis ToolPak in the Add-ins dialog box [if it's not there, you will have to go back to Microsoft Office/Excel set-up].

Initial, empty Regression dialog box

Regression dialog box set up for our problem

Checking Residuals will also give us model predictions.

Initial (poorly formatted) Regression output display

[ on new worksheet ]

Format > AutoFormat > OK, and fix up display for appropriate significant figures

Final Display of Regression Output

[ tons of info, most of which you will not understand for a couple years ]

Regression Statistics: used to judge goodness of fit
Coefficients: intercept and slope values
P-values: used to judge whether terms belong in the model
Predicted values: add to data graph for visual comparison with model

Judging Goodness of Fit

correlation coefficient: if close to +1 or -1, indicates strong correlation between x and y [something we already know from the original graph!]
coefficient of determination (R2): percentage of the variability in y that's accounted for by the model
Adjusted R2: adjustment to R2 that penalizes the value for using a model with too many terms
Standard Error: gives an idea of how far off the model predictions will be

Adjusted R2 or Standard Error can be used to compare different models and choose which fits best. The higher the value of Adjusted R2 the better, the lower the value of Standard Error the better.
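To make these quantities concrete, here is a small sketch that computes R2, Adjusted R2, and Standard Error for a straight-line fit using their usual textbook definitions. The function name `goodness_of_fit`, the parameter `n_terms` (number of x terms, excluding the intercept), and the data are assumptions invented for illustration.

```python
import math

def goodness_of_fit(xs, ys, n_terms=1):
    """R2, adjusted R2, and standard error for a least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Least-squares slope and intercept (centered form of the usual formulas).
    a = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    b = y_mean - a * x_mean
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_mean) ** 2 for y in ys)  # total variability in y
    r2 = 1.0 - sse / sst
    dof = n - n_terms - 1  # degrees of freedom left after fitting
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / dof
    std_err = math.sqrt(sse / dof)
    return r2, adj_r2, std_err

# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

r2, adj_r2, std_err = goodness_of_fit(xs, ys)
print(f"R2 = {r2:.5f}, adjusted R2 = {adj_r2:.5f}, standard error = {std_err:.4f}")
```

Adjusted R2 comes out slightly below R2 because of the penalty for the fitted term, and the standard error carries the same units as y.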

Judging whether terms belong in the model

A P-value is the probability of getting a coefficient estimate at least this far from zero if the true value of the coefficient were zero.

A P-value of 5% (0.05) or greater causes suspicion that the coefficient may not be significant and that the term should probably be dropped from the model.

P-values that are quite small, like these, indicate that there is little question about the significance of the term coefficients. In our case here, that means that both the intercept term and the slope term belong in the model.
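The P-values in the regression output are derived from t statistics: each coefficient divided by its standard error. The sketch below computes those t statistics for a straight-line fit using the standard formulas; it stops short of the P-value itself, which additionally needs the t distribution with n - 2 degrees of freedom. The function name `t_statistics` and the data are assumptions invented for illustration.

```python
import math

def t_statistics(xs, ys):
    """t statistics (coefficient / standard error) for slope and intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    intercept = y_mean - slope * x_mean
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # standard error of the regression
    se_slope = s / math.sqrt(sxx)
    se_intercept = s * math.sqrt(1.0 / n + x_mean ** 2 / sxx)
    return slope / se_slope, intercept / se_intercept

# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

t_slope, t_intercept = t_statistics(xs, ys)
print(f"t (slope) = {t_slope:.2f}, t (intercept) = {t_intercept:.2f}")
```

For this made-up data the slope has a very large t statistic (tiny P-value), while the intercept's t statistic is below 1 (large P-value), which is the situation where dropping the intercept term would be considered.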

The Data Analysis Regression tool appears much more complicated and involved than the shortcut Trendline tool, so . . .
Why use Data Analysis Regression?
1) It provides more information that lets us judge the goodness of fit and significance of model terms
2) It can handle model forms that cannot be handled by Trendline
So, generally, when using Excel, we prefer the Data Analysis Regression tool over Trendline, but Trendline is still quite good for quick and dirty looks at the data.
Learn to use both!

More complicated models

Polynomial models

$$y = a + bx + cx^2 + dx^3 + \cdots$$

Note: it is called linear regression, even when there are nonlinear terms in x, because the terms are linear in the model parameters, a, b, c, etc.

General linear models

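$$y = a\,f_1(x) + b\,f_2(x) + c\,f_3(x) + d\,f_4(x) + \cdots$$

Examples:

polynomial models above

$$y = a + b\,\frac{1}{x} + c \ln x$$

Multilinear models

$$y = a\,f_1(x_1, x_2, \ldots) + b\,f_2(x_1, x_2, \ldots) + c\,f_3(x_1, x_2, \ldots) + \cdots$$

Examples:

$$y = a + b x_1 + c x_2 + d x_1 x_2$$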

$$y = a\,e^{x_1 / x_2}$$

Nonlinear models
Transformable to linear

$$y = a\,e^{b x} \quad\Longrightarrow\quad \ln y = \ln a + b\,x$$

straight-line regression!

Not transformable

$$P = 10^{\,A - \frac{B}{T + C}}$$

We can use the Data Analysis Regression tool for everything except the nonlinear models that can't be transformed into linear. For those, we can use the Solver.
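A brief sketch of the transformable case: for the exponential model y = a e^(bx), taking logarithms turns the fit into a straight-line regression of ln y on x, whose slope is b and whose intercept is ln a. The function name `fit_exponential` and the data (generated from a = 2, b = 0.5) are assumptions chosen for illustration.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a*exp(b*x) by straight-line regression of ln(y) on x."""
    ln_ys = [math.log(y) for y in ys]
    n = len(xs)
    x_mean = sum(xs) / n
    ln_y_mean = sum(ln_ys) / n
    b = sum((x - x_mean) * (ly - ln_y_mean) for x, ly in zip(xs, ln_ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    ln_a = ln_y_mean - b * x_mean  # intercept of the transformed line
    return math.exp(ln_a), b

# Hypothetical, noise-free data generated from a = 2, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

a, b = fit_exponential(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

One caveat with real, noisy data: the transformed fit minimizes the squared errors in ln y, not in y, so its parameters can differ somewhat from a direct nonlinear fit.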

Example: polynomial regression

[Figure: Viscosity of Water at Atmospheric Pressure; y-axis: Viscosity (cp), 0 to 2; x-axis: Temperature (degF); curvature evident in the data]

Setting up for polynomial fits

Select for quadratic model, etc

Data Analysis Regression tool

check Labels because headings are included in selections for Y and X
check Residuals

Quadratic model regression results

model performance: adjR2
model coefficients: copy to graph

Quadratic model really doesn't capture behavior of data

[Figure: Viscosity of Water at Atmospheric Pressure; y-axis: Viscosity (cp), 0 to 2; x-axis: Temperature (degF); series: Data, Quadratic]

Continue with fits of cubic, 4th- & 5th-order polynomials

Summary of results

Looks like 5th-order offers best performance, but improvement is marginal over 4th-order.
Resulting model:
$$\mathrm{Visc} = 3.161 - 0.05699\,T + 5.023 \times 10^{-4}\,T^2 - 2.162 \times 10^{-6}\,T^3 + 3.593 \times 10^{-9}\,T^4$$
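As a quick check of the 4th-order model, the sketch below evaluates it at a few temperatures. The coefficient magnitudes are the ones printed on the slide; the alternating signs and negative exponents are an assumption, since they did not survive in the extracted text, chosen so that viscosity decreases with temperature as the data plot shows. The function name `viscosity_cp` is also invented for this example.

```python
# Coefficients of the 4th-order model, constant term first.
# Signs and exponents are assumed (see the note above), magnitudes are from the slide.
COEFFS = [3.161, -0.05699, 5.023e-4, -2.162e-6, 3.593e-9]

def viscosity_cp(temp_f):
    """Evaluate the polynomial at temp_f (degF) using Horner's scheme."""
    result = 0.0
    for c in reversed(COEFFS):
        result = result * temp_f + c
    return result

for temp_f in (50, 100, 150, 200):
    print(temp_f, round(viscosity_cp(temp_f), 3))
```

With these signs the model gives roughly 0.68 cp at 100 degF, falling to about 0.31 cp at 200 degF.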

[Figure: Viscosity of Water at Atmospheric Pressure; y-axis: Viscosity (cp), 0 to 2; x-axis: Temperature (degF); series: Data, Quadratic, Cubic, 4th Order]

Precautions on polynomial fitting

Try to use the lowest-order model that gives a good fit.
Higher-order models will have wiggles between data
points that will cause prediction errors.
In fact, an (n-1)th-order polynomial will provide a perfect
fit to the n data points, but it will usually do bizarre things
in between the data points.
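The "perfect fit, bizarre in between" behavior can be demonstrated with a small sketch. It passes a 10th-order polynomial exactly through 11 equally spaced samples of a smooth test curve, then measures how far the polynomial strays from the curve between the samples. The test curve 1/(1 + 25x^2), the Lagrange-form evaluation, and all names are assumptions chosen for illustration; none of this comes from the slides.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the unique (n-1)th-order polynomial through n points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def smooth_curve(x):
    """Illustrative smooth function standing in for the 'true' behavior."""
    return 1.0 / (1.0 + 25.0 * x * x)

n = 11
xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
ys = [smooth_curve(x) for x in xs]

# Error at the data points themselves: essentially zero (a "perfect" fit).
error_at_data = max(abs(lagrange_eval(xs, ys, x) - y) for x, y in zip(xs, ys))

# Error on a fine grid between the data points: large wiggles near the ends.
grid = [-1.0 + 2.0 * k / 400 for k in range(401)]
error_between = max(abs(lagrange_eval(xs, ys, x) - smooth_curve(x)) for x in grid)

print(f"max error at data points: {error_at_data:.2e}")
print(f"max error between points: {error_between:.2f}")
```

The second number is larger than the entire range of the data, which is the kind of prediction error the precaution warns about.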

Example: multi-linear regression

Model 1: $y = a + b\,x_1 + c\,x_2$

Model 2: $y = b\,x_1 + c\,x_2$

X-input range includes two independent variables: x1 and x2

High P value for intercept in Model 1 suggests Model 2 without intercept, but there is a significant loss in adjR2
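A rough sketch of what the regression tool does for these two models: build the normal equations for the chosen columns and solve them, once with a column of ones (Model 1, with intercept) and once without (Model 2). The helper names `solve` and `fit_linear` and the data (generated exactly from y = 1 + 2 x1 + 3 x2) are assumptions invented for illustration; Excel's internal algorithm may differ.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_linear(columns, ys):
    """Least-squares coefficients and SSE for y = sum(coef_k * column_k)."""
    normal = [[sum(p * q for p, q in zip(cj, ck)) for ck in columns] for cj in columns]
    rhs = [sum(p * y for p, y in zip(cj, ys)) for cj in columns]
    coefs = solve(normal, rhs)
    preds = [sum(c * col[i] for c, col in zip(coefs, columns)) for i in range(len(ys))]
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return coefs, sse

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (no noise).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
ys = [1.0 + 2.0 * u + 3.0 * v for u, v in zip(x1, x2)]
ones = [1.0] * len(ys)

coefs1, sse1 = fit_linear([ones, x1, x2], ys)  # Model 1: with intercept
coefs2, sse2 = fit_linear([x1, x2], ys)        # Model 2: without intercept
print("Model 1:", [round(c, 4) for c in coefs1], "SSE =", round(sse1, 6))
print("Model 2:", [round(c, 4) for c in coefs2], "SSE =", round(sse2, 6))
```

Because Model 2 is Model 1 with the intercept forced to zero, its SSE can never be lower than Model 1's; whether the simpler model is still preferable is what the P-value and adjR2 comparison is meant to decide.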

Multilinear Model Performance

Model performance isn't that great for either model, and Model 1 doesn't appear dramatically better than Model 2

[Figure: Predicted y vs Measured y, 0 to 12; series: Model 1, Model 2]

Note: for multi-linear models, we plot Predicted vs Measured y. A perfect model would place points directly on the 45-degree line.

Nonlinear Regression
Fitting the parameters of the van der Waals equation of state
Data for SO2

$$P = \frac{RT}{V - b} - \frac{a}{V^2}$$

Find the values of a and b that give the best predictions for P, when compared to the measured values of P

Strategy for Nonlinear Regression

1) estimate initial values for a and b
2) compute predicted P values using data for V and T
3) compute errors between predicted and measured P values
4) sum the squares of these errors to compute SSE
5) have the Solver minimize SSE by adjusting the values of a and b
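The five steps can be sketched in code. In this illustration a plain Gauss-Newton iteration stands in for Excel's Solver (which uses its own optimizer), and the b >= 0 constraint is not enforced. The "measured" pressures are synthetic values generated from assumed constants a = 0.68 and b = 5.6e-5 in SI units, so the fit should simply recover them; the real SO2 data from the slides is not reproduced here, and all names are invented for this example.

```python
R = 8.314  # gas constant, J/(mol K)

def p_vdw(a, b, v, t):
    """van der Waals pressure (Pa) for molar volume v (m^3/mol) and temperature t (K)."""
    return R * t / (v - b) - a / v ** 2

def sse(a, b, data):
    """Steps 2 to 4: predicted P, errors, and their sum of squares."""
    return sum((p - p_vdw(a, b, v, t)) ** 2 for v, t, p in data)

def gauss_newton(a, b, data, iterations=20):
    """Step 5: adjust a and b to minimize SSE (stand-in for the Solver)."""
    for _ in range(iterations):
        saa = sab = sbb = ga = gb = 0.0
        for v, t, p in data:
            r = p - p_vdw(a, b, v, t)    # error for this data point
            da = -1.0 / v ** 2           # dP/da
            db = R * t / (v - b) ** 2    # dP/db
            saa += da * da
            sab += da * db
            sbb += db * db
            ga += da * r
            gb += db * r
        det = saa * sbb - sab * sab
        a += (ga * sbb - gb * sab) / det
        b += (saa * gb - sab * ga) / det
    return a, b

# Synthetic "measurements" from assumed constants (illustration only).
A_TRUE, B_TRUE = 0.68, 5.6e-5
data = [
    (v, t, p_vdw(A_TRUE, B_TRUE, v, t))
    for t in (450.0, 550.0)
    for v in (5e-4, 1e-3, 2e-3, 4e-3)
]

a0, b0 = 0.5, 4.0e-5  # step 1: initial estimates
a_fit, b_fit = gauss_newton(a0, b0, data)
print(f"a = {a_fit:.4f}, b = {b_fit:.3e}")
print(f"SSE: {sse(a0, b0, data):.3e} -> {sse(a_fit, b_fit, data):.3e}")
```

Covering more than one temperature matters here: if T and V vary together in the data, the effects of a and b on the predicted pressure become hard to tell apart and the fit is poorly determined.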

Basic data

Calculated Pressure by both ideal gas law and van der Waals
Sum of squares of this column

Ideal Gas Calculation
van der Waals Calculation
Error Calculation
Sum of Squares Calculation

Setting up Solver Parameters

SSE as Target Cell
Minimize
by adjusting a and b
with b>=0 constraint

Results

[Figure: Fit of van der Waals Eqn for SO2 and Comparison to Ideal Gas Law; Predicted Pressure (Pa) vs Measured Pressure (Pa), 0 to 12000000; series: van der Waals, Ideal Gas]

Note departure of ideal gas predictions at higher pressures
