
Predictive Analytics

Lecture 3
Herdis Steingrimsdottir
Exam
Exam guidelines

– See guidelines on Canvas
– Individual paper, 10 pages
– Use at least 2 methods presented in lecture to produce a forecast (not including judgmental forecasts)
– Find your own data (you cannot use data that you have used in exercise classes)
Exam guidelines

– Be clear and concise
– What is your research question?
– Why are you looking at this particular question/time series (is it motivated by theory? Claims in the news? Other papers/reports?)
– What do you expect to find and why?
– Remember to describe and plot your data
– Discuss any challenges/problems in your data or your data analysis. Explain how you deal with these issues and why.
– Imagine a statistician and a CEO will read your report … it should be informative for both
Today:
– Chapter 5:
  – Forecasting workflow
  – Look at some simple forecasting methods (average, naïve, drift)
  – Fitted values and residuals
  – Point forecasts, forecast intervals, and forecast accuracy
  – Time series cross-validation
– Chapter 6:
  – Judgemental (qualitative) forecasting
A tidy forecasting workflow

Tidy = data preparation
Visualize = plot the data
Specify = define a model
Estimate = train the model
Evaluate = check model performance
Forecast
1. Tidy
– The first thing to do is always to look at your data
– Make sure that it is in the correct format
– Make sure that everything looks as it should
– Check for missing values (how will you deal with them?)
– Check for extreme values (how will you deal with them?)

2. Visualise
– Plotting the data is an essential step in understanding the data
– Plotting and summarising the data can be a part of tidying the data
– Plotting and summarising the data allows us to identify patterns and specify an appropriate model
3. Specify
– Many different time series models can be used for forecasting – we will spend much of our time over the next weeks learning about the different models
– Choosing the appropriate model is key for producing appropriate forecasts
– We choose a model or method that might best capture the patterns we see in the data – you can think of it as forming a hypothesis regarding how your data behaves
– We will see some simple examples today, e.g. a naïve model, a simple trend model, a seasonal model, etc.
4. Estimate
– Once we have chosen our model we can run some estimations on our data
– We want to estimate the parameters of our model using our data
– E.g. we have a time series with a strong trend – looking at the data we think that our variable has a linear time trend – we now use our data to estimate how much our variable grows per time unit
5. Evaluate
– We want to check whether our model does a good job of capturing the patterns in the data – or whether it needs adjustment
– This step is about checking how well our model performs
– We can compare the predictions against actual outcomes – or use statistical measures to assess accuracy
– We want to check whether our model does a good job at predicting future outcomes
6. Forecast
– When the model has been specified, estimated and evaluated it is time to produce the forecasts about the future values of the time series
– One thing we need to specify is the forecast horizon, i.e. the number of future observations to forecast
– As we have talked about, our forecasts generally become more uncertain further into the future – we will communicate this uncertainty by looking at prediction intervals

Today
– Our main focus is on steps 3-5: Specify, Estimate and Evaluate
#specify: Some simple forecasting methods
Some simple forecasting methods
– Sometimes very simple forecasting methods can be extremely effective
– We want our more complicated methods to (at least) perform better than these simple forecasting methods
– We will now look at four simple forecasting methods that we will use as benchmarks throughout the course
– I.e. we will use these benchmark methods to evaluate our forecasting models
Some simple forecasting methods
– Average (or mean) method
– Naïve method
– Seasonal naïve method
– Drift method
Average method
– The forecast of all future values is equal to the average (or mean) of the historical data
– If we have historical data denoted by $y_1, y_2, \dots, y_T$ then we can write the forecast as $\hat{y}_{T+h|T} = \bar{y} = (y_1 + \cdots + y_T)/T$
– Where $\hat{y}_{T+h|T}$ is the estimate of $y_{T+h}$ given the data that we have at time $T$, i.e., $y_1, y_2, \dots, y_T$
Average method
– Simple example: a small store is keeping track of monthly sales of a new candy bar – over the last six months it has sold 100, 120, 110, 130, 125 and 115 units
– The average method would forecast the next month's sales by calculating the average of these numbers, i.e. (100+120+110+130+125+115)/6 ≈ 116.67 units
– The method is therefore assuming that the underlying process is stable over time
– This might be reasonable when we have a stable time series with no strong trend or seasonal effects present
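A minimal sketch of the average method on the candy-bar numbers above (Python/NumPy is an assumption; the lecture does not prescribe a tool):

```python
import numpy as np

# Average method: forecast every future value with the historical mean.
sales = np.array([100, 120, 110, 130, 125, 115])   # last six months
forecast = sales.mean()                            # ŷ_{T+h|T} = ȳ for all h
print(round(forecast, 2))                          # 116.67
```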
Naïve method
– In the naïve forecast method we simply set all forecasts to be the value of the last observation, i.e., $\hat{y}_{T+h|T} = y_T$
– I.e. the naïve method simply assumes that the next value(s) will be exactly the same as the most recent observed value

Naïve method
– In the example before, where we had monthly sales in the last months being 100, 120, 110, 130, 125 and 115 units – we would simply take the last value, 115 – and predict that sales in the next month are also 115
– The logic is that if nothing drastic happens, the latest value is often a reasonable guess for the values in the near future
– Similar to the average method, it works best when there is no trend or seasonality
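A minimal sketch of the naïve method on the same numbers (Python/NumPy assumed):

```python
import numpy as np

# Naïve method: every forecast equals the last observed value.
sales = np.array([100, 120, 110, 130, 125, 115])
h = 3                                   # forecast horizon
forecast = np.full(h, sales[-1])        # ŷ_{T+h|T} = y_T for all h
print(forecast)                         # [115 115 115]
```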
Seasonal naïve method
– Same idea as in the naïve method but accounting for seasonality (useful when we have significant seasonality in the time series)
– In this method we set each forecast to be equal to the last observed value from the same season (e.g. the same month in the previous year)
– We can write this as $\hat{y}_{T+h|T} = y_{T+h-m(k+1)}$
– Where $m$ is the seasonal period, and $k$ is the integer part of $(h-1)/m$, i.e., the number of complete years in the forecast period prior to time $T+h$
– I.e. if we are looking at monthly sales, and we want to predict the sales in December 2025, our prediction would be the value of sales in December 2024
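A sketch of the seasonal naïve formula on hypothetical monthly data (Python/NumPy assumed; the helper `snaive` is illustrative, not a library function):

```python
import numpy as np

# Seasonal naïve: forecast with the last observed value from the same
# season. Two years of hypothetical monthly data (seasonal period m = 12).
y = np.array([100, 90, 95, 110, 120, 130, 150, 155, 125, 115, 105, 160,
              102, 93, 98, 112, 124, 133, 154, 158, 129, 118, 108, 165])
m = 12

def snaive(y, m, h):
    # ŷ_{T+h|T} = y_{T+h-m(k+1)}, where k is the integer part of (h-1)/m
    T = len(y)
    k = (h - 1) // m
    return y[T + h - m * (k + 1) - 1]   # -1 converts to 0-based indexing

print(snaive(y, m, h=1))    # 102: January forecast = last observed January
```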
Drift method
– How much the values change over time (i.e., the drift) is set to be the average change in the historical data
– I.e. we look at the difference between the first and the last observation, and then assume the time series will continue
– The forecast for time $T+h$ is then given by: $\hat{y}_{T+h|T} = y_T + h \cdot \frac{y_T - y_1}{T-1}$
– This is equivalent to drawing a line between the first and last observations, and extrapolating the line into the future
Drift method
– Simple example: let's say we have monthly sales over 10 months. The units sold in the first month = 100, the units sold in the 10th month = 150
– The drift per period is (150-100)/(10-1) ≈ 5.56 units
– To forecast sales in the 11th month, we add the drift to the value in month 10: i.e. 150 + 5.56 ≈ 155.56 units
– The predicted value for the 12th month would be 150 + 2*5.56 ≈ 161.11 units
Drift method
– Finding the drift is part of #estimating our model, i.e., we use the data that we have to find the relevant parameter
– The drift method can be useful when there is trend in the data, but no clear seasonal pattern
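A sketch of the drift method reproducing the example above (Python/NumPy assumed):

```python
import numpy as np

# Drift method: extrapolate the average historical change.
y = np.array([100, 105, 112, 118, 121, 130, 134, 141, 146, 150])  # 10 months
T = len(y)
drift = (y[-1] - y[0]) / (T - 1)         # (150 - 100) / 9 ≈ 5.56 per month

for h in (1, 2):
    forecast = y[-1] + h * drift         # ŷ_{T+h|T} = y_T + h(y_T - y_1)/(T-1)
    print(f"h={h}: {forecast:.2f}")      # h=1: 155.56, h=2: 161.11
```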
Comparing the methods
– We use these simple methods mainly as a benchmark
– Our aim in this class is to find methods that allow us to get better forecasts than these simple methods
– But sometimes these simple methods will be the best forecasting method available
– As you could see in the previous slides, the type of time series that you are working with may often tell you which of these simple methods will work the best
Fitted values and residuals
Fitted values
– We want to check how well the model fits the data
– Note that here we are not talking about forecasts, but the historical data
– So when we have specified our model, and estimated the relevant parameters, we look at the fitted values, i.e., the values that are predicted by the model
– We are using the data that we already have, to see whether our model does a good job fitting that data
Fitted values – naïve model
Fitted values – drift model
Fitted values and residuals
– Fitted values, $\hat{y}_t$, are our model's predictions for the values in our time series
– Residuals are the difference between the fitted values and actual values: $e_t = y_t - \hat{y}_t$, i.e. everything that is left after we fit the model
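A minimal sketch of fitted values and residuals for the naïve method (Python/NumPy assumed; the numbers reuse the candy-bar series):

```python
import numpy as np

# For the naïve method the fitted value at time t is the previous
# observation, so the residuals are just the period-to-period changes.
y = np.array([100.0, 120, 110, 130, 125, 115])
fitted = np.concatenate(([np.nan], y[:-1]))    # ŷ_t = y_{t-1}
residuals = y - fitted                         # e_t = y_t - ŷ_t
print(residuals)    # [nan  20. -10.  20.  -5. -10.]
```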
Fitted values and residuals (naïve method)
Fitted values and residuals
– Each observation contains information on student-to-teacher ratio (the x variable) and test score (the y variable)
– This is not a time series forecast, but we can use this example to look at fitted values and residuals
– The blue line shows the fitted values
– The distance between the blue line and what we actually observe (the dots) are the residuals
Residuals
– Ideally, the residuals should behave like random noise:
– They should have a mean of zero. This means that the model is not systematically overestimating or underestimating the data
– Variance should be constant – indicating that the model accuracy is not varying over the time series
– The residuals should be free from autocorrelation, i.e. they should not be correlated with each other over time. If there is autocorrelation in the residuals it suggests that there is some underlying structure in the data that our model has failed to capture
– To make statistical inference, it is desirable that the residuals are normally distributed
Properties of residuals
– The first best would be for the residuals to be white noise and to follow a normal distribution:
1. Residuals are uncorrelated
2. Residuals have zero mean
3. Residuals have constant variance
4. Residuals have a normal distribution
– In practice, we focus on the residuals being uncorrelated and having zero mean
– We can correct for non-constant variance, and the distribution only plays a role for the prediction intervals
Checking the residuals:
– Time series showing Google daily stock prices
– We apply the naïve method and then plot the residuals over time
– To check the distribution of the residuals we can look at a histogram
– We plot the ACF for the residuals to check for autocorrelation
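A sketch of these three diagnostics in Python (matplotlib and statsmodels are assumptions; a simulated random walk stands in for the Google stock price series on the slides):

```python
import matplotlib.pyplot as plt
import numpy as np
from statsmodels.graphics.tsaplots import plot_acf

# Residual diagnostics for the naïve method on a hypothetical price series.
rng = np.random.default_rng(0)
price = 100 + np.cumsum(rng.normal(0, 1, 250))   # stand-in for stock prices
residuals = np.diff(price)                       # naïve residuals: y_t - y_{t-1}

fig, axes = plt.subplots(3, 1, figsize=(7, 8))
axes[0].plot(residuals)                 # residuals over time: mean around 0?
axes[1].hist(residuals, bins=20)        # roughly normal?
plot_acf(residuals, ax=axes[2])         # any autocorrelation left?
plt.tight_layout()
plt.show()
```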
Model accuracy and forecast uncertainty
– The fitted values and the residuals give us an idea of how well our model captures the observed data
– This also tells us something about forecast uncertainty
– If the fitted values closely match the observed data, and the residuals are well-behaved, we have greater confidence in the model's ability to capture the underlying patterns
– However, there is always some unexplained variability
– There is also some risk that the model is "overfitting" the data
Model accuracy and forecast uncertainty
– In our forecasts this uncertainty is captured through prediction intervals
– The prediction interval provides a range around the point forecast within which future observations are likely to fall
– Essentially, the prediction intervals are constructed based on the behaviour of the residuals
– If the residuals are normally distributed, we can use their standard deviation to calculate intervals with a specific confidence level
Prediction intervals
– Prediction intervals tell us about the uncertainty in our forecasts
– If we only produce point forecasts – we are not able to evaluate how accurate the forecasts are
– The prediction intervals make it clear how much uncertainty is associated with each forecast
– Point forecasts have little value without accompanying prediction intervals
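A minimal sketch of such an interval, assuming normally distributed residuals: for the naïve method the h-step forecast standard deviation can be approximated by the residual standard deviation scaled by √h, so a 95% interval is the point forecast ± 1.96 times that (the numbers reuse the candy-bar series):

```python
import numpy as np

# 95% prediction interval for the naïve method, assuming normal residuals.
y = np.array([100.0, 120, 110, 130, 125, 115])
residuals = np.diff(y)                      # naïve residuals
sigma = residuals.std(ddof=1)               # estimated residual sd

for h in (1, 2, 3):
    point = y[-1]                           # naïve point forecast
    half_width = 1.96 * sigma * np.sqrt(h)  # intervals widen with the horizon
    print(f"h={h}: {point:.0f} ± {half_width:.1f}")
```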
Evaluating: How well does our forecasting method do?
– How well does it fit the data?
– How well does it predict what will happen?
Evaluating the accuracy of our forecast
– The fitted values and the residuals told us something about how well our model fits the data that we already have
– However, this only captures how the model performs on historical data – and does not necessarily indicate how the model will perform in the future
– Sometimes small residuals can be misleading, as they can be due to the model "overfitting" the data – meaning that it is capturing noise rather than the underlying patterns
Evaluating the accuracy of our forecast
– To check how reliable our forecast is, a common practice is to divide the available data into two parts: a training set and a test set
– The training set is used to build and estimate the model, to generate the fitted values and residuals
– The test set is kept aside and only used to evaluate the model's forecasting performance
Forecast errors: splitting the sample
– The idea is to forecast the values of the series we observe in the test set as if they were unknown
– I.e. we want to evaluate the accuracy of our method using data that was not used to fit our model
– Time series cross-validation means doing this many times by choosing different thresholds for splitting the sample each time
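A sketch of one-step-ahead time series cross-validation with the naïve method (Python/NumPy assumed; the data are hypothetical). Each pass extends the training set by one observation, an expanding-window scheme:

```python
import numpy as np

# Time series cross-validation: repeatedly split the series at a moving
# threshold, forecast one step ahead, and record the forecast error.
y = np.array([100.0, 120, 110, 130, 125, 115, 128, 122, 135, 131])

errors = []
for split in range(3, len(y)):          # require a minimal training set
    train, actual = y[:split], y[split]
    forecast = train[-1]                # naïve one-step-ahead forecast
    errors.append(actual - forecast)    # a forecast error, not a residual

print(np.mean(np.abs(errors)))          # mean absolute error across splits
```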
Forecast errors: splitting the sample & forecasting one period ahead (h=1)
Forecast errors: splitting the sample & forecasting four periods ahead (h=4)
Forecast errors
– A forecast error is the difference between the actual value and the forecasted value
– It measures how far off the prediction is from what actually happens
– E.g. we forecast that sales next month are 120 units. The actual sales turn out to be 130 units. The forecast error is 10 units.
– Note that a forecast error is not the same thing as a residual
– The residual tells us how well the model fits the training data, i.e., it is looking at data that was used to estimate our model
– The forecast error takes our model to new data – and looks at how well it does
How accurate is our forecast? – Forecast errors
– The forecast error (h-step ahead) is $e_{T+h} = y_{T+h} - \hat{y}_{T+h|T}$
– Forecast errors are not the same as residuals
– In general:
– We use the training set to estimate
– We use the test set to evaluate
– We use forecast errors to evaluate forecast accuracy
Forecast errors & measures of forecast accuracy
– Because positive and negative errors partly cancel each other out, we want to create measures based on the absolute values of the forecast errors
– Because the size of the error terms depends on the scale of our outcome variable, we want to scale them to get a consistent way of measuring forecast accuracy
– E.g.,
– Percentage error
– Mean absolute scaled error (MASE)
– Root mean squared scaled error (RMSSE)
– We can use these different measures to compare the performance of our forecasting models
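A sketch computing some common accuracy measures on a hypothetical train/test split (Python/NumPy assumed; MASE here is scaled by the in-sample MAE of the non-seasonal naïve method, which is the usual scaling choice):

```python
import numpy as np

# Accuracy measures on a held-out test set (hypothetical numbers).
actual   = np.array([130.0, 128, 135, 131])
forecast = np.array([120.0, 125, 129, 134])
train    = np.array([100.0, 120, 110, 130, 125, 115])   # for MASE scaling

e = actual - forecast
mae  = np.mean(np.abs(e))                     # mean absolute error
rmse = np.sqrt(np.mean(e**2))                 # root mean squared error
mape = np.mean(np.abs(e / actual)) * 100      # mean absolute percentage error

# MASE: scale by the in-sample MAE of the one-step naïve method
scale = np.mean(np.abs(np.diff(train)))
mase = mae / scale
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.1f}%  MASE={mase:.2f}")
```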
A short discussion of qualitative methods (Chapter 6)
Judgmental forecasting
• Useful when:
• We do not have any historical data: Occurs when historical data is absent, e.g., new product launches, market entries, or unprecedented policy changes.
• When our data is incomplete: Situations where data are incomplete or delayed
• Example: How do we forecast the future sales of the Apple Vision Pro?
Judgemental forecasting
• Key Factors for Accuracy:
• Domain Knowledge: Forecasters benefit from expertise.
• Timely Information: Up-to-date information enhances accuracy.
• Structured Approaches:
• Systematic Implementation: Quality of judgmental forecasts improves through well-structured and systematic approaches.
• Limitations: Acknowledging subjectivity and limitations of judgmental forecasting.
• Three General Settings for Judgmental Forecasting:
• No Available Data: Judgmental forecasting is the sole approach.
• Data Available with Statistical Forecasts: Adjusting statistical forecasts using judgment.
• Data Available with Independent Forecasts: Generating independent statistical and judgmental forecasts, followed by combination.
• NOTE: Statistical methods are preferable when data are available, serving as a starting point for more accurate forecasts.
Example: the Delphi method
– The Delphi process involves several key stages:
1. assembling a panel of diverse experts
2. setting and distributing forecasting tasks
3. collecting initial forecasts and justifications
4. providing feedback to the experts
5. iterating through the process until a satisfactory level of consensus is achieved.
Example: the Delphi method
– One distinctive feature of the Delphi method is the emphasis on maintaining the anonymity of participating experts throughout the process.
– This ensures that individual judgments are not influenced by political or social pressures, or personal dynamics within the group.
– An appointed facilitator plays a crucial role in orchestrating the Delphi process, responsible for designing and administering the method, providing feedback to experts, and generating final forecasts.
– The Delphi method offers a valuable approach to judgmental forecasting, particularly in situations where historical data is lacking and insights from a diverse group of experts are essential for informed decision-making.
Discussion
– Form an expert panel with 2-3 of your classmates
– Your panel is supposed to predict the salary for university graduates over the next 25 years. You want to take into account how AI may reshape the labor market and impact future productivity and wages.
– What kind of information and expertise could help in your assessment? How could you combine judgement and data to improve your forecast? What are your own thoughts and expectations on how this may impact future wages? Which mechanisms may be important?
Next week – We discuss exponential smoothing (Chapter 8)
