MIS BA Solution Chapter03

UCD Business Analytics - Practical Sheet Solution

Miguel Nicolau

Chapter 3: Linear Regression

Exercise 1
Table 1: Employees and sales in a small sample of companies.

employees   sales (thousands of Euros)
1           15
4           25
5           100
7           120

1. For the data shown in Table 1, calculate a linear regression model of the form y = a + bx, using
employees as the predictor and sales as the response. Apply the Least Squares method, using the
formulas below.

$$b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2}$$

$$a = \bar{y} - b\bar{x}$$

Solution
This exercise instructs us to use employees as the predictor variable (x), and sales as the response
variable (y). So in order to apply the Least Squares equations to obtain the a and b coefficients, we
need to calculate some required values:

• $n$ (number of samples): 4;
• $\sum xy$ (sum of each x value multiplied by the corresponding y value): 1×15 + 4×25 + 5×100 + 7×120 = 1455;
• $\sum x$ (sum of all x values): 1 + 4 + 5 + 7 = 17;
• $\sum y$ (sum of all y values): 15 + 25 + 100 + 120 = 260;
• $\sum x^2$ (sum of each x value squared): $1^2 + 4^2 + 5^2 + 7^2 = 91$;
• $\left(\sum x\right)^2$ (sum of all x values, squared): $17^2 = 289$;
• $\bar{x}$ (average of all x values): 17/4 = 4.25;
• $\bar{y}$ (average of all y values): 260/4 = 65.

So the slope will be:

$$b = \frac{4 \times 1455 - 17 \times 260}{4 \times 91 - 289} = \frac{5820 - 4420}{364 - 289} = \frac{1400}{75} = 18.667$$

And the intercept will be:

$$a = 65 - 18.667 \times 4.25 = 65 - 79.335 = -14.335$$
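
The calculation above can be checked with a short script. This is only a sketch: the function name `least_squares` and the variable names are illustrative and do not come from the practical sheet. Because the script keeps full floating-point precision for b, it gives an intercept of about -14.333, whereas the worked solution rounds b to 18.667 first and so arrives at -14.335.

```python
# Table 1 data: employees (x) and sales in thousands of Euros (y).
xs = [1, 4, 5, 7]
ys = [15, 25, 100, 120]


def least_squares(xs, ys):
    """Return (a, b) for y = a + b*x using the sum-based formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean(y) - b * mean(x)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = least_squares(xs, ys)
print(f"b = {b:.3f}, a = {a:.3f}")
```

The difference in the third decimal place of a is purely a rounding effect; both versions describe the same fitted line.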

2. Calculate the predictions of the model for each of the data points of the training set (i.e. x = 1, 4, 5, 7).

Solution
f(1) = −14.335 + 18.667 × 1 = 4.332
f(4) = −14.335 + 18.667 × 4 = 60.333
f(5) = −14.335 + 18.667 × 5 = 79
f(7) = −14.335 + 18.667 × 7 = 116.334
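
The same four predictions can be reproduced programmatically. The sketch below assumes the rounded coefficients reported in the solution (a = -14.335, b = 18.667); the names `f`, `train_x` and `predictions` are illustrative.

```python
# Rounded coefficients as reported in the worked solution.
a, b = -14.335, 18.667


def f(x):
    """Predicted sales (thousands of Euros) for a company with x employees."""
    return a + b * x


train_x = [1, 4, 5, 7]
# Round to three decimals to match the precision used in the sheet.
predictions = [round(f(x), 3) for x in train_x]
print(predictions)
```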
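
3. Calculate the train RMSE and $R^2$, using the formulas below.

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left(y_i - (a + b x_i)\right)^2}{n}}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - (a + b x_i)\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$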

Solution
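$$
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{(15 - 4.332)^2 + (25 - 60.333)^2 + (100 - 79)^2 + (120 - 116.334)^2}{4}} \\
&= \sqrt{\frac{10.668^2 + (-35.333)^2 + 21^2 + 3.666^2}{4}} \\
&= \sqrt{\frac{113.806 + 1248.421 + 441 + 13.440}{4}} \\
&= \sqrt{\frac{1816.667}{4}} \\
&= \sqrt{454.167} \\
&= 21.311
\end{aligned}
$$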

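$$
\begin{aligned}
R^2 &= 1 - \frac{1816.667}{(15 - 65)^2 + (25 - 65)^2 + (100 - 65)^2 + (120 - 65)^2} \\
&= 1 - \frac{1816.667}{(-50)^2 + (-40)^2 + 35^2 + 55^2} \\
&= 1 - \frac{1816.667}{2500 + 1600 + 1225 + 3025} \\
&= 1 - \frac{1816.667}{8350} \\
&= 1 - 0.218 \\
&= 0.782
\end{aligned}
$$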
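
As a cross-check on the arithmetic, the two formulas can be coded directly. This is a minimal sketch assuming the rounded coefficients from the solution; the function names `rmse` and `r_squared` are illustrative and not part of the sheet.

```python
import math

# Rounded coefficients as reported in the worked solution.
a, b = -14.335, 18.667

# Table 1 (train) data.
train_x = [1, 4, 5, 7]
train_y = [15, 25, 100, 120]


def rmse(xs, ys):
    """Root mean squared error of f(x) = a + b*x on (xs, ys)."""
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / len(ys))


def r_squared(xs, ys):
    """1 - (sum of squared residuals) / (sum of squared deviations from mean(y))."""
    y_mean = sum(ys) / len(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_mean) ** 2 for y in ys)
    return 1 - sse / sst


train_rmse = rmse(train_x, train_y)
train_r2 = r_squared(train_x, train_y)
print(f"train RMSE = {train_rmse:.3f}, train R^2 = {train_r2:.3f}")
```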

4. Table 2 shows available test data. Use it to calculate test RMSE and $R^2$ values. Which would you
typically expect to be a larger value: train RMSE or test RMSE? What about train vs. test $R^2$?
Table 2: Test data for employees and sales

employees   sales (thousands of Euros)
3           26
10          135

Solution
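$$
\begin{aligned}
\mathrm{RMSE} &= \sqrt{\frac{(26 - f(3))^2 + (135 - f(10))^2}{2}} \\
&= \sqrt{\frac{(26 - 41.666)^2 + (135 - 172.335)^2}{2}} \\
&= \sqrt{\frac{(-15.666)^2 + (-37.335)^2}{2}} \\
&= \sqrt{\frac{245.424 + 1393.902}{2}} \\
&= \sqrt{\frac{1639.326}{2}} \\
&= \sqrt{819.663} \\
&= 28.63
\end{aligned}
$$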

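$$
\begin{aligned}
R^2 &= 1 - \frac{1639.326}{(26 - 80.5)^2 + (135 - 80.5)^2} \\
&= 1 - \frac{1639.326}{(-54.5)^2 + 54.5^2} \\
&= 1 - \frac{1639.326}{2970.25 + 2970.25} \\
&= 1 - \frac{1639.326}{5940.5} \\
&= 1 - 0.276 \\
&= 0.724
\end{aligned}
$$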

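Typically you would expect the train RMSE to be smaller than the test RMSE: the Least Squares method
chooses a and b to minimise the squared error on the train data, and RMSE is an error measure (lower
is better). That is the case here (21.311 on train vs. 28.63 on test).
Likewise, you would expect the train $R^2$ to be higher than the test $R^2$, because the coefficients were
fitted to explain the variance of y in the train dataset, not in the test dataset. This also holds here
(0.782 on train vs. 0.724 on test).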
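
The train and test metrics can be compared side by side with one helper. The sketch below assumes the rounded coefficients from the solution, and computes the test $R^2$ against the mean of the test y values (80.5), as the worked solution does; the name `evaluate` is illustrative.

```python
import math

# Rounded coefficients as reported in the worked solution.
a, b = -14.335, 18.667


def evaluate(xs, ys):
    """Return (RMSE, R^2) of f(x) = a + b*x on the given data."""
    y_mean = sum(ys) / len(ys)  # mean of the dataset being evaluated
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_mean) ** 2 for y in ys)
    return math.sqrt(sse / len(ys)), 1 - sse / sst


train_rmse, train_r2 = evaluate([1, 4, 5, 7], [15, 25, 100, 120])  # Table 1
test_rmse, test_r2 = evaluate([3, 10], [26, 135])                  # Table 2

print(f"train: RMSE = {train_rmse:.3f}, R^2 = {train_r2:.3f}")
print(f"test:  RMSE = {test_rmse:.3f}, R^2 = {test_r2:.3f}")
```

With only two test observations these test figures are very noisy, so the comparison illustrates the typical pattern rather than proving it.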

5. For each of the points in Table 2, is that an in-sample or out-of-sample point?

Solution
In-sample data is the data that was used to train the model. Table 2 does not contain any data from
the training set (i.e. from Table 1), therefore both observations are out-of-sample data.

6. For each of your predictions for Table 2, is it an interpolation or extrapolation?

Solution
Interpolation means making a prediction for an x value within the range of x values used to train the
model; extrapolation means making a prediction for an x value outside that range. The range of x
values used to train the model was [1, 7]; this means that a prediction for x = 3 is an interpolation,
whereas a prediction for x = 10 is an extrapolation.
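
Both classifications can be expressed as simple checks against the training set. This is an illustrative sketch only: the helper name `classify` and the label strings are assumptions made for this example.

```python
# Table 1 (train) and Table 2 (test) as (employees, sales) pairs.
train = [(1, 15), (4, 25), (5, 100), (7, 120)]
test = [(3, 26), (10, 135)]

train_x = [x for x, _ in train]
x_min, x_max = min(train_x), max(train_x)  # training range of x: [1, 7]


def classify(point):
    """Label a point as in-/out-of-sample and interpolation/extrapolation."""
    # In-sample: the observation itself was part of the training data.
    sample = "in-sample" if point in train else "out-of-sample"
    # Interpolation: x lies within the range of x values seen in training.
    x = point[0]
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return sample, kind


labels = {point: classify(point) for point in test}
for point, (sample, kind) in labels.items():
    print(point, sample, kind)
```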
