
Regression Stat Assignment

Uploaded by

Pritom Das


#Statistics Assignment: Generating Regression Model

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import linregress
from scipy.optimize import curve_fit

Here are all the imports for the regression model:

• numpy for numerical computation
• matplotlib for plotting the data and the model
• scipy.stats for the straight-line (linear) regression
• scipy.optimize for the curved (nonlinear) regression

The dataset comes from http://www.statsci.org/data/general/brunhild.html. It records the
concentration of a sulfate in the blood of a baboon named Brunhilda as a function of time.
The data table is presented here:

Hours Sulfate
2 15.11
4 11.36
6 9.77
8 9.09
10 8.48
15 7.69
20 7.33
25 7.06
30 6.7
40 6.43
50 6.16
60 5.99
70 5.77
80 5.64
90 5.39
110 5.09
130 4.87
150 4.6
160 4.5
170 4.36
180 4.27

Let's represent the data table as two numpy arrays for the calculations that follow:

hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
90, 110, 130, 150, 160, 170, 180])
sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09, 4.87, 4.6, 4.5, 4.36,
4.27])

#Task 1: Prepare a plot showing:

1. the data points and
2. the regression line in log-log coordinates.

To plot the data points in log-log coordinates, we first pass the hours and sulfate arrays to
the numpy.log() function, which returns the natural logarithm of each data point.

log_hours = np.log(hours)
log_sulfate = np.log(sulfate)

Now we plot the log-log data points.

plt.scatter(log_hours, log_sulfate)
plt.xlabel('Log(Hours)')
plt.ylabel('Log(Sulfate)')
plt.title('Log-Log Plot of Sulfate Concentration vs. Time')
plt.grid(True)

For the regression line we use the linregress() function from scipy.stats. It returns the
slope and the y-intercept of the fitted line, followed by the correlation coefficient, the
p-value and the standard error, which are discarded here.

slope, intercept, _, _, _ = linregress(log_hours, log_sulfate)
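A straight line in log-log coordinates corresponds to a power law in the original coordinates: log(y) = m * log(x) + c is the same as y = e^c * x^m. The following is a minimal sketch of that back-transformation, assuming the same data as above. The names `fit` and `power_law_prediction` are illustrative and not part of the original assignment.

```python
import numpy as np
from scipy.stats import linregress

hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
                  90, 110, 130, 150, 160, 170, 180])
sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
                    6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09, 4.87,
                    4.6, 4.5, 4.36, 4.27])

# Fit a straight line to the log-transformed data.
fit = linregress(np.log(hours), np.log(sulfate))

# Back-transform the line into the original coordinates:
# log(y) = slope * log(x) + intercept  <=>  y = exp(intercept) * x ** slope
power_law_prediction = np.exp(fit.intercept) * hours ** fit.slope

print(f"slope = {fit.slope:.4f}, intercept = {fit.intercept:.4f}")
print(f"R^2 in log-log coordinates = {fit.rvalue ** 2:.4f}")
```

Accessing the result by attribute (`fit.slope`, `fit.rvalue`) is an alternative to tuple unpacking and keeps the correlation coefficient available for later checks.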

Then plot the straight line over the data points.

plt.scatter(log_hours, log_sulfate, label='Data Points')
plt.plot(log_hours, slope * log_hours + intercept, color='red',
         label='Regression Line')
plt.xlabel('Log(Hours)')
plt.ylabel('Log(Sulfate)')
plt.title('Log-Log Plot of Sulfate Concentration vs. Time')
plt.grid(True)
plt.legend()  # display the labels defined above
plt.show()

#Task 2: Prepare a plot showing:

1. the data points and
2. the regression curve in the original coordinates.

First, plot the data points as is:

plt.scatter(hours, sulfate, label='Data Points')
plt.xlabel('Hours')
plt.ylabel('Sulfate')
plt.title('Plot of Sulfate Concentration vs. Time')
plt.grid(True)

For the regression curve we use the curve_fit() function from scipy.optimize. It requires us
to assume the form of the curve in advance, e.g. sine, tangent, straight line (y = mx + c) or
exponential.

In this dataset the points resemble an exponential decay, so we assume the function to be

$$y = a e^{-b x} + c$$

This can be written in Python as:

def demo_exp_function(x, a, b, c):
    # Exponential decay with amplitude a, rate b and asymptote c
    return a * np.exp(-b * x) + c

Now we pass this function and the data points to the curve_fit() function. It returns the
fitted constants (in this case a, b and c) together with their covariance matrix, which is
discarded here.

constants, _ = curve_fit(demo_exp_function, hours, sulfate)
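When no starting point is supplied, curve_fit() begins with every parameter set to 1, which may be far from the scale of the data. The following sketch passes an explicit initial guess through the `p0` argument and reads approximate standard errors off the covariance matrix. The values in `initial_guess` are rough assumptions read from the data table, not values from the original assignment.

```python
import numpy as np
from scipy.optimize import curve_fit

hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
                  90, 110, 130, 150, 160, 170, 180])
sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
                    6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09, 4.87,
                    4.6, 4.5, 4.36, 4.27])


def demo_exp_function(x, a, b, c):
    # Exponential decay with amplitude a, rate b and asymptote c
    return a * np.exp(-b * x) + c


# Assumed starting values: amplitude ~10, decay rate ~0.1 per hour, asymptote ~5.
initial_guess = [10.0, 0.1, 5.0]

constants, covariance = curve_fit(demo_exp_function, hours, sulfate,
                                  p0=initial_guess, maxfev=10000)

# The square roots of the diagonal give approximate standard errors.
standard_errors = np.sqrt(np.diag(covariance))

for name, value, error in zip("abc", constants, standard_errors):
    print(f"{name} = {value:.4f} +/- {error:.4f}")
```

A different starting point can lead the optimizer to a different local solution, so it is worth checking that the fitted constants are plausible for the data.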

Plotting the regression curve:

plt.scatter(hours, sulfate, label='Data Points')
plt.plot(hours, demo_exp_function(hours, *constants), color='red',
         label='Regression Curve')
plt.xlabel('Hours')
plt.ylabel('Sulfate')
plt.title('Plot of Sulfate Concentration vs. Time')
plt.grid(True)
plt.legend()  # display the labels defined above
plt.show()

#Task 3: Plot the residual against the fitted values in log-log and in original coordinates.

The residual is the difference between the observed value of the dependent variable (in this case,
the sulfate concentration) and the value predicted by the regression model. In other words, it
represents the error or deviation of each data point from the fitted regression line or curve.

regression_line = slope * log_hours + intercept
residual = log_sulfate - regression_line

Now we plot residual vs fitted values:

plt.scatter(regression_line, residual)
plt.xlabel('Fitted Values (Log)')
plt.ylabel('Residual (Log)')
plt.title('Plot 5: Residual vs Fitted Values (Log-Log)')
plt.grid(True)
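For a least-squares straight line with an intercept, the residuals sum to zero and are uncorrelated with the fitted values by construction. A residual plot therefore cannot show a linear trend; the things to look for are curvature or a changing spread. The following sketch checks both properties numerically, assuming the same data as above; the variable name `fitted` is illustrative.

```python
import numpy as np
from scipy.stats import linregress

hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
                  90, 110, 130, 150, 160, 170, 180])
sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
                    6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09, 4.87,
                    4.6, 4.5, 4.36, 4.27])

log_hours = np.log(hours)
log_sulfate = np.log(sulfate)

slope, intercept, _, _, _ = linregress(log_hours, log_sulfate)
fitted = slope * log_hours + intercept
residual = log_sulfate - fitted

# Both quantities should be zero up to floating-point rounding.
print("sum of residuals:", residual.sum())
print("correlation with fitted values:", np.corrcoef(fitted, residual)[0, 1])
```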

For the original coordinates the procedure is the same.

regression_curve = demo_exp_function(hours, *constants)

residual_original_data_points = sulfate - regression_curve

Then we plot the residuals against the fitted values in the original coordinates:

plt.scatter(regression_curve, residual_original_data_points)
plt.xlabel('Fitted Values (Original)')
plt.ylabel('Residual (Original)')
plt.title('Plot 6: Residual vs Fitted Values (Original)')
plt.grid(True)

#Task 4: Use your plots to explain whether your regression is good or bad and why.

Plot 5 shows the residuals of the regression line we calculated earlier. Whether a regression
is good or bad is judged from its residuals.

If the regression model is a good fit, we expect the residuals to be randomly scattered
around zero, indicating that the model captures the variation in the data well. In plot 5 the
residuals are close to zero and well distributed, so this model is a very GOOD model.

Plot 6 shows the residuals of the regression curve fitted to the original data points, and the
same criterion applies.

In plot 6 the residuals are not close to zero, and they cluster where the fitted values lie
between about 4 and 6. This is not a good model. In fact it is pretty BAD at predicting
future values.
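The residuals in plot 5 are in log units while those in plot 6 are in the original units, so their magnitudes are not directly comparable. One way to put both models on the same scale is to back-transform the log-log line and compute the root-mean-square error (RMSE) of each model in the original units. The following is a sketch under stated assumptions: the `rmse` helper and the initial guess passed to curve_fit() are illustrative and not part of the original assignment.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import linregress

hours = np.array([2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80,
                  90, 110, 130, 150, 160, 170, 180])
sulfate = np.array([15.11, 11.36, 9.77, 9.09, 8.48, 7.69, 7.33, 7.06,
                    6.7, 6.43, 6.16, 5.99, 5.77, 5.64, 5.39, 5.09, 4.87,
                    4.6, 4.5, 4.36, 4.27])


def demo_exp_function(x, a, b, c):
    # Exponential decay with amplitude a, rate b and asymptote c
    return a * np.exp(-b * x) + c


def rmse(observed, predicted):
    # Root-mean-square error between observations and predictions
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


# Log-log line, back-transformed to y = exp(intercept) * x ** slope.
slope, intercept, _, _, _ = linregress(np.log(hours), np.log(sulfate))
power_law_fit = np.exp(intercept) * hours ** slope

# Exponential curve fitted in the original coordinates (assumed initial guess).
constants, _ = curve_fit(demo_exp_function, hours, sulfate,
                         p0=[10.0, 0.1, 5.0], maxfev=10000)
exponential_fit = demo_exp_function(hours, *constants)

rmse_power_law = rmse(sulfate, power_law_fit)
rmse_exponential = rmse(sulfate, exponential_fit)

print(f"RMSE, log-log line (original units):  {rmse_power_law:.4f}")
print(f"RMSE, exponential curve:              {rmse_exponential:.4f}")
```

A lower RMSE alone does not settle which model is better; it complements the visual check of the residual plots for patterns such as curvature.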
