
Medical Statistics

Topic: Simple Linear Regression Analysis

Lecture by
Dr. Chindo Ibrahim Bisallah
MB.BS, MPH, MPA, PhD

Department of Community Medicine, Faculty of Health Sciences
Ibrahim Badamasi Babangida University, Lapai
REGRESSION ANALYSIS
Introduction
• Regression analysis is a statistical method used to establish a relationship between two or more variables.
• It can also be defined as a statistical method to model the relationship between a dependent variable and one or more independent variables.
• Purpose: prediction, inference, and understanding relationships.
Historical Background
Regression analysis has evolved significantly over time:
• 19th Century:
Francis Galton pioneered early correlation studies for understanding relationships between variables. He began the exploration of statistical relationships in heredity and human characteristics.
• Early 20th Century:
Karl Pearson and other statisticians formalized the concepts of correlation
and regression, establishing methods that are still in use today.
• Late 20th Century to Present:
With advances in computing power, regression methods have expanded
into complex multivariate and non-linear models, integrating into modern
machine learning techniques.
Basic terminology
• Dependent Variable (Outcome) -The outcome or response
variable that the model seeks to predict or explain
• Independent Variable (Predictor)- The predictor or
explanatory variable used to forecast or explain the
dependent variable
• Coefficient - A numerical value that quantifies the
relationship between an independent variable and the
dependent variable; it indicates the magnitude and direction
of the effect
• Intercept - The expected value of the dependent variable
when all independent variables are zero; essentially, it is the
starting point of the regression line.
Types of regression analysis
1. Linear Regression
Simple Linear Regression
Models the relationship between a single independent variable and a
dependent variable using a straight line.
Multiple Linear Regression
Extends simple linear regression to include multiple independent
variables.
2. Logistic Regression
Used when the dependent variable is categorical (often binary, e.g. smoker or non-smoker, disease or no disease, dead or alive). It estimates the probability of a certain class or event. Logistic regression is useful in analysing binary outcomes and identifying factors that influence the probability of an event occurring.
3. Polynomial Regression
A form of linear regression where the relationship between the independent
variable(s) and the dependent variable is modeled as an nth degree polynomial
4. Regularized Regression
Ridge Regression and Lasso Regression
Correlation vs Regression
1. The correlation coefficient is independent of the units of measurement.
2. The regression coefficients (slope and intercept) will change as the units of measurement change.
3. Furthermore, the regression of X on Y is not the same as the regression of Y on X: which variable is dependent and which is independent matters.
4. In contrast, the correlation of Y with X is the same as the correlation of X with Y.
5. The main difference is that regression looks at the change in one variable (the response, outcome, or dependent variable) that corresponds to a given change in the other (the explanatory, predictor, or independent variable).
6. The objective is to predict or estimate the value of the response associated with a fixed value of the explanatory variable.
7. Correlation analysis does not distinguish between the two variables.


Simple vs Multiple Linear Regression

• A linear regression model attempts to explain the relationship between two or more variables using a straight line.
• Simple linear regression seeks to predict an outcome (dependent) variable from a single independent variable.
• Multiple linear regression seeks to predict one outcome (dependent) variable from several independent variables.
• In both cases we are examining the dependence of one variable (the dependent variable) on the independent variable(s).
Simple vs Multiple Linear Regression (contd.)

• Two important steps in regression analysis involve examining a scatter plot of the two variables and calculating the correlation coefficient.
• The relationship is summarized by a regression equation consisting of a slope and an intercept.
• The equation for the regression line in simple regression is y = α + βx,
where α = intercept on the Y axis when x = 0
and β = slope (regression coefficient).
The slope represents the amount the dependent variable increases with each one-unit increase in the independent variable.
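This reading of the slope can be illustrated with a short numeric sketch. The values of α and β below are hypothetical, chosen only to show that the predicted y rises by exactly β for each one-unit increase in x:

```python
# Reading the regression line y = alpha + beta*x.
# alpha and beta here are hypothetical illustration values, not from the slides.

alpha = 92.8   # intercept: predicted y when x = 0 (hypothetical)
beta = 0.97    # slope: change in y per one-unit change in x (hypothetical)

def predict(x):
    """Predicted value of y on the regression line."""
    return alpha + beta * x

y40 = predict(40)
y41 = predict(41)
print(y40, y41, y41 - y40)  # the last value is the slope beta
```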
What to Expect in Regression Analysis
Derive Regression / Prediction Equation

Hypothesis Testing

Regression Slope
What to Expect
• Derive the regression / prediction equation
Comparable to the equation of a straight line:
y = α + βx
[Figure: a regression line plotted on X–Y axes; the line crosses the Y axis at the intercept α, and its slope is β = ΔY/ΔX]
The prediction equation is of the form y = α + βx, where
α is the intercept of the line, or the mean value of the response y when x is equal to 0, and
β is the slope, or the change in y that corresponds to a one-unit change in x.
Interpret the prediction equation
• The model represents a straight line
• The line y = α + βx, is called the regression line
• The parameters α and β are constants and are the coefficients
of the equation
• α is the intercept of the line, or the mean value of the response y when x is equal to 0
• β is the slope, or the change in y that corresponds to a one-unit change in x
• If β is positive, then the expected value of y increases as x increases
• If β is negative, the expected value of y decreases as x increases
Simple linear regression
• Simple linear regression is a method used to model the relationship
between two continuous variables—one dependent (response) variable
and one independent (predictor) variable.
The prediction equation: y = β₀ + β₁x + ε
• where:
• y: Dependent variable (the outcome you're trying to predict).
• x: Independent variable (the predictor).
• β₀ (Intercept): The expected value of y when x is 0.
• β₁ (Slope): The change in y for a one-unit change in x.
• ε: The error term accounting for the difference between the observed and
predicted values.
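The least-squares estimates of β₀ and β₁ can be computed directly from the data. A minimal sketch in plain Python (the function name and the toy data are our own illustration, not from the slides):

```python
# Least-squares fit for simple linear regression, y = b0 + b1*x.
# Formulas: b1 = Sxy / Sxx, b0 = mean(y) - b1 * mean(x).

def fit_simple_linear(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)                      # sum of squares of x
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))   # cross-products
    b1 = sxy / sxx              # slope
    b0 = my - b1 * mx           # intercept
    return b0, b1

# Toy data lying exactly on y = 2 + 3x, so the fit should recover those values.
x = [1, 2, 3, 4, 5]
y = [5, 8, 11, 14, 17]
b0, b1 = fit_simple_linear(x, y)
print(b0, b1)  # intercept 2, slope 3
```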
Assumptions of Simple Linear Regression
1. Continuous data (data is interval or ratio).
2. The relationship between the two variables is linear, meaning the
data points on a scatterplot should generally form a straight line
3. No Significant Outliers in your data
4. The data you are analyzing needs to be normally distributed
5. Homoscedasticity (Equal Variances). The spread (variance) of one
variable should be roughly the same across the values of the other
variable
Evaluation of the Regression Model
• Now that the least squares regression line has been
determined, how well does the model actually fit the
observed data?
• One way to evaluate the fit of a model is to compute
the coefficient of determination
• R2 is the square of the Pearson correlation
coefficient
• R2 can be interpreted as the proportion of the
variability among the observed values of y
that is explained by the linear regression of y
on x
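Both views of R² can be checked numerically: as the square of the Pearson correlation coefficient, and as the proportion of total variability explained. A small sketch (function names and toy data are illustrative):

```python
# R^2 two equivalent ways for simple linear regression:
# (1) square of the Pearson correlation r, and (2) 1 - SSE/SST.

from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_squared(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    sst = sum((yi - my) ** 2 for yi in y)                          # total
    return 1 - sse / sst

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pearson_r(x, y) ** 2, r_squared(x, y))  # the two values agree
```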
Components of Simple Linear Regression

1. Descriptive Component
• Regression equation: Ŷ = b₀ + b₁X₁
• Correlation coefficient: r
• Coefficient of determination: r²
2. Inferential Component (hypothesis testing)
- Regression model
- Slope
where b₀ = y-intercept and b₁ = slope
HYPOTHESIS TESTING

1. Regression Model (F ratio)

Source       SS    df     MS    F
Regression   SSR   p      MSR   MSR/MSE
Error        SSE   n-p-1  MSE
Total        SST   n-1
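The quantities in the ANOVA table can be computed from a fitted model. For simple regression, p = 1; a sketch (names and toy data are illustrative):

```python
# ANOVA quantities for simple linear regression (p = 1 predictor):
# SST = SSR + SSE, MSR = SSR/p, MSE = SSE/(n-p-1), F = MSR/MSE.

def anova_table(x, y):
    n, p = len(x), 1
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    yhat = [b0 + b1 * xi for xi in x]
    ssr = sum((yh - my) ** 2 for yh in yhat)                   # regression SS
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))       # error SS
    sst = sum((yi - my) ** 2 for yi in y)                      # total SS
    msr = ssr / p
    mse = sse / (n - p - 1)
    return ssr, sse, sst, msr / mse                            # F = MSR/MSE

ssr, sse, sst, f = anova_table([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(f)  # a large F suggests the regression explains real variation
```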

2. Slope t test
Regression Model
This is testing whether there's a relationship between the dependent
variable (Y) and an independent variable (X₁)
• Null hypothesis (H₀): Y = β₀ + eᵢ
• Alternative hypothesis (Hₐ): Y = β₀ + β₁X₁ + eᵢ
The alternative assumes that X₁ does influence Y, as indicated by the slope coefficient β₁.

This illustrates how hypothesis testing is used in regression to determine whether an independent variable (X₁) significantly predicts the dependent variable (Y), through analysis of the slope β₁.
Slope
This is a more specific test of whether the slope β₁ (the effect of X₁ on Y) is significantly different from zero.

• Null hypothesis (H₀): β₁ = 0 → no relationship between X₁ and Y.
• Alternative hypotheses (Hₐ):
β₁ ≠ 0: there is some relationship
β₁ > 0: a positive relationship
β₁ < 0: a negative relationship
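The slope test can be sketched numerically: the t statistic is the estimated slope divided by its standard error, with n − 2 degrees of freedom (names and toy data are illustrative):

```python
# t test for the slope: t = b1 / SE(b1), where SE(b1) = sqrt(MSE / Sxx);
# compare |t| against a t distribution with n - 2 degrees of freedom.

from math import sqrt

def slope_t(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)            # residual variance estimate
    se_b1 = sqrt(mse / sxx)        # standard error of the slope
    return b1 / se_b1              # t statistic, df = n - 2

t = slope_t([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(t)  # compare |t| with the critical t value at n - 2 df
```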
r = correlation coefficient
r² = coefficient of determination
r² = SS explained / SS total

The coefficient of determination r² is the proportion of the total variation that can be accounted for by the linear relationship between x and y.

r² = ... × 100 = ... %
Important steps in regression analysis using SPSS:
a. Conducting a Bivariate Simple Linear Regression Analysis
Select the Analyze menu, click Regression, then click Linear. Select systolic blood pressure, then click the arrow to move it to the Dependent box. Click weight, then click the arrow to move it to the Independent(s) box. Click Statistics, then click Descriptives. Make sure that Estimates and Model Fit are also selected. Click Continue, and then OK.

b. Scatterplot with Regression Line

The result of the regression analysis can be summarized using a scatterplot with a regression line.
Click Graph, then click Scatter. Click Simple, then click Define. Click systolic blood pressure and click the arrow to move it to the Y axis box. Click weight and click the arrow to move it to the X axis box. Click OK.
Once you have created a scatterplot showing the relationship between weight and systolic blood pressure, you can add a regression line by following these steps:
Double-click on the chart to select it for editing. Click Elements, then Fit Line at Total to open the Properties box, and select Linear.
c. Assumptions for the linear regression model – residual analysis

For the linear regression model to be valid, there are three assumptions to be checked on the residuals:
No outliers.
The data points must be independent.
The distribution of the residuals should be normal with mean = 0 and a constant variance.

i) Checking outliers
Select the Analyze menu, click Regression, then click Linear. Select systolic blood pressure, then click the arrow to move it to the Dependent box. Click weight, then click the arrow to move it to the Independent(s) box. Click Statistics, then tick the Casewise Diagnostics box...
Our interest is in the Standardised Residuals; make sure that the minimum and maximum values do not exceed ±3.
ii) Checking independence
Run the analyses as above, and tick the Durbin-Watson box after clicking Statistics.
iii) Checking the normality assumption of the residuals
Run the analyses as in (i), click on the Plots tab, and tick Histogram and Normal probability plot to check the normality assumption of the residuals.
iv) Checking for constant variance
Run the analyses as in (i), click on the Plots tab, and select *ZRESID (Regression Standardized Residual) into the Y box and *ZPRED (Regression Standardized Predicted Value) into the X box.
As long as the scatter of the points shows no clear pattern, we can conclude that the variance is constant.
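The outlier check above can also be sketched outside SPSS. This minimal plain-Python version computes standardized residuals and flags any with |z| > 3 (the function name and toy data are illustrative; the normality and variance checks would normally use plots, as in SPSS):

```python
# Residual checks mirroring the SPSS steps above:
# standardized residuals, outlier flags (|z| > 3), and the mean residual
# (which should be ~0 by construction for least squares).

from math import sqrt

def residual_checks(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s = sqrt(sum(e * e for e in resid) / (n - 2))     # residual std. deviation
    z = [e / s for e in resid]                        # standardized residuals
    outliers = [i for i, zi in enumerate(z) if abs(zi) > 3]
    mean_resid = sum(resid) / n
    return z, outliers, mean_resid

z, outliers, mean_resid = residual_checks([1, 2, 3, 4, 5, 6],
                                          [2.0, 4.1, 5.9, 8.2, 9.8, 12.1])
print(outliers, mean_resid)  # no flagged outliers; mean residual near zero
```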
Q. Hypothesis testing
We wish to test whether the sample value of 'r' is of sufficient magnitude that, in the population, SBP and Age are correlated.

We conduct the hypothesis test as follows:

1. Data:
2. Assumptions:
For each value of x there is a normally distributed subpopulation of y values;
for each value of y there is a normally distributed subpopulation of x values;
the joint distribution of x and y is a normal distribution called the bivariate normal distribution;
the subpopulations of y values all have the same variance;
the subpopulations of x values all have the same variance.
3. Hypotheses: H₀: ρ = 0 and Hₐ: ρ ≠ 0
4. Test statistic: when ρ = 0, it can be shown that the appropriate test statistic is t = r√(n − 2)/√(1 − r²)
5. Distribution of the test statistic: when H₀ is true and the assumptions are met, the test statistic follows a t distribution with n − 2 degrees of freedom.
6. Decision rule: let α = 0.05. If the computed t value is either greater than or equal to the tabulated t value (+...) with n − 2 degrees of freedom, or less than or equal to its negative (−...), we reject the null hypothesis.
7. Calculation of the t test statistic: our calculated t value is t =
8. P value: since our computed t value = '...', which is > or < '...' (the value from the table), we have for this test p < 0.05 (or p > 0.05, depending on the case).
9. Statistical decision: if the computed value of t exceeds the critical t value, we reject the null hypothesis. If the computed value of t does not exceed the critical t value, we do not reject the null hypothesis.
10. Conclusion: we conclude that, in the population, the two variables 'X' and 'Y' are linearly related.
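The test statistic from step 4 can be sketched directly; for simple regression it gives the same t value as the slope test (function name and toy data are illustrative):

```python
# Test statistic for H0: rho = 0:
# t = r * sqrt(n - 2) / sqrt(1 - r^2), compared with t at n - 2 df.

from math import sqrt

def corr_t(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)           # sample correlation coefficient
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    return r, t

r, t = corr_t([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(r, t)  # compare |t| with the critical t value at n - 2 df
```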
Using SPSS to draw a
scatterplot with a
regression line
• Graph  Scatter  Simple  x and y axis
variables  OK
• In the graph, double click then  Options
 reference line
Data of 30 students with their Ages and SBP for Regression Analysis

Age: 25, 30, 22, 45, 50, 35, 40, 28, 33, 38, 60, 55, 48, 42, 26, 31, 29, 37,
43, 34, 27, 36, 39, 32, 41, 46, 44, 52, 49, 24
SBP: 120, 122, 115, 135, 140, 128, 132, 118, 125, 130, 150, 145, 138,
134, 117, 123, 119, 129, 136, 127, 116, 126, 131, 124, 133, 137, 139,
142, 141, 114
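As a sketch of what the regression analysis of these data would produce, the listed Age and SBP values can be fitted directly (the coefficients below are computed from the data, not stated in the slides):

```python
# Regression of SBP on Age for the 30 students listed above.
age = [25, 30, 22, 45, 50, 35, 40, 28, 33, 38, 60, 55, 48, 42, 26,
       31, 29, 37, 43, 34, 27, 36, 39, 32, 41, 46, 44, 52, 49, 24]
sbp = [120, 122, 115, 135, 140, 128, 132, 118, 125, 130, 150, 145, 138,
       134, 117, 123, 119, 129, 136, 127, 116, 126, 131, 124, 133, 137,
       139, 142, 141, 114]

n = len(age)
mx, my = sum(age) / n, sum(sbp) / n
sxx = sum((a - mx) ** 2 for a in age)
sxy = sum((a - mx) * (b - my) for a, b in zip(age, sbp))
b1 = sxy / sxx               # slope: mmHg increase in SBP per year of age
b0 = my - b1 * mx            # intercept
sst = sum((b - my) ** 2 for b in sbp)
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(age, sbp))
r2 = 1 - sse / sst           # coefficient of determination
print(f"SBP = {b0:.2f} + {b1:.3f} * Age, r^2 = {r2:.3f}")
```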
