Simple Linear Regression
Example
The Model
• The first order linear model
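$y = \beta_0 + \beta_1 x + \varepsilon$
y = dependent variable
x = independent variable
$\beta_0$ = y-intercept
$\beta_1$ = slope of the line (Rise/Run)
$\varepsilon$ = error variable
$\beta_0$ and $\beta_1$ are unknown population parameters, and are therefore estimated from the data.
[Figure: the line plotted with x on the horizontal axis and y on the vertical axis; $\beta_0$ is the y-intercept and $\beta_1$ = Rise/Run.]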
The Estimated Coefficients
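To calculate the estimates of the slope and intercept of the least squares line, use the formulas:
$b_1 = \dfrac{SS_{xy}}{SS_{xx}}$
$b_0 = \bar{y} - b_1 \bar{x}$
$SS_{xy} = \sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}$
$SS_{xx} = \sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n} = (n-1)\, s_x^2$
Alternate formula for the slope $b_1$:
$b_1 = r\, \dfrac{s_y}{s_x}$
The regression equation that estimates the equation of the first order linear model is:
$\hat{y} = b_0 + b_1 x$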
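The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names `least_squares_line` and `slope_from_correlation` are assumptions made for the example, and the data are only the six rows visible in the car example's table, not the full sample of 100, so the coefficients differ from those of the worked solution.

```python
import math

def least_squares_line(x, y):
    """Return (b0, b1) using b1 = SS_xy / SS_xx and b0 = y-bar - b1 * x-bar."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

def slope_from_correlation(x, y):
    """Alternate slope formula b1 = r * s_y / s_x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    r = cov / (s_x * s_y)  # sample correlation coefficient
    return r * s_y / s_x

# Only the six rows shown in the table (hypothetical stand-in for the full sample).
odometer = [37388, 44758, 45833, 30862, 31705, 34010]
price = [14636, 14122, 14016, 15590, 15568, 14718]

b0, b1 = least_squares_line(odometer, price)
print(f"y-hat = {b0:.1f} + ({b1:.5f}) x")
```

Both slope formulas give the same value up to rounding, which is a quick way to check a hand calculation.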
The Simple Linear Regression Line
• Example:
– A car dealer wants to find the relationship between the odometer reading and the selling price of used cars.
– A random sample of 100 cars is selected, and the data recorded.
– Find the regression line.

Car   Odometer (independent variable x)   Price (dependent variable y)
1     37388                               14636
2     44758                               14122
3     45833                               14016
4     30862                               15590
5     31705                               15568
6     34010                               14718
.     .                                   .
.     .                                   .
.     .                                   .
The Simple Linear Regression Line
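• Solution
– Solving by hand: Calculate a number of statistics
$\bar{x} = 36{,}009.45; \quad s_x^2 = \dfrac{SS_{xx}}{n-1} = \dfrac{1}{n-1}\left[\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}\right] = 43{,}528{,}690$
$\bar{y} = 14{,}822.823; \quad s_{xy} = \dfrac{SS_{xy}}{n-1} = \dfrac{1}{n-1}\left[\sum x_i y_i - \dfrac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}\right] = -2{,}712{,}511$
where n = 100. Here $s_x^2$ is the sample variance of x and $s_{xy}$ is the sample covariance; the factor n − 1 cancels in the ratio.
$b_1 = \dfrac{SS_{xy}}{SS_{xx}} = \dfrac{s_{xy}}{s_x^2} = \dfrac{-2{,}712{,}511}{43{,}528{,}690} = -.06232$
$b_0 = \bar{y} - b_1 \bar{x} = 14{,}822.82 - (-.06232)(36{,}009.45) = 17{,}067$
$\hat{y} = b_0 + b_1 x = 17{,}067 - .0623x$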
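The arithmetic of the solution can be checked with a short Python sketch. The variable names are assumptions chosen for illustration; the numbers are the summary statistics quoted above. The slope is the same whether the two large figures are read as sums of squares or as sample variance and covariance, since the factor n − 1 cancels in the ratio.

```python
# Summary statistics quoted in the worked solution (n = 100 cars).
mean_x = 36_009.45     # mean odometer reading
mean_y = 14_822.823    # mean selling price
num_xy = -2_712_511    # numerator of the slope
den_xx = 43_528_690    # denominator of the slope

b1 = num_xy / den_xx           # slope estimate
b0 = mean_y - b1 * mean_x      # intercept estimate

print(f"b1 = {b1:.5f}")
print(f"b0 = {b0:.0f}")
```

The negative slope says that each additional unit on the odometer lowers the estimated price by about 6.2 cents' worth of the price unit, assuming the odometer is in miles and the price in dollars.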
Assessing the Model
• The least squares method produces a regression line whether or not there is a linear relationship between x and y.
• Consequently, it is important to assess how well
the linear model fits the data.
• Several methods are used to assess the model.
All are based on the sum of squares for errors,
SSE.
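A small sketch makes the first point concrete. The data below are hypothetical toy values, not from the car example: y is an exact quadratic function of x, so the relationship is not linear, yet the least squares formulas still return a line. The helper name `least_squares_line` is an assumption made for illustration.

```python
def least_squares_line(x, y):
    """Return (b0, b1) using b1 = SS_xy / SS_xx and b0 = y-bar - b1 * x-bar."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

# Hypothetical data: y = x^2, a perfect but non-linear relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

b0, b1 = least_squares_line(x, y)   # the fitted line is flat: b1 = 0, b0 = 4

# Sum of squares for errors of the fitted line.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
print(b0, b1, sse)
```

The method reports a horizontal line with a large SSE, even though y is completely determined by x. This is why the fit has to be assessed rather than assumed.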
Sum of Squares for Errors
– This is the sum of squared differences between the points and the regression line.
– It can serve as a measure of how well the line fits the data. SSE is defined by
$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$
– A shortcut formula
$SSE = \sum y_i^2 - b_0 \sum y_i - b_1 \sum x_i y_i$
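The definition and the shortcut formula can be compared numerically. The sketch below is an illustration under stated assumptions: it uses only the six rows visible in the car example's table (not the full sample of 100), and the variable names are chosen for the example. The shortcut holds only when $b_0$ and $b_1$ are the least squares estimates.

```python
import math

# Only the six rows shown in the table (hypothetical stand-in for the full sample).
odometer = [37388, 44758, 45833, 30862, 31705, 34010]
price = [14636, 14122, 14016, 15590, 15568, 14718]

n = len(odometer)
sum_x, sum_y = sum(odometer), sum(price)
sum_xy = sum(x * y for x, y in zip(odometer, price))
sum_xx = sum(x * x for x in odometer)
sum_yy = sum(y * y for y in price)

# Least squares estimates.
b1 = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

# SSE by definition: sum of squared residuals.
sse_definition = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(odometer, price))

# SSE by the shortcut formula.
sse_shortcut = sum_yy - b0 * sum_y - b1 * sum_xy

print(f"definition: {sse_definition:.2f}")
print(f"shortcut:   {sse_shortcut:.2f}")
```

The shortcut subtracts very large numbers from each other, so in floating-point arithmetic the definition is usually the more reliable of the two.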
Standard Error of Estimate
– The mean error is equal to zero.
– If $\sigma_\varepsilon$ is small, the errors tend to be close to zero (close to the mean error). Then, the model fits the data well.
– Therefore, we can use $\sigma_\varepsilon$ as a measure of the suitability of using a linear model.
– An estimator of $\sigma_\varepsilon$ is given by $s_\varepsilon$, the standard error of estimate:
$s_\varepsilon = \sqrt{\dfrac{SSE}{n-2}}$
Standard Error of Estimate,
Example
• Example:
– Calculate the standard error of estimate for the previous
example and describe what it tells you about the model fit.
• Solution
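$SSE = 9{,}005{,}450$
$s_\varepsilon = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{9{,}005{,}450}{98}} = 303.13$
It is hard to assess the model based on $s_\varepsilon$, even when compared with the mean value of y:
$s_\varepsilon = 303.1, \quad \bar{y} = 14{,}823$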
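The calculation can be reproduced with a short Python sketch, using the SSE and sample size quoted above. The variable names are assumptions made for illustration, and the ratio to the mean of y is added only as one rough way to put the figure in context.

```python
import math

sse = 9_005_450     # sum of squares for errors, as quoted above
n = 100             # sample size
mean_y = 14_823     # mean selling price, as quoted above

# Standard error of estimate: square root of SSE / (n - 2).
s_e = math.sqrt(sse / (n - 2))

# Size of a typical error relative to the mean of y.
relative = s_e / mean_y

print(f"s_e = {s_e:.1f}")
print(f"s_e / y-bar = {relative:.1%}")
```

The standard error comes out at roughly 303, about 2% of the mean price. The number has no absolute scale for "good" or "bad", which is why it is hard to judge the fit from it alone.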
Testing the Slope
• We can draw inference about $\beta_1$ from $b_1$ by testing
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$ (or < 0, or > 0)
– The test statistic is
$t = \dfrac{b_1 - \beta_1}{s_{b_1}}$, where $s_{b_1} = \dfrac{s_\varepsilon}{\sqrt{SS_{xx}}}$ is the standard error of $b_1$.
– If the error variable is normally distributed, the statistic has a Student t distribution with d.f. = n − 2.
Testing the Slope,
Example
• Example
– Test to determine whether there is enough evidence to infer that there is a linear relationship between the car auction price and the odometer reading for all three-year-old Tauruses in the previous example. Use $\alpha$ = 5%.
Testing the Slope,
Example
• Solving by hand
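– To compute t we need the values of $b_1$ and $s_{b_1}$.
$b_1 = -.0623$
$s_{b_1} = \dfrac{s_\varepsilon}{\sqrt{(n-1)\, s_x^2}} = \dfrac{303.1}{\sqrt{(99)(43{,}528{,}690)}} = .00462$
$t = \dfrac{b_1 - \beta_1}{s_{b_1}} = \dfrac{-.0623 - 0}{.00462} = -13.49$
– The rejection region is $t > t_{.025}$ or $t < -t_{.025}$ with $\nu = n - 2 = 98$. Approximately, $t_{.025} = 1.984$.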
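The test can be traced step by step in Python. This is a sketch under stated assumptions: the inputs are the values quoted in this example, the critical value 1.984 is taken from the text rather than computed, and the variable names are chosen for illustration.

```python
import math

n = 100
b1 = -0.0623          # estimated slope
s_e = 303.1           # standard error of estimate
var_x = 43_528_690    # s_x^2, as used in the denominator (99)(43,528,690)
t_crit = 1.984        # approximate t_.025 with 98 d.f., quoted in the text

# Standard error of the slope estimate.
s_b1 = s_e / math.sqrt((n - 1) * var_x)

# Test statistic under H0: beta1 = 0.
t = (b1 - 0) / s_b1

# Two-tail rejection region: t > t_crit or t < -t_crit.
reject_h0 = abs(t) > t_crit

print(f"s_b1 = {s_b1:.5f}")
print(f"t = {t:.2f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

The statistic is about −13.49, far inside the lower rejection region, so at the 5% level there is enough evidence to infer a linear relationship between odometer reading and price.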