9 Regression and Correlation Methods 5 2023

The document discusses multiple regression analysis, detailing the relationship between a dependent variable and multiple independent variables, including the formulation of the regression model and interpretation of coefficients. It provides examples of applying multiple regression using Statistica, including parameter estimation, prediction, and significance testing of the model and its parameters. Additionally, it includes a residual analysis indicating no significant departure from normality.

Uploaded by

asemahlelolly19
Copyright © All Rights Reserved

Multiple Regression

- Multiple regression analysis studies the relationship between a dependent variable $y$ and $k$ independent variables, denoted $x_1, x_2, \ldots, x_k$.
- The true model is written as
  $$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + e$$
  or, equivalently,
  $$y = \alpha + \sum_{i=1}^{k} \beta_i x_i + e,$$
  where $e$ is an error term assumed to be normally distributed with mean 0 and variance $\sigma^2$.
- The fitted model is given by $\hat{y}_i = a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$.
- The coefficients $b_1, b_2, \ldots, b_k$ are called partial-regression coefficients.
- For example, if there are two independent variables $x_1$ and $x_2$, then the fitted model is $\hat{y} = a + b_1 x_1 + b_2 x_2$.
- Interpretation of the coefficients:
  - $\beta_i$ represents the average change in $y$ per unit change in $x_i$ with all other variables held constant (or after adjusting for all the other variables in the model).
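
As a rough illustration of how the estimates $a, b_1, \ldots, b_k$ can be obtained, the sketch below solves the least-squares normal equations in plain Python. It is not Statistica's procedure; the function name `fit_ols` and the small noise-free data set are hypothetical and chosen only so that the expected coefficients are known in advance.

```python
def fit_ols(X, y):
    """Return least-squares estimates [a, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented system (X'X | X'y)
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda idx: abs(A[idx][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical, noise-free data generated from y = 1 + 2*x1 + 3*x2
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]

a, b1, b2 = fit_ols(X, y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Because the illustrative data contain no error term, the recovered coefficients match the generating values; with real data the estimates would differ from the true $\alpha$ and $\beta_i$.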
Example 11.39, pg 469
Suppose SBP, birthweight (oz), and age are measured for 16 infants and the data are
shown below.

1. Estimate the parameters of the multiple-regression model using Statistica.


2. Predict the average SBP of a baby with a birthweight of 128 oz, measured at 3 days of life.
3. Interpret the estimates of the birthweight and age parameters.
4. Comment on the significance of the overall regression.
5. Comment on the significance of the estimates of each of the parameters.
Example 11.39, pg 469 (solution)
STATISTICA output (parameter estimates)

Regression Summary of Dependent Variable

R = 0.93856813, R^2 = 0.88091013, Adjusted R^2 = 0.86258861
F(2, 13) = 48.081, p < 0.0000, Std.Error of estimate: 2.4792

N = 16        b*         Std.Err of b*   b          Std.Err of b   t(13)      p-value
Intercept                                53.45019   4.531889       11.79424   0.00000
Birthweight   0.352077   0.096263        0.12558    0.034336       3.65746    0.002896
Age           0.833231   0.096263        5.88772    0.680205       8.65580    0.000001

1. The fitted model is $\hat{y}_i = 53.45019 + 0.12558\,\text{Birthweight} + 5.88772\,\text{Age}$
   - $b_1 = 0.12558$ represents the average increase in SBP per unit increase in birthweight for infants of the same age (or holding age constant).
   - $b_2 = 5.88772$ represents the average increase in SBP per unit increase in age for infants of the same birthweight.
2. $\hat{y} = 53.45019 + 0.12558\,(128) + 5.88772\,(3) = 87.1876$ mmHg
3. As in (1) above.
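5. The $t(13)$ values are for testing $H_0: \beta_i = 0$ vs $H_1: \beta_i \neq 0$ with $T = \dfrac{b_i}{se(b_i)} \sim t_{n-k-1}$.
   For Birthweight, $T_{obs} = 3.65746 > t_{13,0.025} = 2.16$ and p-value $= 0.002896 < \alpha = 0.05$.
   For Age, $T_{obs} = 8.6558 > t_{13,0.025} = 2.16$ and p-value $= 0.000001 < \alpha = 0.05$.
   Both estimates are therefore significant at the 5% level.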
- Note: $R = 0.9386$ is the multiple correlation between SBP and (combined) age and birthweight, and $R^2 = 0.8809$ gives the proportion of variability in SBP that is explained by the combined variation in age and birthweight.
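
The prediction and the "holding the other variable constant" interpretation can be checked with a few lines of Python. This is only a sketch: the function name `predict_sbp` is hypothetical, and the coefficients are copied from the Statistica output above.

```python
# Coefficients from the fitted model (Statistica output)
A = 53.45019       # intercept
B_BW = 0.12558     # birthweight coefficient (mmHg per oz)
B_AGE = 5.88772    # age coefficient (mmHg per day)

def predict_sbp(birthweight_oz, age_days):
    """Predicted average SBP (mmHg) from the fitted model."""
    return A + B_BW * birthweight_oz + B_AGE * age_days

pred = predict_sbp(128, 3)
print(f"Predicted SBP: {pred:.4f} mmHg")

# Partial-regression interpretation: change birthweight by 1 oz, keep age fixed
delta_bw = predict_sbp(129, 3) - predict_sbp(128, 3)
# Change age by 1 day, keep birthweight fixed
delta_age = predict_sbp(128, 4) - predict_sbp(128, 3)
print(f"Effect of +1 oz at fixed age: {delta_bw:.5f}")
print(f"Effect of +1 day at fixed birthweight: {delta_age:.5f}")
```

The two differences reproduce $b_1$ and $b_2$, which is exactly what the partial-regression interpretation states.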
Example 11.39, pg 469 (solution)
STATISTICA output (ANOVA)

Analysis of Variance; Dependent Variable: SBP

Effect       df   SS         MS         F          p-value
Regression    2   591.0356   295.5178   48.08063   0.000001
Residual     13    79.9019     6.1463
Total        15   670.9375

For testing
$H_0$: the model is not significant, i.e. $\beta_1 = \beta_2 = \cdots = \beta_k = 0$, vs $H_1$: the model is significant, i.e. at least one of the $\beta_i \neq 0$, the test statistic is
$$F = \frac{\text{Reg MS}}{\text{Res MS}} \sim F_{k,\,n-k-1}.$$
Reject $H_0$ if $F_{obs} > F_{k,\,n-k-1,\,\alpha}$.
$F_{obs} = 48.08063$, to be compared with $F_{2,13,0.05} = 3.81$.
p-value $= P(F_{k,\,n-k-1} > F_{obs}) = P(F_{2,13} > 48.08063) = 0.000001 \ll P(F_{2,13} > 6.7) = 0.01$.
Since $F_{obs} = 48.08063 > F_{2,13,0.05} = 3.81$ and p-value $< \alpha = 0.05$, we reject $H_0$ and conclude that, at the 5% level of significance, these data provide sufficient evidence that the model is significant.
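
The F statistic, $R^2$ and adjusted $R^2$ can all be recomputed from the regression and residual sums of squares. The sketch below does this in plain Python; the critical value 3.81 is taken from the text rather than computed, and the adjusted $R^2$ formula $1 - (1 - R^2)\frac{n-1}{n-k-1}$ is the standard one, stated here as an assumption about what Statistica reports.

```python
n, k = 16, 2                 # sample size and number of predictors
ss_reg = 591.0356            # regression sum of squares
ss_res = 79.9019             # residual sum of squares
ss_total = ss_reg + ss_res   # total SS is the sum of the two components

ms_reg = ss_reg / k              # regression mean square
ms_res = ss_res / (n - k - 1)    # residual mean square
f_obs = ms_reg / ms_res          # F statistic on (k, n-k-1) df

r2 = ss_reg / ss_total
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

F_CRIT = 3.81                # F(2, 13) critical value at alpha = 0.05, from the text
reject_h0 = f_obs > F_CRIT

print(f"F = {f_obs:.4f}, R^2 = {r2:.5f}, adjusted R^2 = {adj_r2:.5f}")
print("Reject H0:", reject_h0)
```

The recomputed values agree with the regression summary, which is a quick consistency check on the ANOVA table.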
- The $t(13)$'s are for testing $H_0: \beta_i = 0$ vs $H_1: \beta_i \neq 0$ with $T = \dfrac{b_i}{se(b_i)} \sim t_{n-k-1}$.
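
Each t statistic is simply the estimate divided by its standard error. The following sketch recomputes them from the parameter-estimate table for the SBP example; the critical value $t_{13,0.025} = 2.16$ is taken from the text, and the dictionary layout is just an illustrative choice.

```python
# (estimate b, standard error of b) from the Statistica parameter table
estimates = {
    "Intercept":   (53.45019, 4.531889),
    "Birthweight": (0.12558, 0.034336),
    "Age":         (5.88772, 0.680205),
}
T_CRIT = 2.16  # t(13) two-sided critical value at alpha = 0.05, from the text

# T = b / se(b), compared with the critical value in absolute terms
t_stats = {name: b / se for name, (b, se) in estimates.items()}
significant = {name: abs(t) > T_CRIT for name, t in t_stats.items()}

for name, t in t_stats.items():
    print(f"{name:12s} T = {t:8.4f}  significant: {significant[name]}")
```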
Example
For a random sample of n = 20 individuals, we have measurements
of y = body fat, x1 = triceps skinfold thickness,
x2 = thigh circumference, and x3 = midarm circumference to study
the relationship between one's triceps skinfold thickness, thigh
circumference and midarm circumference and one's body fat (data will be provided).
1. Use Statistica to obtain a multiple linear regression model for
these data and interpret the estimates of the parameters.
2. Comment on the overall significance of the regression model.
3. Comment on the significance of the regression parameters.
Example
Solution

Regression Summary for Dependent Variable: Bodyfat

R = 0.90224579, R^2 = 0.81404747
F(3, 16) = 23.348, p < 0.00001, Std.Error of estimate: 1.9191

N = 20      b*          Std.Err of b*   b          Std.Err of b   t(16)      p-value
Intercept                               -4.42052   3.553112       -1.24413   0.231372
Triceps     -0.004885   0.199345        0.00397    0.162068       -0.02451   0.980752
Thigh       0.746619    0.182422        0.33392    0.081586       4.09282    0.000849
Medarm      0.356674    0.124799        0.37082    0.129749       2.85797    0.011391

1. The fitted model is
   $\hat{y}_i = -4.4205 + 0.00397\,x_1 + 0.33392\,x_2 + 0.37082\,x_3$
2. $H_0$: the model is not significant, i.e. $\beta_1 = \beta_2 = \beta_3 = 0$, vs $H_1$: the model is significant, i.e. at least one of the $\beta_i \neq 0$. Reject $H_0$ if $F_{obs} > F_{k,\,n-k-1,\,\alpha}$. Here $F_{obs} = 23.348 > F_{3,16,0.05} = 3.24$ and p-value $< 0.00001 < \alpha = 0.05$, thus the model is significant.
3. Looking at the p-values, the intercept and Triceps are not significant, as their p-values are greater than $\alpha = 0.05$. Thigh and Midarm circumference are significant, as their p-values are less than $\alpha$.
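
The overall F statistic can also be recovered from $R^2$ alone, using the identity $F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)}$ (a standard result, not stated in the slides). The sketch below applies it to the body fat output and then flags each parameter by comparing its reported p-value with $\alpha = 0.05$; variable names are illustrative.

```python
n, k = 20, 3
r2 = 0.81404747       # R^2 from the regression summary
ALPHA = 0.05

# Overall F statistic on (k, n-k-1) degrees of freedom, from R^2
f_obs = (r2 / k) / ((1 - r2) / (n - k - 1))

F_CRIT = 3.24         # F(3, 16) critical value at alpha = 0.05, from the text
model_significant = f_obs > F_CRIT

# p-values as reported in the Statistica parameter table
p_values = {
    "Intercept": 0.231372,
    "Triceps":   0.980752,
    "Thigh":     0.000849,
    "Medarm":    0.011391,
}
significant = {name: p < ALPHA for name, p in p_values.items()}

print(f"F = {f_obs:.3f}, model significant: {model_significant}")
for name, flag in significant.items():
    print(f"{name:10s} significant: {flag}")
```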
Residual Analysis

There is no evidence of significant departure from normality, as the
points on the normal probability plot are close to the hypothetical
normal line (straight line). In addition, the histogram shows the
shape of a distribution which is close to normal.
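
For readers who want to see what a normal probability plot compares, the sketch below pairs sorted residuals with theoretical standard-normal quantiles and measures how close the points are to a straight line. The residuals are hypothetical (the slides do not list them), the plotting positions $(i - 0.5)/n$ are one common convention among several, and this is not necessarily how Statistica draws its plot.

```python
from statistics import NormalDist

def normal_plot_points(residuals):
    """Pair each theoretical normal quantile with the matching sorted residual."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting positions (i - 0.5) / n: an assumed, commonly used convention
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def correlation(pairs):
    """Pearson correlation of the (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical residuals, for illustration only
residuals = [-3.1, -1.8, -1.2, -0.6, -0.2, 0.1, 0.5, 0.9, 1.6, 2.7]
points = normal_plot_points(residuals)
r = correlation(points)
print(f"Correlation between residuals and normal quantiles: {r:.3f}")
```

A correlation close to 1 corresponds to points lying near the straight reference line, which is the visual pattern described above.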
