Correlation
• Sir Francis Galton (half-cousin of Charles Darwin)
– Development of behavioral statistics
– Father of Eugenics
– Science of fingerprints as unique
– Retrospective IQ of 200
– Drove himself mad just to prove
you could do it
– Invented the pocket
Defining Correlation
• Co-variation or co-relation between two
variables
• These variables change together
• Data: scale (interval or ratio) variables
• http://www.youtube.com/watch?v=ahp7QhbB8G4
Correlation Coefficient
• A statistic that quantifies a relation between
two variables
• Can be either positive or negative
• Falls between -1.00 and 1.00
• The absolute value of the number (not the sign)
indicates the strength of the relation
Linear Correlation
[Figure: scatterplots of Y against X contrasting linear relationships with curvilinear relationships]
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall
Linear Correlation
[Figure: scatterplots of Y against X contrasting strong relationships with weak relationships]
Linear Correlation
[Figure: scatterplot showing no relationship between X and Y]
Positive Correlation
An association between two variables such that participants
with high scores on one variable tend to have high
scores on the other variable
A direct relation between the variables
Negative Correlation
An association between two variables such that participants
with high scores on one variable tend to have low
scores on the other variable
An inverse relation between the variables
A Perfect Positive Correlation
A Perfect Negative Correlation
What is “Linear”?
Remember this:
Y = mX + B?
(m is the slope; B is the Y-intercept)
What’s Slope?
A slope of 2 means that every 1-unit change in
X yields a 2-unit change in Y.
Simple linear regression
The linear regression model:
Love of Math = 5 + .01*math SAT score
(intercept = 5; slope = .01)
p = .22; not significant
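As a minimal sketch, the model Love of Math = 5 + .01*math SAT score can be evaluated directly. The function name and the two SAT scores below are hypothetical and chosen only for illustration; the intercept and slope come from the slide.

```python
def predict_love_of_math(sat_score, intercept=5.0, slope=0.01):
    """Predicted Love of Math = intercept + slope * math SAT score."""
    return intercept + slope * sat_score


# Two illustrative (hypothetical) SAT scores, 100 points apart.
low = predict_love_of_math(500)
high = predict_love_of_math(600)

# The slope is the change in the prediction per 1-point change in SAT score,
# so a 100-point difference changes the prediction by 100 * slope.
change_per_100_points = high - low

print(low, high, change_per_100_points)
```

The intercept is the predicted value when the SAT score is 0, and the slope controls how quickly the prediction rises as the score increases.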
Check Your Learning
• Which is stronger?
– A correlation of 0.25 or -0.74?
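0.25 is a weak positive correlation: as X increases, Y tends to increase slightly
-0.74 is a strong negative correlation: as X increases, Y tends to decrease
-0.74 is the stronger of the two, because strength depends on the size of the number, not its sign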
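The comparison in this check can be sketched in a few lines: strength is judged by the absolute value of the coefficient, and direction by its sign. The helper names below are hypothetical.

```python
def stronger(r1, r2):
    """Return whichever coefficient has the larger absolute value."""
    return r1 if abs(r1) >= abs(r2) else r2


def direction(r):
    """Describe the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


winner = stronger(0.25, -0.74)
print(winner, direction(winner))
```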
Misleading Correlations
• Something to think about
– There is a 0.91 correlation between ice cream
consumption and drowning deaths.
• Does eating ice cream cause drowning?
• Does grief cause us to eat more ice cream?
Correlation
Correlation is NOT causation
– e.g., arm span and height
The Limitations of Correlation
• Correlation is not causation.
– Invisible third variables
[Figure: Three Possible Causal Explanations for a Correlation]
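A small simulation can illustrate how an invisible third variable produces a correlation. This is only a sketch with synthetic data: the variable names, the temperature range, and the noise levels are all assumptions, and the result is not meant to reproduce the 0.91 figure quoted earlier. Neither simulated variable appears in the other's formula; both depend only on temperature.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation from deviations around the two means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    numerator = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return numerator / sqrt(ss_x * ss_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200

# Hypothetical third variable: daily temperature.
temperature = [rng.uniform(10, 35) for _ in range(n)]

# Each outcome is temperature plus its own independent noise.
ice_noise = [rng.gauss(0, 2) for _ in range(n)]
drown_noise = [rng.gauss(0, 2) for _ in range(n)]
ice_cream = [t + e for t, e in zip(temperature, ice_noise)]
drownings = [t + e for t, e in zip(temperature, drown_noise)]

# Strong correlation, even though neither variable causes the other.
r_observed = pearson_r(ice_cream, drownings)

# With temperature removed, only the independent noise is left.
r_without_temperature = pearson_r(ice_noise, drown_noise)

print(round(r_observed, 2), round(r_without_temperature, 2))
```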
The Limitations of Correlation, cont.
• The effect of an outlier.
One individual who both studies and uses her cell
phone more than any other individual in the
sample changed the correlation from -0.14, a
negative correlation, to 0.39, a much stronger and
positive correlation!
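The sketch below shows how one extreme point can flip the sign of a correlation. The numbers are invented for illustration and are not the data behind the figures quoted above; the variable names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation from deviations around the two means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    numerator = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return numerator / sqrt(ss_x * ss_y)


# Made-up scores for six people: hours studied and hours of phone use.
study_hours = [1, 2, 3, 4, 5, 6]
phone_hours = [4, 3, 5, 2, 4, 3]
r_without_outlier = pearson_r(study_hours, phone_hours)

# Add one person who is far above everyone else on both variables.
r_with_outlier = pearson_r(study_hours + [15], phone_hours + [12])

print(round(r_without_outlier, 2), round(r_with_outlier, 2))
```

Here the six original points give a weak negative correlation, and the single added point makes it positive, which is why a scatterplot is worth checking before interpreting r.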
The Pearson Correlation Coefficient
• A statistic that quantifies a linear relation
between two scale variables.
• Symbolized by the italic letter r when it is a
statistic based on sample data.
• Symbolized by the Greek letter ρ (“rho”) when it
is a population parameter.
• Pearson correlation coefficient
– r
– Linear relationship
$$r = \frac{\sum[(X - M_X)(Y - M_Y)]}{\sqrt{(SS_X)(SS_Y)}}$$
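The formula can be sketched in plain Python: the numerator sums the products of each pair's deviations from the means M_X and M_Y, and the denominator is the square root of the product of the two sums of squares, SS_X and SS_Y. The function name and the sample scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r: sum of deviation products over sqrt(SS_X * SS_Y)."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of scores")
    m_x = sum(xs) / len(xs)
    m_y = sum(ys) / len(ys)
    numerator = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    ss_x = sum((x - m_x) ** 2 for x in xs)  # sum of squares for X
    ss_y = sum((y - m_y) ** 2 for y in ys)  # sum of squares for Y
    return numerator / sqrt(ss_x * ss_y)


# Illustrative scores only: Y doubles X, then Y falls as X rises.
perfect_positive = pearson_r([1, 2, 3], [2, 4, 6])
perfect_negative = pearson_r([1, 2, 3], [6, 4, 2])

print(perfect_positive, perfect_negative)
```

If either variable has no variability, its sum of squares is 0 and r is undefined; this sketch would raise a ZeroDivisionError in that case.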
Correlation Hypothesis Testing
• Step 1. Identify the population, distribution, and assumptions.
• Step 2. State the null and research hypotheses.
• Step 3. Determine the characteristics of the comparison distribution.
• Step 4. Determine the critical values.
• Step 5. Calculate the test statistic.
• Step 6. Make a decision.
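Steps 3 through 6 can be sketched as follows. This assumes the common approach of converting r to a t statistic with N - 2 degrees of freedom; the course may instead compare r directly with a table of critical r values. The sample values (r = 0.5, N = 10) are hypothetical, and the critical value 2.306 is the two-tailed t cutoff for 8 degrees of freedom at a .05 significance level.

```python
from math import sqrt


def t_from_r(r, n):
    """Convert a sample correlation to a t statistic with n - 2 df.

    Undefined when r is exactly 1 or -1 (division by zero).
    """
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df


def decide(t, critical_t):
    """Two-tailed decision: reject when |t| exceeds the critical value."""
    if abs(t) > critical_t:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical sample: r = 0.5 from N = 10 participants.
t, df = t_from_r(0.5, 10)

# Critical value looked up in a t table (two-tailed, .05 level, df = 8).
decision = decide(t, critical_t=2.306)

print(round(t, 3), df, decision)
```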
ACTIVITY 7