PEARSON PRODUCT
MOMENT CORRELATION
COEFFICIENT
Miguel Angelo Oboza Concio
History
KARL PEARSON
• (Born March 27, 1857, London,
England; died April 27, 1936,
Coldharbour, Surrey), British
statistician, leading founder of the
modern field of statistics,
prominent proponent of eugenics,
and influential interpreter of the
philosophy and social role of
science.
In statistics, the Pearson correlation coefficient (PCC), also
referred to as Pearson's r, the Pearson product-moment
correlation coefficient (PPMCC), or the bivariate
correlation, is a measure of the linear correlation between two
variables X and Y. According to the Cauchy–Schwarz inequality it
has a value between +1 and −1, where 1 is total positive linear
correlation, 0 is no linear correlation, and −1 is total negative
linear correlation. It is widely used in the sciences. It was
developed by Karl Pearson from a related idea introduced by
Francis Galton in the 1880s and for which the mathematical
formula was derived and published by Auguste Bravais in 1844.
The naming of the coefficient is thus an example of Stigler's Law.
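The bound mentioned above can be written out briefly. This is a sketch using the usual population notation (cov for covariance, σ for standard deviation), which is assumed here and not taken from the slides:

```latex
\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \, \sigma_Y},
\qquad
\lvert \operatorname{cov}(X,Y) \rvert \le \sigma_X \, \sigma_Y
\;\Longrightarrow\;
-1 \le \rho_{X,Y} \le 1
```

The middle inequality is the Cauchy–Schwarz inequality applied to the centered variables X − E[X] and Y − E[Y].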
When do we use it?
The Pearson product-moment correlation coefficient is a
measure of the strength of the linear relationship between two
variables. It is referred to as Pearson's correlation or simply as
the correlation coefficient. If the relationship between the
variables is not linear, then the correlation coefficient does not
adequately represent the strength of the relationship between
the variables.
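A small sketch of why a nonlinear relationship is not captured well: the data below are made up for illustration (Y is exactly X squared), and the helper name pearson_r is hypothetical. Even though Y is completely determined by X, the linear correlation comes out as zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation computed from deviation scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Made-up data: a perfect but curved (quadratic) relationship
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)  # 0.0
```

The positive and negative cross-products cancel exactly, so r is 0 despite the perfect dependence.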
The symbol for Pearson's correlation is "ρ" when it is measured
in the population and "r" when it is measured in a sample.
Because we will be dealing almost exclusively with samples, we
will use r to represent Pearson's correlation unless otherwise
noted.
Pearson's r can range from -1 to 1. An r of -1 indicates a perfect
negative linear relationship between variables, an r of 0
indicates no linear relationship between variables, and an r of 1
indicates a perfect positive linear relationship between
variables. Figure 1 shows a scatter plot for which r = 1.
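To make the two extremes concrete, here is a minimal sketch using the z-score form of r (the sum of the products of z-scores divided by n − 1). The data and the function name pearson_r_z are invented for illustration: one set of points lies exactly on a rising line, the other exactly on a falling line.

```python
from statistics import mean, stdev

def pearson_r_z(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    products = (((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)

xs = [1, 2, 3, 4, 5]
ys_up = [2 * x + 1 for x in xs]      # made-up points on a rising line
ys_down = [-3 * x + 10 for x in xs]  # made-up points on a falling line

r_up = pearson_r_z(xs, ys_up)
r_down = pearson_r_z(xs, ys_down)
print(round(r_up, 6), round(r_down, 6))
```

Up to floating-point rounding, the first value is 1 and the second is −1, whatever the slopes of the lines are.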
Why do you use Pearson correlation?
A Pearson's correlation is used when you want to find a linear
relationship between two variables. It can be used with a causal as
well as an associative research hypothesis, but it cannot be used with
an attributive research hypothesis, because such a hypothesis is univariate.
Example Problem
The following example includes the changes we will need to
make for hypothesis testing with the correlation coefficient, as
well as an example of how to do the computations.
Below are the data for six participants giving their number
of years in college (X) and their subsequent yearly income (Y).
Income here is in thousands of dollars, but this fact does not
require any changes in our computations. Test whether there is a
relationship, using Alpha = .05.
Notice that the computations for obtaining the summary values
are included for completeness. Be sure you know
how to obtain all the summed values, as they will not
always be given on the exam.
Step 1: State the Hypotheses in Words and
Symbols
Ha: There is a significant relationship between the number of years of
college and the yearly income of the six participants.
H0: There is no significant relationship between the number of years of
college and the yearly income of the six participants.
Ha: ρ ≠ 0
H0: ρ = 0
Step 2: Identify the level of Significance
Alpha = .05
Step 3: Identify the Test Statistic to be used
PEARSON PRODUCT MOMENT CORRELATION
COEFFICIENT
Step 4: Compute the value
of the statistic to be used
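$$r = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
r = .95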
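The computation from summed values can be sketched in code. The six (X, Y) pairs below are hypothetical and are NOT the data from this example, so the result will not equal .95. The sketch assumes the common computational formula r = [nΣXY − (ΣX)(ΣY)] / √([nΣX² − (ΣX)²][nΣY² − (ΣY)²]).

```python
from math import sqrt

# Hypothetical data for six participants (NOT the data from the slides):
# X = years in college, Y = yearly income in thousands of dollars
X = [1, 2, 3, 4, 5, 6]
Y = [20, 25, 28, 35, 38, 45]

n = len(X)
sum_x = sum(X)                             # ΣX
sum_y = sum(Y)                             # ΣY
sum_xy = sum(x * y for x, y in zip(X, Y))  # ΣXY
sum_x2 = sum(x * x for x in X)             # ΣX²
sum_y2 = sum(y * y for y in Y)             # ΣY²

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))
```

Printing the summed values first mirrors the summary table one would build by hand before substituting into the formula.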
Step 5: Compute the degrees of
freedom and get the critical value
df = n − 2
df = 6 − 2
df = 4
r tabular = ±0.811 (two-tailed, Alpha = .05, df = 4)
(Table: THE CRITICAL VALUE FOR PEARSON r)
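The tabled critical value can be cross-checked from the t distribution, since t = r√df / √(1 − r²) rearranges to r = t / √(t² + df). The sketch below assumes the two-tailed critical t for Alpha = .05 and df = 4 is 2.776, a value taken from a standard t table and not from the slides.

```python
from math import sqrt

# Assumption: two-tailed critical t for Alpha = .05, df = 4 (standard t table)
T_CRIT = 2.776

n = 6
df = n - 2

# Convert the critical t into the equivalent critical r
r_crit = T_CRIT / sqrt(T_CRIT ** 2 + df)
print(round(r_crit, 3))  # 0.811
```

This agrees with the 0.811 read from the table of critical values for Pearson r.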
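Step 6: Decision Rule
Reject the null hypothesis and accept the alternative
hypothesis if the absolute value of the computed r is greater
than or equal to the tabular r value; otherwise, do not reject
the null hypothesis. Since the computed r is greater than
the tabular r value, reject the null hypothesis and accept
the alternative hypothesis.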
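The decision rule can be expressed as a short comparison. This is a minimal sketch; the function name decide is hypothetical, and the values 0.95 and 0.811 are the computed and tabular r values from this example.

```python
def decide(r_computed, r_tabular):
    """Two-tailed decision: compare |r| with the critical (tabular) value."""
    if abs(r_computed) >= r_tabular:
        return "reject H0"
    return "fail to reject H0"

decision = decide(0.95, 0.811)
print(decision)  # reject H0
```

Taking the absolute value is what "regardless of sign" means in a two-tailed test: an r of −0.95 would lead to the same decision.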
Step 7: Conclusion
Since the computed r value of 0.95 is greater than the
tabular r value of 0.811, regardless of sign, reject the null
hypothesis and accept the alternative hypothesis at the 0.05
level of significance. There is a significant relationship
between the number of years of college and the yearly
income of the six participants.