Correlation Notes - Module3

The document provides an overview of correlation analysis, explaining statistical data types such as univariate and bivariate data, and the concept of correlation which measures the relationship between two variables. It outlines different types of correlation: positive, negative, zero, and perfect correlation, along with methods for studying correlation, including scatter diagrams and Pearson's coefficient. Additionally, it introduces Spearman's rank correlation coefficient for analyzing ranked data, detailing formulas and properties associated with these correlation methods.

Uploaded by

rushnafathimav

CORRELATION ANALYSIS

• Statistical data: Statistical data refers to a set of numbers collected for a predetermined purpose.

• Univariate data: Statistical data providing information about a single characteristic or variable (x) is called univariate data.
Eg: The marks of students in a class.

Marks(x)
12
23
18
30
42
24

• Bivariate data: Statistical data providing information about two characteristics or variables (x, y) is called bivariate data.
Eg: The heights and weights of students in a class; the price and demand of different commodities.

Height(x) Weight(y)
160 62
158 45
171 68
165 70
157 56

CORRELATION

Correlation is a statistical measure of the degree or strength of the relationship between two variables.
A high correlation means there is a strong relationship between the variables, and a low correlation means the relationship between the variables is weak.

TYPES OF CORRELATION

• Positive correlation: If, in bivariate data, the values of the two variables move in the same direction, the correlation is said to be positive.
That is, when the value of x increases, the value of y also increases, and when x decreases, y also decreases.

• Negative correlation: If, in bivariate data, the values of the two variables move in opposite directions, the correlation is said to be negative.
That is, when the value of x increases, the value of y decreases, and when x decreases, y increases.

• Zero correlation or no correlation: When there is no association between the two variables x and y, there is said to be no correlation or zero correlation between x and y.

• Perfect correlation: If a change in one variable is always accompanied by a proportional change in the other variable, the correlation is said to be perfect.
If the two variables change in the same direction, the correlation is perfectly positive; if they change in opposite directions, the correlation is perfectly negative.

Methods of studying correlation


• Scatter diagram or scatter plot
• Karl Pearson’s correlation coefficient
• Spearman’s rank correlation coefficient

Scatter diagram
Scatter diagram is a graphical method of studying correlation. In this method the X
values are marked along the X axis and Y values are marked along the Y axis. The
points corresponding to the pair of values (X, Y) are plotted in the graph.
• If the points in the scatter diagram lie close together, forming a narrow band, the correlation is strong; otherwise the correlation is weak.

[Figure: strong correlation (left), weak correlation (right)]

• If the points of the scatter diagram move in an upward direction from left to right, the correlation is said to be positive.

[Figure: positive correlation]

• If the points of the scatter diagram move in a downward direction from left to right, the correlation is said to be negative.

[Figure: negative correlation]

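• If the points of the scatter diagram lie exactly on a straight line, the correlation is said to be perfect: perfectly positive if the line slopes upward, perfectly negative if it slopes downward.

[Figure: perfect positive correlation (left), perfect negative correlation (right)]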

• If the points in a scatter diagram are widely scattered with no visible pattern, we say that there is no correlation between X and Y, or that X and Y are uncorrelated.

[Figure: zero correlation]
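
The plotting steps described above can be sketched in Python as follows. This is only an illustrative sketch: the helper names scatter_points and plot_scatter are hypothetical, the data is the height and weight table given earlier in these notes, and the plotting helper assumes the third-party matplotlib library is installed.

```python
def scatter_points(xs, ys):
    """Pair each X value with its Y value, as plotted in a scatter diagram."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of observations")
    return list(zip(xs, ys))


def plot_scatter(xs, ys, x_label="X", y_label="Y"):
    """Draw the scatter diagram (assumes matplotlib is available)."""
    import matplotlib.pyplot as plt  # imported lazily; third-party dependency

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)        # one point per (X, Y) pair
    ax.set_xlabel(x_label)    # X values along the X axis
    ax.set_ylabel(y_label)    # Y values along the Y axis
    return fig


# Height and weight data from the bivariate example in these notes.
heights = [160, 158, 171, 165, 157]
weights = [62, 45, 68, 70, 56]
points = scatter_points(heights, weights)
```

Calling plot_scatter(heights, weights, "Height(x)", "Weight(y)") would then draw the diagram, whose shape can be read using the rules listed above.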

KARL PEARSON’S COEFFICIENT OF CORRELATION


Karl Pearson’s coefficient of correlation is a mathematical method for studying correlation.
It is a real number lying between -1 and +1 which tells us the degree or strength of the linear relationship between two variables. It is denoted by the letter ‘r’ or ‘rxy’.
The formula for Karl Pearson’s coefficient of correlation is given by

$$r = \frac{\mathrm{Cov}(x,y)}{\sqrt{V(x)}\,\sqrt{V(y)}} = \frac{\mathrm{Cov}(x,y)}{\sigma_x \sigma_y}$$

where Cov(x,y) is the covariance between x and y,

$$\mathrm{Cov}(x,y) = \frac{1}{n}\sum (x-\bar{x})(y-\bar{y})$$

V(x) is the variance of x,

$$V(x) = \frac{1}{n}\sum (x-\bar{x})^2$$

and V(y) is the variance of y,

$$V(y) = \frac{1}{n}\sum (y-\bar{y})^2$$
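
As a rough illustration of the definition above, the following Python sketch computes r directly from the covariance and the two variances. The function name pearson_r is an assumption made for this example, and the data is the height and weight table from the bivariate example in these notes.

```python
from math import sqrt


def pearson_r(x, y):
    """Karl Pearson's r computed as Cov(x, y) / (sqrt(V(x)) * sqrt(V(y)))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, each with divisor n as in the notes.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov_xy / (sqrt(var_x) * sqrt(var_y))


heights = [160, 158, 171, 165, 157]
weights = [62, 45, 68, 70, 56]
r = pearson_r(heights, weights)  # roughly 0.757, a fairly strong positive correlation
```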

On simplification, the formula for Karl Pearson’s correlation coefficient can be written as

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
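
The simplified form needs only the sums ∑x, ∑y, ∑xy, ∑x² and ∑y², which makes it convenient for hand calculation. A minimal Python sketch is given below; the function name pearson_r_from_sums is hypothetical, and the data is again the height and weight table from these notes.

```python
from math import sqrt


def pearson_r_from_sums(x, y):
    """Karl Pearson's r from the simplified (raw sums) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


heights = [160, 158, 171, 165, 157]
weights = [62, 45, 68, 70, 56]
# Here n = 5, sum x = 811, sum y = 301, sum xy = 49000,
# sum x^2 = 131679 and sum y^2 = 18529, so
# r = 889 / (sqrt(674) * sqrt(2044)), roughly 0.757.
r = pearson_r_from_sums(heights, weights)
```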

Note:
• r = +1 means there is perfect positive correlation between x and y
• r = -1 means there is perfect negative correlation between x and y
• r = 0 means there is no (linear) correlation between x and y
• r > 0 (r is positive) means there is positive correlation between x and y
• r < 0 (r is negative) means there is negative correlation between x and y
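
The interpretation rules in this note can be written as a small Python helper. This is only a sketch; the function name describe_correlation and the exact wording of the returned labels are assumptions made for illustration.

```python
def describe_correlation(r):
    """Return the type of correlation indicated by a value of r."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 1:
        return "perfect positive correlation"
    if r == -1:
        return "perfect negative correlation"
    if r == 0:
        return "no correlation"
    if r > 0:
        return "positive correlation"
    return "negative correlation"
```

For example, describe_correlation(0.757) gives "positive correlation" and describe_correlation(-1) gives "perfect negative correlation".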

Properties of correlation coefficient


• The value of the correlation coefficient lies between -1 and +1 (-1 ≤ rxy ≤ +1).
• rxy = ryx, that is, the correlation coefficient is symmetric.
• Let $u = \frac{x-a}{b}$ and $v = \frac{y-c}{d}$, where a, b, c, d are constants and b and d have the same sign. Then ruv = rxy, that is, the correlation coefficient is unaltered by change of origin and scale.
• The correlation coefficient between two independent variables is zero. However, a correlation coefficient of zero does not always mean that the variables are independent.
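
These properties can be checked numerically. The sketch below is illustrative only: the helper pearson_r, the constants a = 150, b = 5, c = 50, d = 2 and the small data set s, t are all assumptions chosen for the demonstration, while the height and weight values come from the table earlier in these notes.

```python
from math import sqrt


def pearson_r(x, y):
    """Karl Pearson's r from covariance and variances (divisor n)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov_xy / sqrt(var_x * var_y)


x = [160, 158, 171, 165, 157]
y = [62, 45, 68, 70, 56]

# Symmetry: r_xy and r_yx agree.
r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)

# Change of origin and scale with b, d > 0 leaves r unaltered.
u = [(xi - 150) / 5 for xi in x]
v = [(yi - 50) / 2 for yi in y]
r_uv = pearson_r(u, v)

# Zero correlation without independence: t is completely determined by s
# (t = s squared), yet the correlation coefficient is zero.
s = [-2, -1, 0, 1, 2]
t = [si ** 2 for si in s]
r_st = pearson_r(s, t)
```

The last case shows why r = 0 only rules out a linear relationship: s and t are clearly dependent, but the dependence is not linear.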

SPEARMAN’S RANK CORRELATION COEFFICIENT


Karl Pearson’s correlation coefficient, discussed earlier, measures the correlation between the values (magnitudes) of two variables.
If we are given the ranks of the observations instead of their values, we use Spearman’s rank correlation coefficient.
Spearman’s rank correlation coefficient measures the correlation between two sets of ranks.
Qualities like beauty, intelligence, sincerity etc. usually cannot be measured directly. Instead, they can be given ranks. In such cases we can use Spearman’s rank correlation coefficient to measure their degree of relationship.
The formula for Spearman’s rank correlation coefficient is given by

$$R = 1 - \frac{6\sum d^2}{n(n^2-1)}$$

where d = difference between the ranks and
n = no. of pairs of observations
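
A minimal Python sketch of this formula is shown below. The function name spearman_r and the two sets of ranks (imagined as ranks given by two judges to five candidates) are hypothetical and used only to illustrate the calculation; the formula assumes there are no tied ranks.

```python
def spearman_r(ranks_1, ranks_2):
    """Spearman's rank correlation R = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(ranks_1) != len(ranks_2):
        raise ValueError("both sets of ranks must have the same length")
    n = len(ranks_1)
    # d is the difference between the two ranks of each observation.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks given by two judges to five candidates.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
# d = -1, 1, -1, 1, 0, so sum of d^2 = 4 and R = 1 - 24/120 = 0.8
R = spearman_r(judge_1, judge_2)
```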

SPEARMAN’S RANK CORRELATION COEFFICIENT FOR REPEATED RANKS / TIED RANKS

When there is a tie in the ranks, the formula for calculating Spearman’s rank correlation coefficient is given by

$$R = 1 - \frac{6\left[\sum d^2 + \sum \frac{m^3-m}{12}\right]}{n(n^2-1)}$$

where d = difference between the ranks,
m = no. of times each rank repeats (one term is added for every repeated rank) and
n = no. of pairs of observations
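
The tied-ranks calculation can be sketched in Python as follows. The notes do not say how tied observations are ranked, so this sketch assumes the usual convention that tied values share the average of the ranks they would otherwise occupy, with rank 1 given to the largest value. The function names and the sample marks are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values with 1 = largest; tied values share their average rank."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that repeats m > 1 times."""
    counts = Counter(values)
    return sum((m ** 3 - m) / 12 for m in counts.values() if m > 1)


def spearman_r_tied(x, y):
    """Spearman's R with the correction term for repeated ranks."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


# Hypothetical marks in two subjects; 60 occurs twice in the first series.
marks_1 = [50, 60, 60, 70]
marks_2 = [40, 55, 50, 65]
# Ranks: [4, 2.5, 2.5, 1] and [4, 2, 3, 1]; sum of d^2 = 0.5,
# correction = (2^3 - 2)/12 = 0.5, so R = 1 - 6(1.0)/60 = 0.9
R = spearman_r_tied(marks_1, marks_2)
```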
