Correlation Analysis

The term univariate analysis refers to the analysis of one variable. You can remember this because the prefix "uni" means "one." Examples: summary statistics, frequency diagrams.

Bivariate analysis is a statistical analysis in which two variables are observed. Examples: correlation between two variables, cross tabs, regression between two variables.

The term multivariate analysis refers to the analysis of more than one variable. Examples: correlation among more than two variables, multiple regression models.

A bivariate data set may reveal some kind of association between the two variables x and y, and we may be interested in numerically measuring the direction and degree of strength of this association. Such a measurement can be performed with correlation. Correlation measures the linear association between the variables; by association we mean both the direction and the intensity of the relationship.

The direction of the relationship is of two types. When the movements of the two variables are in the same direction, there is said to be positive or direct correlation between the variables, e.g. the relationship between supply and price, or between consumption and income. When the movements of the two variables are in opposite directions, the correlation between them is said to be negative or inverse. For example, investment is likely to be negatively correlated with the rate of interest, as is demand with price.

Measures of Correlation

We use the following methods to measure simple correlation between two variables:
1) Scatter Diagram
2) Karl Pearson's Coefficient of Correlation
3) Coefficient of Rank Correlation

Scatter Diagram

A scatter diagram is used to visualise the relationship between two variables. To draw one, we take one variable along the X axis and the other along the Y axis. The way in which the points on the scatter diagram lie indicates the nature of the relationship between the two variables. From a scatter diagram, however, we do not get any numerical measurement of correlation.

Karl Pearson's Correlation Coefficient or Product Moment Correlation

Although a scatter diagram provides a pictorial understanding of the relationship between two variables, it fails to provide any numerical measure of that relationship. The Pearsonian product moment correlation coefficient is the most commonly used measure of correlation, and it gives a numerical value for the extent of association between two variables. It is used when the data set is quantitative. It is symbolically represented by r, and its formula is

r = Σ(x − x̄)(y − ȳ) / √[Σ(x − x̄)² · Σ(y − ȳ)²]
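As an illustration, Pearson's r can be computed directly from the deviation form of the formula. This is a minimal sketch in plain Python; the paired data values are made up for illustration.

```python
def pearson_r(x, y):
    """Pearson's product moment correlation coefficient:
    r = sum((x - x_bar)(y - y_bar)) / sqrt(sum((x - x_bar)^2) * sum((y - y_bar)^2))
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical paired observations:
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(pearson_r(x, y))  # 0.8 — a high positive (direct) correlation
```

Note that r comes out as a pure number: the units of x and y cancel in the ratio.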
Degrees of correlation

The degree of intensity of the relationship between two variables is measured with the coefficient of correlation.

1. Perfect correlation: If there is a perfect positive relationship between the two variables, the coefficient of correlation is r = +1. If there is a perfect negative relationship, r = −1.

2. Zero correlation (r = 0): If there is no linear relationship between the two variables, the value of the correlation is zero: a change in the value of one variable is not accompanied by a systematic linear change in the other. However, r = 0 does not imply that the two variables are independent; it only indicates the non-existence of a linear relation between them.

3. Limited degree of correlation: A limited degree of correlation exists between perfect correlation and zero correlation, i.e. the value of the coefficient of correlation lies between +1 and −1. This limited degree of correlation may be high, moderate or low.
• High degree of correlation: the correlation of the two series of data is close to one.
• Medium degree of correlation: the correlation of the two series of data is neither large nor small.
• Low degree of correlation: the correlation of the two series of data is small, i.e. close to 0.
Properties of the correlation coefficient

1. Correlation r has no unit; it is a pure number. This means the units of measurement are not part of r.

2. A negative value of r indicates an inverse relation: a change in one variable is associated with a change in the other variable in the opposite direction. If r is positive, the two variables move in the same direction.

3. The value of the correlation coefficient lies between minus one and plus one (−1 ≤ r ≤ 1). If r = 0, the two variables are uncorrelated: there is no linear relation between them, although other types of relation may be present. If r = 1 or r = −1, the correlation is perfect and the relation between them is exact. A low value of r, i.e. a value close to zero, indicates a weak linear relation.

4. The magnitude of r is unaffected by a change of origin and a change of scale. This means that if u and v are two new variables defined as

u = (x − c)/h,  v = (y − d)/k

where c, d, h and k are arbitrary constants (h, k ≠ 0), then the correlation coefficient between u and v (r_uv) is the same as the correlation coefficient between x and y (r_xy), i.e. r_uv = r_xy, provided h and k have the same sign.

Limitations of Simple Correlation

Simple correlation analysis deals with two variables only, and it explores the extent of the linear relationship between them (whether x and y are linearly related). When a third variable affects both, simple correlation analysis may not give the true nature of the association between the two variables. Ideally, one should remove the effect of the third variable on the first two and then measure the strength of association between them; but this is not possible under simple correlation analysis. In such situations, we use partial and multiple correlation.

Further, in simple correlation analysis we assume a linear relationship between the two variables, but a non-linear relationship may exist between them. In that case, the simple correlation measure fails to capture the association.

Coefficient of Rank Correlation - Spearman

Karl Pearson's product moment correlation coefficient cannot be used in cases where direct quantitative measurement of the variables is not possible (for example, honesty, efficiency, intelligence, etc.). Spearman's rank-order correlation is the nonparametric version of the Pearson product moment correlation. Spearman's correlation coefficient, ρ (rho), measures the strength and direction of the association between two ranked variables.

When ranks are not repeated:

ρ = 1 − 6ΣD² / [n(n² − 1)]

where D is the difference between the ranks of an individual and n is the number of pairs ranked.

When ranks are repeated:

ρ = 1 − 6(ΣD² + CF) / [n(n² − 1)]

where the correction factor is CF = Σ m(m² − 1)/12, m being the number of times a rank value is repeated.
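The two rank-correlation formulas can be sketched in Python. The average-rank convention for ties and the data values below are illustrative assumptions.

```python
from collections import Counter

def avg_ranks(values):
    """Assign ranks 1..n (smallest value gets rank 1); tied values
    share the average of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2          # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def tie_correction(values):
    """CF = sum of m(m^2 - 1)/12 over every group of m tied values."""
    return sum(m * (m * m - 1) for m in Counter(values).values() if m > 1) / 12

def spearman_rho(x, y):
    """rho = 1 - 6(sum D^2 + CF) / (n(n^2 - 1)); CF is 0 when no ranks repeat."""
    rx, ry = avg_ranks(x), avg_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    cf = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (d2 + cf) / (n * (n * n - 1))

# No repeated ranks: perfectly opposite orderings give rho = -1
print(spearman_rho([1, 2, 3], [3, 2, 1]))            # -1.0
# Repeated ranks: the two tied x-values contribute CF = 2(2^2 - 1)/12 = 0.5
print(spearman_rho([10, 20, 20, 30], [1, 2, 3, 4]))  # 0.9
```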
Applications of Spearman's Rank correlation coefficient

Ex 1: Students in a department are rated on a qualitative attribute by two faculty members. This analysis helps the institution assess the consistency of the ratings given by the two teachers to a group of students. If the rankings given by both observers are similar, Spearman's rank correlation will be positive; otherwise it will be negative. If it is positive, the organisation can put more faith in the ratings than if the observers' rankings vary widely from one to the other. This also reduces the chance of biased or unethical ranking systems.

Partial Correlation Coefficient

Partial correlation is the correlation between two variables after removing the linear effects of other variables on them. Let us consider the case of three variables X1, X2 and X3. Sometimes the correlation between two variables X1 and X2 may be partly due to the correlation of a third variable X3 with both X1 and X2. In this type of situation one may be interested in studying the correlation between X1 and X2 when the effect of X3 on each of X1 and X2 is eliminated. This correlation is known as partial correlation, and the correlation coefficient between X1 and X2 after eliminating the linear effect of X3 on X1 and X2 is called the partial correlation coefficient.

r12.3 is the correlation between X1 and X2 after eliminating the linear effect of X3 on X1 and X2.
r13.2 is the correlation between X1 and X3 after eliminating the linear effect of X2 on X1 and X3.
r23.1 is the correlation between X2 and X3 after eliminating the linear effect of X1 on X2 and X3.
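The first-order partial correlation coefficients can be obtained from the three simple correlation coefficients. The sketch below uses the standard formula r12.3 = (r12 − r13·r23) / √[(1 − r13²)(1 − r23²)], which is not reproduced in the text above; the numerical values are made up for illustration.

```python
import math

def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation r_ab.c: the correlation of a and b
    after removing the linear effect of c on each of them."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# Hypothetical simple correlations among X1, X2, X3:
r12, r13, r23 = 0.8, 0.6, 0.5

r12_3 = partial_corr(r12, r13, r23)   # effect of X3 removed from X1 and X2
r13_2 = partial_corr(r13, r12, r23)   # effect of X2 removed from X1 and X3
r23_1 = partial_corr(r23, r12, r13)   # effect of X1 removed from X2 and X3
print(round(r12_3, 4))  # 0.7217 — lower than r12 once X3's influence is removed
```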