
AAE-223: Statistics for Economists 2

Lecture Notes 3

Assa Mulagha-Maganga
Dept of Agricultural and Applied Economics, LUANAR
Department of Mathematical Sciences (Statistics), Chancellor College

Summer 2022

3 Theory of Correlations

3.1 Theory of Correlations


The various statistical techniques demonstrated in the previous chapters have dealt with analyzing data on only one variable. In practice, however, work in economics, agribusiness, health and other fields of study is frequently concerned with analyzing two or more variables; it is therefore crucially important to make statistical inferences about the degree and direction of association between variables. Correlation analysis helps us to quantify the relationship, assess the validity and reliability of the co-variation or association between two or more random variables, and make decisions about the nature of the paired variables; it may even lead us to identify possible cases of causality (Edriss, 2012).
In this lecture, we consider the degree of relationship between variables: we seek to determine how well a linear or other equation describes or explains the relationship between them. If all values of the variables satisfy an equation exactly (lie on the line of best fit), we say that the variables are perfectly correlated or that there is perfect correlation between them. Thus, the circumferences C and radii r of all circles are perfectly correlated since C = 2πr. If two dice are tossed simultaneously 100 times, there is no relationship between the corresponding points on each die (unless the dice are loaded); that is, they are uncorrelated. Variables such as the height and weight of individuals would show some correlation.
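The three situations just described can be illustrated with a short R sketch (the data below are simulated for illustration and are not from the text):

# Perfect correlation: circumference is an exact linear function of radius
radius <- 1:10
circumference <- 2 * pi * radius
cor(radius, circumference)               # exactly 1

# No correlation: two fair dice tossed 100 times
set.seed(1)                              # for reproducibility
die1 <- sample(1:6, 100, replace = TRUE)
die2 <- sample(1:6, 100, replace = TRUE)
cor(die1, die2)                          # close to 0

# Some correlation: simulated heights (cm) and weights (kg)
height <- rnorm(100, mean = 170, sd = 8)
weight <- 0.5 * height - 20 + rnorm(100, sd = 6)
cor(height, weight)                      # between 0 and 1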

3.2 Definition of correlation


The correlation coefficient is a measure of association between two variables, and it ranges between −1 and 1. If the two variables are in a perfect linear relationship, the correlation coefficient will be either 1 or −1. The sign depends on whether the variables are positively or negatively related. The correlation coefficient is 0 if there is no linear relationship between the variables. Two different types of correlation coefficients are in common use: the Pearson product moment correlation coefficient and the Spearman rank correlation coefficient, which is based on the rank relationship between the variables.
One visual way to determine whether there is correlation between variables is to use a scatter plot. Scatter plots are similar to line graphs in that they use horizontal and vertical axes to plot data points. However, they have a very specific purpose: they show how two variables vary together, and this relationship between the two variables is called their correlation.
Scatter plots usually consist of a large body of data. The closer the plotted data points come to forming a straight line, the higher the correlation between the two variables, or the stronger the relationship. If the data points form a line running from the origin out to high x- and y-values, the variables are said to have a positive correlation. If the line runs from a high value on the y-axis down to a high value on the x-axis, the variables have a negative correlation.
It must be emphasized that in every case the computed value of r measures the degree of relationship relative to the type of equation that is actually assumed. Thus, if a linear equation is assumed and a correlation coefficient near zero is obtained, it means that there is almost no linear correlation between the variables. It does not mean that there is no correlation at all, since there may actually be a strong nonlinear correlation between them. In other words, the correlation coefficient measures the goodness of fit between (1) the equation actually assumed and (2) the data. Unless otherwise specified, the term correlation coefficient is used to mean the linear correlation coefficient. It should also be pointed out that a high correlation coefficient (i.e., near 1 or −1) does not necessarily indicate a direct dependence between the variables. The correlation coefficient is scale free, so its interpretation does not depend on the units of measurement of the two variables, say x and y. In this lecture, the following methods of finding the correlation coefficient between two variables x and y are discussed:

1. Spearman’s Rank Correlation method

2. Karl Pearson’s Coefficient of Correlation method

3.3 Correlation and Causation


If there is a strong relationship (say, r = 0.91) between two variables, we are tempted to assume that an increase or decrease in one variable causes a change in the other variable. For example, it can be shown that the consumption of Malawian peanuts and the consumption of quinine have a strong correlation. However, this does not indicate that an increase in the consumption of peanuts caused the consumption of quinine to increase. Likewise, the incomes of professors and the number of inmates at Zomba Mental Hospital have increased proportionately. Further, as the population of donkeys has decreased, there has been an increase in the number of doctoral degrees granted. Relationships such as these are called nonsense or spurious correlations. What we can conclude when we find two variables with a strong correlation is that there is a relationship or association between the two variables, not that a change in one causes a change in the other.

3.4 Methods of Correlation Analysis


3.4.1 The graphical approach

The scatter diagram method is a quick, at-a-glance way of determining whether there is an apparent relationship between two variables. A scatter diagram (or graph) is obtained by plotting observed (or known) pairs of values of the variables x and y on graph paper, taking the independent variable on the x-axis and the dependent variable on the y-axis. It is common to try to draw a straight line through the data points so that an equal number of points lie on either side of the line; this straight line summarizes the relationship between x and y described by the data points. The pattern of the data points in the diagram indicates whether, and how, the variables are related, as the following figures illustrate.
[Figure 1: Negative linear relationship (scatter plot of mpg against wt)]
[Figure 2: Positive linear relationship (scatter plot of mpg against drat)]
[Figure 3: No linear relationship (scatter plot of mpg against qsec)]
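The axis labels in these figures (mpg, wt, drat, qsec) suggest they were drawn from R's built-in mtcars dataset; assuming that is the case, a sketch that reproduces the three scatter plots is:

# Scatter plots of fuel efficiency (mpg) against three other mtcars variables
plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg",
     main = "Negative linear relationship")    # mpg falls as vehicle weight rises
plot(mtcars$drat, mtcars$mpg, xlab = "drat", ylab = "mpg",
     main = "Positive linear relationship")    # mpg rises with rear axle ratio
plot(mtcars$qsec, mtcars$mpg, xlab = "qsec", ylab = "mpg",
     main = "No clear linear relationship")    # no obvious straight-line pattern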

3.4.2 The numerical approach

a. Spearman rank correlation coefficient

Spearman's rank correlation coefficient is used to identify and test the strength of a relationship between two sets of data. It is often used as a statistical method to help support or reject a hypothesis, e.g. that the depth of a river does not progressively increase with distance from the river bank. The formula used to calculate Spearman's rank correlation coefficient is shown below.




r = 1 - \frac{6 \sum d^2}{n^3 - n}

where d is the difference between the two ranks of each observation and n is the number of pairs of observations.

How can the calculation be carried out in Excel?

Once the data has been collected, Excel can be used to calculate and graph Spearman’s
Rank correlation to discover if a relationship exists between the two sets of data, and how
strong this relationship is. Please note this example uses a dataset of 10 samples, but your
dataset should include a minimum of 15 to be valid.

Step 1: Create a table in Excel and enter your data sets.

Sample   Width (cm)   Width (Rank)   Depth (cm)   Depth (Rank)
1           0                           0
2          50                          10
3         150                          28
4         200                          42
5         250                          59
6         300                          51
7         350                          73
8         400                          85
9         450                         104
10        500                          96

Step 2: Rank each set of data (width rank and depth rank).

Rank 1 is given to the largest number in column 2 (Width). Continue ranking until all widths have been ranked, then do exactly the same for depth.
Sample   Width (cm)   Width (Rank)   Depth (cm)   Depth (Rank)
1           0              10            0             10
2          50               9           10              9
3         150               8           28              8
4         200               7           42              7
5         250               6           59              5
6         300               5           51              6
7         350               4           73              4
8         400               3           85              3
9         450               2          104              1
10        500               1           96              2
If two or more samples have the same value, the mean (average) rank should be used. For example, if there were 3 samples all with the same depth, ranked 6th, 7th and 8th in order, you would add the rank values together (6 + 7 + 8 = 21) and then divide this by the number of samples with the same depth, in this case 3 (21/3 = 7), so they would all receive a rank of 7. The next depth in order would then be given a rank of 9.
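As an aside, R's rank() function applies exactly this averaging rule for ties by default; a small sketch with made-up depth values:

depths <- c(12, 30, 30, 30, 45)   # three tied values
rank(-depths)                     # rank 1 = largest, ties share the average rank
## [1] 5 3 3 3 1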

Step 3: The next stage is to find d (the difference in rank between the width and depth). First,
add a new column to your table, and then calculate d by subtracting the depth rank column
(column 5) from the width rank column (column 3). For example, for sample 6 width rank
is 5 and the depth rank is 6 so d = 5 − 6 = −1. To calculate d in Excel, select the cell you
wish to enter the information into and type =. Now click on the width rank cell you want to
use and type -. Finally, click on the depth rank cell and press enter. The value of d should
appear in the first box you selected.
Sample   Width (cm)   Width (Rank)   Depth (cm)   Depth (Rank)    d
1           0              10            0             10         0
2          50               9           10              9         0
3         150               8           28              8         0
4         200               7           42              7         0
5         250               6           59              5         1
6         300               5           51              6        -1
7         350               4           73              4         0
8         400               3           85              3         0
9         450               2          104              1         1
10        500               1           96              2        -1

Step 4: The next step is to calculate d². Add another column to your table and label it d². To calculate d², type =POWER(number, power) into the first cell. In this case the number is the value of d and the power is 2, as we are squaring the value; e.g. for sample 6 the value of d is -1, so you would enter =POWER(-1,2) into the cell, press enter, and the value you should get is 1.
Sample   Width (cm)   Width (Rank)   Depth (cm)   Depth (Rank)    d    d²
1           0              10            0             10         0     0
2          50               9           10              9         0     0
3         150               8           28              8         0     0
4         200               7           42              7         0     0
5         250               6           59              5         1     1
6         300               5           51              6        -1     1
7         350               4           73              4         0     0
8         400               3           85              3         0     0
9         450               2          104              1         1     1
10        500               1           96              2        -1     1


Repeat the same process until all of your samples have a value of d². Once all the d² values have been calculated, add them together to obtain Σd². The quickest way to do this in Excel is to click on the cell underneath your last entry in the d² column, click on the AutoSum symbol Σ (which you can find on the toolbar at the top of the page), and press enter. (Depending on which version of Excel you are using, you may have to select the column you wish to add together before you press enter.)

Step 5: Now we have the d² values, but to complete the equation we still need to calculate n³ − n. Here n is the number of samples, so in this case n = 10. As in step 4, type =POWER(number,power) into the cell you wish to use, which will give you a value for n³. Remember that this time 'number' is the number of samples and 'power' is 3, as you are cubing rather than squaring the value. Once n³ has been calculated, subtract the value of n from it.
Sample   Width (cm)   Width (Rank)   Depth (cm)   Depth (Rank)    d    d²
1           0              10            0             10         0     0
2          50               9           10              9         0     0
3         150               8           28              8         0     0
4         200               7           42              7         0     0
5         250               6           59              5         1     1
6         300               5           51              6        -1     1
7         350               4           73              4         0     0
8         400               3           85              3         0     0
9         450               2          104              1         1     1
10        500               1           96              2        -1     1

Σd²        4
n         10
n³      1000
n³ − n   990

Step 6: All that is left to do now is to insert the values into the equation to calculate r:

r = 1 - \frac{6 \sum d^2}{n^3 - n}

The formula you would enter into an Excel cell in this case is =1-((6*4)/990) (N.B. * is the symbol for multiplication).
End result: the result should always lie between −1 and +1. In this case the value is 0.9758 to 4 decimal places, or 0.98 to 2 decimal places, indicating a very strong positive relationship between width and depth.
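The same calculation can be checked in R. A sketch using the width and depth values from the table above (because there are no tied values, cor() with method = "spearman" agrees with the manual formula):

width <- c(0, 50, 150, 200, 250, 300, 350, 400, 450, 500)
depth <- c(0, 10, 28, 42, 59, 51, 73, 85, 104, 96)

d <- rank(-width) - rank(-depth)            # differences in ranks (rank 1 = largest)
n <- length(width)
1 - (6 * sum(d^2)) / (n^3 - n)              # manual formula: 0.9757576
cor(width, depth, method = "spearman")      # same result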


Implementation of Spearman rank correlation in R

R provides two functions for calculating the correlation coefficient: cor() and cor.test(). cor() computes the correlation coefficient only, whereas cor.test() carries out a test of association between paired samples and returns both the correlation coefficient and the significance level (p-value) of the correlation.

Syntax: cor(x, y, method = "spearman")

Example: take two numeric variables x and y in a data frame called df.

df <- data.frame(x = c(15, 18, 21, 15, 21),
                 y = c(25, 25, 27, 27, 27))

# Calculate the Spearman rank correlation coefficient
result <- cor(df$x, df$y, method = "spearman")
cat("Spearman correlation coefficient is:", result)
## Spearman correlation coefficient is: 0.4564355
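Since cor.test() is mentioned above, a brief sketch of its use on the same data (with tied values R will warn that an exact p-value cannot be computed for the Spearman test):

cor.test(df$x, df$y, method = "spearman")
# Returns the estimated rho, the test statistic S and the p-value for
# the null hypothesis that there is no association between x and y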

b. Karl Pearson product moment correlation coefficient

In statistics, the Pearson correlation coefficient is a tool for determining whether or not there is a linear relationship between two variables. It quantifies both the strength and the direction of the relationship. A correlation exists when two variables are measured and a change in one is accompanied by a change in the other, whether in the same or the opposite direction. There are other correlation measures, such as Kendall's rank correlation, but those measure different types of association and are not direct substitutes for the Pearson correlation coefficient.
Written out as an equation, the Pearson correlation measurement can look fairly complicated. By definition, the Pearson product-moment correlation is the covariance of the two variables divided by the product of their standard deviations. The equation looks like this:

r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}

The formula can be written in the equivalent form

r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{[\, n \sum x^2 - (\sum x)^2 \,][\, n \sum y^2 - (\sum y)^2 \,]}}


Assumptions of Using Pearson's Correlation Coefficient

i. Pearson's correlation coefficient is appropriate to calculate when both variables x and y are measured on an interval or a ratio scale.

ii. Both variables x and y are normally distributed, and there is a linear relationship between them (a rough way to check this in R is sketched after this list).

iii. The correlation coefficient is strongly affected by truncation of the range of values in one or both of the variables. This occurs when the distributions of the variables deviate greatly from the normal shape.

iv. There is a cause-and-effect relationship between the two variables that influences the distributions of both; otherwise the correlation coefficient might be extremely low or even zero.
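A minimal sketch of how one might informally check the normality and linearity assumptions in R before computing Pearson's coefficient, using the same hypothetical data frame as in the R examples of this lecture:

df <- data.frame(x = c(15, 18, 21, 15, 21),
                 y = c(25, 25, 27, 27, 27))

shapiro.test(df$x)   # Shapiro-Wilk test of normality for x
shapiro.test(df$y)   # and for y; p > 0.05 gives no strong evidence against normality
plot(df$x, df$y)     # visual check that the relationship looks roughly linear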

Advantages and Disadvantages of Pearson's Correlation Coefficient

The correlation coefficient is a number between −1 and 1 that summarizes both the magnitude and the direction (positive or negative) of the association between two variables. The chief limitations of Pearson's method are:

i. The correlation coefficient always assumes a linear relationship between the two variables, whether or not this is true.

ii. Great care must be exercised in interpreting the value of this coefficient, as it is very often misinterpreted.

iii. The value of the coefficient is unduly affected by extreme values of the two variables.

iv. Compared with other methods, the computation required to calculate r using Pearson's method is lengthy.

Example

Find the coefficient of linear correlation between the variables X (number of sales calls) and Y (number of laptops sold) presented in the table below.
X:  1  3  4  6  8  9  11  14
Y:  1  2  4  4  5  7   8   9

Solution

The work involved in the computation can be organized as in the table below.


r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}
  = \frac{cov(x, y)}{\sqrt{var(x)\,var(y)}}
  = \frac{84}{\sqrt{(132)(56)}} = 0.977

This shows that there is a very high linear correlation between the variables.

X    Y    x − x̄   y − ȳ   (x − x̄)²   (x − x̄)(y − ȳ)   (y − ȳ)²
1    1     −6      −4        36            24             16
3    2     −4      −3        16            12              9
4    4     −3      −1         9             3              1
6    4     −1      −1         1             1              1
8    5      1       0         1             0              0
9    7      2       2         4             4              4
11   8      4       3        16            12              9
14   9      7       4        49            28             16

Σx = 56 (x̄ = 56/8 = 7), Σy = 40 (ȳ = 40/8 = 5), Σ(x − x̄)² = 132, Σ(x − x̄)(y − ȳ) = 84, Σ(y − ȳ)² = 56
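As a quick check of the worked example, the same coefficient can be computed directly in R:

X <- c(1, 3, 4, 6, 8, 9, 11, 14)
Y <- c(1, 2, 4, 4, 5, 7, 8, 9)
cor(X, Y, method = "pearson")   # approximately 0.977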

Implementation of Pearson correlation in R

R provides the same two functions, cor() and cor.test(), for the Pearson correlation coefficient. cor() computes the correlation coefficient only, whereas cor.test() carries out the test of association between paired samples and returns both the correlation coefficient and the significance level (p-value) of the correlation.

Syntax: cor(x, y, method = "pearson")

Example: take two numeric variables x and y in a data frame called df.

df <- data.frame(x = c(15, 18, 21, 15, 21),
                 y = c(25, 25, 27, 27, 27))

# Calculate the Pearson correlation coefficient
result <- cor(df$x, df$y, method = "pearson")
cat("Pearson correlation coefficient is:", result)
## Pearson correlation coefficient is: 0.4564355
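For completeness, a sketch of cor.test() with the Pearson method on the same data; unlike the Spearman version it also reports a t statistic and a 95% confidence interval for the population correlation:

cor.test(df$x, df$y, method = "pearson")
# Returns the estimate r, a t statistic on n - 2 degrees of freedom,
# the p-value, and a 95% confidence interval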


3.5 Probable Error and Standard Error of the Coefficient of Correlation

The probable error (PE) of the coefficient of correlation indicates the extent to which its value depends on random sampling. If r is the calculated value of the correlation coefficient in a sample of n pairs of observations, then the standard error SE_r of the correlation coefficient r is given by

SE_r = \frac{1 - r^2}{\sqrt{n}}

The probable error of the coefficient of correlation is calculated by the expression:

PE_r = 0.6745 \, SE_r = 0.6745 \, \frac{1 - r^2}{\sqrt{n}}

Thus with the help of PE_r we can determine the range within which the population coefficient of correlation is expected to fall, using the formula ρ = r ± PE_r, where ρ (rho) denotes the population coefficient of correlation.

1. If r < PE_r, then the value of r is not significant; that is, there is no evidence of a relationship between the two variables of interest.

2. If r > 6 PE_r, then the value of r is significant; that is, there is evidence of a relationship between the two variables.

Example 3.1

If the covariance of 10 pairs of items is 7, the variance of x is 36, and Σ(y − ȳ)² = 90, find the correlation coefficient r.

Solution 3.1

We know that the correlation coefficient r is given by:

r = \frac{Cov(x, y)}{\sigma_x \sigma_y}

Given Cov(x, y) = 7, n = 10, σ_x² = 36 and Σ(y − ȳ)² = 90. Since Var(x) = σ_x² = 36, the standard deviation is σ_x = 6. In addition,

\sigma_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}} = \sqrt{\frac{90}{10}} = 3
n 10


The correlation coefficient r is then:

r = \frac{Cov(x, y)}{\sigma_x \sigma_y} = \frac{7}{6 \times 3} = 0.39

Example 3.2

Calculate Karl Pearson's coefficient of correlation from the following data and interpret your result:
σx = 10, σy = 12, x̄ = 25, and ȳ = 35.
The sum of the products of deviations from the actual arithmetic means of the two series is 24, and the number of observations is 20.

Solution 3.2

Given σx = 10, σy = 12, x̄ = 25, ȳ = 35, Σ(x − x̄)(y − ȳ) = 24 and n = 20. Then

Cov(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n} = \frac{24}{20} = 1.2

We know that

r = \frac{Cov(x, y)}{\sigma_x \sigma_y} = \frac{1.2}{10 \times 12} = 0.01

Since the magnitude of r is very small, the correlation between x and y is negligible.

Example 3.3

If r = 0.97 and n = 8, find the probable error of the coefficient of correlation and determine the limits for the population correlation ρ.

Solution 3.3

Given r = 0.97 and n = 8. Then

PE_r = 0.6745 \, \frac{1 - r^2}{\sqrt{n}} = 0.6745 \times \frac{1 - 0.97^2}{\sqrt{8}} = \frac{0.6745 \times 0.0591}{2.828} = 0.014

Limits of the population correlation: ρ = r ± PE_r = 0.97 ± 0.014, that is, from 0.956 to 0.984.
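A short R sketch of the probable error formulas from this section, used to verify Example 3.3 (the helper function pe_r is introduced here purely for illustration):

pe_r <- function(r, n) 0.6745 * (1 - r^2) / sqrt(n)   # PE = 0.6745 (1 - r^2) / sqrt(n)

r <- 0.97
n <- 8
pe <- pe_r(r, n)
pe                    # approximately 0.014
c(r - pe, r + pe)     # limits for the population correlation: about 0.956 to 0.984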
