Unit 4

The document outlines Unit 4 of Mathematics III at Silver Oak University, focusing on Correlation and Regression, covering key concepts such as types of correlation, correlation coefficients, and methods of studying correlation. It explains various types of correlation including positive, negative, simple, multiple, partial, and total correlation, along with graphical methods like scatter diagrams and mathematical models like Karl Pearson's coefficient. Additionally, it provides examples and formulas for calculating correlation coefficients between variables.

Silver Oak University- Degree Engineering

Aditya Silver Oak Institute of Technology


Sem: 3rd
Sub: Mathematics III
Faculty: Asst. Prof. Poonam Kumari-ASOIT
Unit 4: Correlation and Regression
Weightage: 18%

Topics:

• Correlation
• Properties of correlation
• Types of correlation
• Correlation coefficient
• Rank correlation
• Regression
• Regression coefficient
• Properties of regression coefficient
• Expressions of regression coefficient

1. Correlation

Correlation is the relationship between two or more variables. Two variables are said to be
correlated if a change in one variable is accompanied by a change in the other. Data that
pair the values of two variables are called bivariate data. Correlation analysis is thus a
statistical technique that measures the degree to which two variables fluctuate with reference
to each other. Examples include the relation between the price and demand of a commodity, and
the relation between rainfall and the yield of crops.

2. Types of correlation

Correlation is classified into four types:

1. Positive and negative correlation
2. Simple and multiple correlation
3. Partial and total correlation
4. Linear and nonlinear correlation

Silver Oak University-CE/IT Degree Engineering-3rd Sem-Maths IV-Unit 4-Correlation and Regression-Dr. Moksha Satia Page | 1
1. Positive and negative correlation

Depending on the direction in which the variables vary, correlation may be positive or
negative.

i. Positive correlation

If both the variables vary in the same direction, the correlation is said to be positive.
In other words, if the value of one variable increases, the value of the other variable also
increases, or if the value of one variable decreases, the value of the other variable also
decreases, e.g., the correlation between the heights and weights of a group of persons is a
positive correlation.

Height (cm)   150   152   155   160   162   165
Weight (kg)    60    62    64    65    67    69

ii. Negative correlation

If the variables vary in opposite directions, the correlation is said to be
negative. In other words, if the value of one variable increases, the value of the other variable
decreases, or if the value of one variable decreases, the value of the other variable increases,
e.g., the correlation between the price and demand of a commodity is a negative correlation.

Price (₹ per unit)    10     8     6     5     4     1
Demand (units)       100   200   300   400   500   600

2. Simple and multiple correlations

Depending on the number of variables studied, correlation may be simple or multiple.

i. Simple correlation

When only two variables are studied, the relation is described as simple correlation,
e.g., the quantity of money and price level, demand and price, etc.

ii. Multiple correlation

When more than two variables are studied, the relationship is described as multiple
correlation, e.g., relationship of price, demand, and supply of a commodity.
3. Partial and total correlation

Multiple correlation may be either partial or total.

i. Partial correlation

When more than two variables are involved but the relationship is studied with some of
the variables excluded (their influence held constant), the relationship is termed
partial correlation.

ii. Total correlation

When more than two variables are studied without excluding any of them, the
relationship is termed total correlation.

4. Linear and nonlinear correlations

Depending upon the ratio of change between two variables, the correlation may be linear or
nonlinear.

i. Linear correlation

If the ratio of change between two variables is constant, the correlation is said to be
linear. If such variables are plotted on a graph paper, a straight line is obtained, e.g.,

Milk (l)     5   10   15   20   25   30
Curd (kg)    2    4    6    8   10   12

ii. Nonlinear correlation

If the ratio of change between two variables is not constant, the correlation is said to
be nonlinear. The graph of a nonlinear or curvilinear relationship will be a curve, e.g.,

Advertising expenses (₹ in lacs)    3    6    9   12   15
Sales (₹ in lacs)                  10   12   15   15   16
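
The "ratio of change" test described above can be checked numerically. The following is a minimal sketch, assuming the ratio of change means (change in y) / (change in x) between consecutive observations; the helper names `change_ratios` and `is_linear` are introduced here for illustration only. Exact fractions are used so that equal ratios compare as equal.

```python
from fractions import Fraction

def change_ratios(xs, ys):
    """Ratio (change in y) / (change in x) for each pair of consecutive points."""
    points = list(zip(xs, ys))
    return [
        Fraction(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

def is_linear(xs, ys):
    """True when every consecutive ratio of change is the same."""
    return len(set(change_ratios(xs, ys))) == 1

# Data taken from the two tables above.
milk = [5, 10, 15, 20, 25, 30]
curd = [2, 4, 6, 8, 10, 12]
advertising = [3, 6, 9, 12, 15]
sales = [10, 12, 15, 15, 16]

print(is_linear(milk, curd))          # True: every ratio is 2/5
print(is_linear(advertising, sales))  # False: ratios are 2/3, 1, 0, 1/3
```

Real data rarely give exactly equal ratios, so in practice a scatter diagram or the correlation coefficient is used to judge how close to linear a relationship is.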

3. Methods of studying correlation

There are two different methods of studying correlation:

• Graphic methods: scatter diagram, simple graph
• Mathematical models: Karl Pearson's coefficient of correlation, Spearman's rank coefficient of correlation

4. Scatter diagram

The scatter diagram is a diagrammatic representation of bivariate data used to judge the
correlation between two variables. The possible relationships between two variables are
represented by the following scatter diagrams.

1. Perfect positive correlation

If all the plotted points lie on a straight line rising from the lower left-hand corner to the
upper right-hand corner, the correlation is said to be perfectly positive.

2. Perfect negative correlation

If all the plotted points lie on a straight line falling from the upper left-hand corner to the
lower right-hand corner, the correlation is said to be perfectly negative.

3. High degree of positive correlation

If all the plotted points lie in a narrow strip rising from the lower left-hand corner to the
upper right-hand corner, it indicates a high degree of positive correlation.

4. High degree of negative correlation

If all the plotted points lie in a narrow strip falling from the upper left-hand corner to
the lower right-hand corner, it indicates a high degree of negative correlation.

5. No correlation

If all the plotted points lie on a straight line parallel to the x-axis or y-axis or in a haphazard
manner, it indicates the absence of any relationship between the variables.

Merits of a scatter diagram

1. It is a simple, nonmathematical method of finding out the correlation between the
variables.
2. It gives an indication of the degree of linear correlation between the variables.
3. It is easy to understand.
4. It is not influenced by the size of extreme items.

5. Simple graph

A simple graph is a diagrammatic representation of bivariate data used to judge the correlation
between two variables. The values of the two variables are plotted on graph paper, giving two
curves, one for the variable x and the other for the variable y. If both curves
move in the same direction, the correlation is said to be positive. If the curves move in
opposite directions, the correlation is said to be negative. This method is used in the case of a
time series.

6. Karl Pearson’s coefficient of correlation

The coefficient of correlation is a measure of the correlation between two random variables X
and Y, and is denoted by r:

$$ r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} $$

where $\operatorname{cov}(X, Y)$ is the covariance of X and Y, $\sigma_X$ is the standard deviation of
X, and $\sigma_Y$ is the standard deviation of Y.

This expression is known as Karl Pearson's coefficient of correlation or Karl Pearson's
product-moment coefficient of correlation.

Here

$$ \operatorname{cov}(X, Y) = \frac{1}{n} \sum (x - \bar{x})(y - \bar{y}) $$

$$ \sigma_X = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}, \qquad \sigma_Y = \sqrt{\frac{\sum (y - \bar{y})^2}{n}} $$

so that

$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} $$

The above expression can be further modified as

$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \dfrac{(\sum x)^2}{n}}\,\sqrt{\sum y^2 - \dfrac{(\sum y)^2}{n}}} $$
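
The two forms of r (deviations about the means, and raw sums) can be compared numerically. The sketch below is illustrative only: the function names are assumptions made here, and the data are the heights and weights from the positive-correlation table earlier in these notes.

```python
import math

def pearson_r_deviation(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

def pearson_r_raw_sums(xs, ys):
    """Pearson's r from the raw sums (the 'modified' expression)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = (math.sqrt(sum_x2 - sum_x ** 2 / n)
                   * math.sqrt(sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

heights = [150, 152, 155, 160, 162, 165]
weights = [60, 62, 64, 65, 67, 69]

r_dev = pearson_r_deviation(heights, weights)
r_raw = pearson_r_raw_sums(heights, weights)
print(r_dev, r_raw)  # the two forms agree up to floating-point rounding
```

The raw-sum form avoids computing the means first, which is why it is convenient for hand calculation with a table of x, y, x², y² and xy.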

7. Properties of coefficient of correlation

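i. The coefficient of correlation lies between −1 and 1, i.e., $-1 \le r \le 1$.

ii. The correlation coefficient is independent of change of origin and change of scale, i.e.,
$r_{xy} = r_{d_x d_y}$, where $d_x = x - a$ and $d_y = y - b$ are the deviations of x and y from
assumed means a and b.

Hence,

$$ r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}}\,\sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}}} $$
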
iii. Two independent variables are uncorrelated, i.e., r = 0 .
The converse of above property is not true, i.e., two uncorrelated variables may not
be independent.
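
Properties ii and iii can be illustrated with a short numerical sketch. The data and the constants used for the shift and rescaling are hypothetical, chosen only for illustration, and `pearson_r` is a helper defined here rather than a library function.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

# Property ii: shift the origin (by 3 and 10) and rescale (by 2 and 5).
u = [(xi - 3) / 2 for xi in x]
v = [(yi - 10) / 5 for yi in y]
r_original = pearson_r(x, y)
r_transformed = pearson_r(u, v)
print(r_original, r_transformed)  # equal up to floating-point rounding

# Converse of property iii: y is completely determined by x, yet r is zero.
x_sym = [-2, -1, 0, 1, 2]
y_sym = [xi ** 2 for xi in x_sym]
r_parabola = pearson_r(x_sym, y_sym)
print(r_parabola)  # 0.0
```

The second case shows why r = 0 only rules out a linear relationship: the points lie exactly on the curve y = x², which r cannot detect.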

Example 1: Calculate the correlation coefficient between x and y using the following data:

x     2     4     5     6     8    11
y    18    12    10     8     7     5

Solution: n = 6

    x       y      x²      y²      xy
    2      18       4     324      36
    4      12      16     144      48
    5      10      25     100      50
    6       8      36      64      48
    8       7      64      49      56
   11       5     121      25      55

$\sum x = 36$, $\sum y = 60$, $\sum x^2 = 266$, $\sum y^2 = 706$, $\sum xy = 293$

$$
\begin{aligned}
r &= \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \dfrac{(\sum x)^2}{n}}\,\sqrt{\sum y^2 - \dfrac{(\sum y)^2}{n}}} \\
  &= \frac{293 - \dfrac{(36)(60)}{6}}{\sqrt{266 - \dfrac{(36)^2}{6}}\,\sqrt{706 - \dfrac{(60)^2}{6}}} \\
  &= \frac{293 - 360}{(7.0711)(10.2956)} \\
  &= \frac{-67}{72.8012} \\
  &= -0.9203
\end{aligned}
$$

Example 2: Calculate the correlation coefficient for the following values of demand
and the corresponding price of a commodity:

Demand in quintals        65   66   67   67   68   69   70   72
Price in rupees per kg    67   68   65   68   72   72   69   71

Solution: Let the demand in quintals be denoted by x and the price in rupees per kg be denoted
by y.

n = 8

$$ \bar{x} = \frac{\sum x}{n} = \frac{544}{8} = 68, \qquad \bar{y} = \frac{\sum y}{n} = \frac{552}{8} = 69 $$

  x     y    x − x̄    y − ȳ    (x − x̄)²    (y − ȳ)²    (x − x̄)(y − ȳ)
 65    67     -3       -2         9           4              6
 66    68     -2       -1         4           1              2
 67    65     -1       -4         1          16              4
 67    68     -1       -1         1           1              1
 68    72      0        3         0           9              0
 69    72      1        3         1           9              3
 70    69      2        0         4           0              0
 72    71      4        2        16           4              8

$\sum x = 544$, $\sum y = 552$, $\sum (x - \bar{x}) = 0$, $\sum (y - \bar{y}) = 0$,
$\sum (x - \bar{x})^2 = 36$, $\sum (y - \bar{y})^2 = 44$, $\sum (x - \bar{x})(y - \bar{y}) = 24$

$$
\begin{aligned}
r &= \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} \\
  &= \frac{24}{\sqrt{36}\,\sqrt{44}} \\
  &= \frac{24}{(6)(6.6332)} \\
  &= 0.6030
\end{aligned}
$$

Example 3: Calculate the correlation coefficient between x and y using the following data:

x   17   19   21   26   20   28   26   27
y   23   27   25   26   27   25   30   33

Solution:

Let a = 23 and b = 27 be the assumed means for x and y.

$d_x = x - a = x - 23$,

$d_y = y - b = y - 27$,

n = 8

  x     y    d_x    d_y    d_x²    d_y²    d_x d_y
 17    23    -6     -4      36      16       24
 19    27    -4      0      16       0        0
 21    25    -2     -2       4       4        4
 26    26     3     -1       9       1       -3
 20    27    -3      0       9       0        0
 28    25     5     -2      25       4      -10
 26    30     3      3       9       9        9
 27    33     4      6      16      36       24

$\sum d_x = 0$, $\sum d_y = 0$, $\sum d_x^2 = 124$, $\sum d_y^2 = 70$, $\sum d_x d_y = 48$

$$
\begin{aligned}
r &= \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}}\,\sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}}} \\
  &= \frac{48 - 0}{\sqrt{124 - 0}\,\sqrt{70 - 0}} \\
  &= 0.515
\end{aligned}
$$

Exercise

Exercise 1: Calculate the correlation coefficient between x and y using the following data:

x   10   12   18   24   23   27
y   13   18   12   25   30   10

Ans: 0.223

Exercise 2: Calculate the correlation coefficient between x and y using the following data:

x    62    64    65    69    70    71    72    74
y   126   125   139   145   165   152   180   208

Ans: 0.9032

RANK CORRELATION

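When n individuals are ranked according to two characteristics A and B, the correlation between
these n pairs of ranks is called the rank correlation.

1. Spearman's Rank Correlation Coefficient

It is defined as

$$ r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} $$

where d is the difference between the two ranks of an individual and n is the number of
individuals.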

Example 1: Ten participants in a contest are ranked by two judges as follows:

x    1    3    7    5    4    6    2   10    9    8
y    3    1    4    5    6    9    7    8   10    2

Calculate the rank correlation coefficient.

Solution: n = 10

Rank by first judge (x)    Rank by second judge (y)    d = x − y     d²
          1                           3                    -2          4
          3                           1                     2          4
          7                           4                     3          9
          5                           5                     0          0
          4                           6                    -2          4
          6                           9                    -3          9
          2                           7                    -5         25
         10                           8                     2          4
          9                          10                    -1          1
          8                           2                     6         36

$\sum d = 0$, $\sum d^2 = 96$

$$ r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(96)}{10\left[(10)^2 - 1\right]} = 0.418 $$
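
The calculation above can be reproduced with a few lines of code. This is a minimal sketch assuming the inputs are already ranks with no ties; the function name `spearman_rank_correlation` is introduced here for illustration.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """Spearman's r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks given by the two judges in the example above.
judge_x = [1, 3, 7, 5, 4, 6, 2, 10, 9, 8]
judge_y = [3, 1, 4, 5, 6, 9, 7, 8, 10, 2]
print(round(spearman_rank_correlation(judge_x, judge_y), 3))  # 0.418

# Limiting cases: identical rankings give 1, completely reversed rankings give -1.
ascending = list(range(1, 11))
descending = list(range(10, 0, -1))
print(spearman_rank_correlation(ascending, ascending))   # 1.0
print(spearman_rank_correlation(ascending, descending))  # -1.0
```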

2. Tied Ranks

If there is a tie between two or more individuals, the ranks they would have occupied are
shared equally among them. For example, if two items tie for the fourth rank, the 4th and 5th
ranks are divided between them equally and each is given the rank $\frac{4+5}{2} = 4.5$. If three
items tie for the fourth rank, each of them is given the rank $\frac{4+5+6}{3} = 5$. As a result,
the following adjustment (correction) is made in the rank correlation formula:

$$ r = 1 - \frac{6\left[\sum d^2 + \dfrac{m_1^3 - m_1}{12} + \dfrac{m_2^3 - m_2}{12} + \cdots\right]}{n(n^2 - 1)} $$

where $m_1, m_2, \ldots$ are the numbers of items sharing each tied rank.

Example 1: Obtain the rank correlation from the following data:

x   10   12   18   18   15   40
y   12   18   25   25   50   25

Solution: n = 6

  x     y    Rank of x    Rank of y    d = (rank of x) − (rank of y)     d²
 10    12       1            1                    0                      0
 12    18       2            2                    0                      0
 18    25       4.5          4                    0.5                    0.25
 18    25       4.5          4                    0.5                    0.25
 15    50       3            6                   -3                      9
 40    25       6            4                    2                      4

$\sum d^2 = 13.5$

There are two items in the x series having equal values at the rank 4. Each is given the rank
4.5. Similarly, there are three items in the y series at the rank 3. Each of them is given the
rank 4.

$m_1 = 2$, $m_2 = 3$

$$
\begin{aligned}
r &= 1 - \frac{6\left[\sum d^2 + \dfrac{m_1^3 - m_1}{12} + \dfrac{m_2^3 - m_2}{12}\right]}{n(n^2 - 1)} \\
  &= 1 - \frac{6\left[13.50 + \dfrac{8 - 2}{12} + \dfrac{27 - 3}{12}\right]}{6\left((6)^2 - 1\right)} \\
  &= 0.5429
\end{aligned}
$$
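
The tied-rank procedure can be sketched in code as follows. The helper names are assumptions made for illustration, and ranks are assigned in ascending order (smallest value gets rank 1), which matches the ranking used in the example above.

```python
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their positions."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_tied(xs, ys):
    """Spearman's rank correlation with the correction for tied ranks."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    adjusted = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * adjusted / (n * (n ** 2 - 1))

# Data from the example above.
x = [10, 12, 18, 18, 15, 40]
y = [12, 18, 25, 25, 50, 25]
print(average_ranks(x))                # [1.0, 2.0, 4.5, 4.5, 3.0, 6.0]
print(average_ranks(y))                # [1.0, 2.0, 4.0, 4.0, 6.0, 4.0]
print(round(spearman_tied(x, y), 4))   # 0.5429
```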

Homework examples:

Example 1: Compute Spearman's rank correlation coefficient from the following data:

x   18   20   34   52   12
y   39   23   35   18   46

Ans: −0.9

Example 2: From the following rankings given by three judges, use Spearman's rank correlation
coefficient to determine which pair of judges has the nearest approach to common liking in voice.

Judge x    6   10    2    9    8    1    5    3    4    7
Judge y    5    4   10    1    9    3    8    7    2    6
Judge z    4    8    2   10    7    6    9    1    3    6

Ans: judge x and judge z.

Example 3: Calculate the rank correlation coefficient between the score and IQ.

Score    35    40    25    55    85    90    65    55    45    50
IQ      100   100   110   140   150   130   100   120   140   110

Ans: 0.47

Regression
Regression is defined as a method of estimating the value of one variable when that of the
other is known and the variables are correlated. Regression analysis is used to predict or
estimate one variable in terms of the other variables. It is a highly valuable tool for
prediction purposes in economics and business.

Types of Regression

Regression is classified into two types:

1. Simple and multiple regression
2. Linear and nonlinear regression

1. Simple and multiple regression

Depending on the number of variables studied, regression may be simple or multiple.

I. Simple regression

When only two variables are studied at a time, the regression analysis is
described as simple regression.

II. Multiple regression

When more than two variables are studied at a time, the regression analysis is known as
multiple regression.

2. Linear and nonlinear regression

Depending upon the regression curve, regression may be linear or nonlinear.

I. Linear regression: If the regression curve is a straight line, the regression is said
to be linear.
II. Nonlinear regression: If the regression curve is not a straight line, i.e., not a
first-degree equation in the variables x and y, the regression is said to be nonlinear or
curvilinear.

Methods of studying Regression
1. Method of scatter diagram
It is the simplest method of obtaining the line of regression. The data are plotted on
graph paper, taking the independent variable on the x-axis and the dependent
variable on the y-axis. The points are generally scattered in a narrow strip. If
the correlation is perfect, i.e., if r = ±1, the points will
lie on a line, which is the line of regression.
2. Method of least squares
It is used for obtaining the equation of the curve that best fits a given set of
observations. It is based on the principle that the sum of the squares of the differences
between the estimated values and the actual observed values is a
minimum.
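
As a sketch of how the method of least squares leads to a straight line $y = a + bx$, the sum of squared differences can be minimised with respect to the constants. The symbol S and the constants a and b below are notation introduced here for illustration.

```latex
S = \sum (y - a - bx)^2

\frac{\partial S}{\partial a} = -2 \sum (y - a - bx) = 0
\quad\Longrightarrow\quad \sum y = na + b \sum x

\frac{\partial S}{\partial b} = -2 \sum x\,(y - a - bx) = 0
\quad\Longrightarrow\quad \sum xy = a \sum x + b \sum x^2

\text{Solving the two equations:}\qquad
b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}},
\qquad a = \bar{y} - b\,\bar{x}
```

The second result shows that the fitted line always passes through the point of means $(\bar{x}, \bar{y})$.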

Line of Regression
If all the points in the scatter diagram cluster around a straight line, that line is called
the line of regression. The line of regression is the line of best fit and is obtained by the
principle of least squares.

Line of regression of y on x

It is the line which gives the best estimate of the value of y for any given value of
x. The regression equation of y on x is given by

$$ y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x}) $$

It is also written as

$$ y = a + bx $$

Line of regression of x on y

It is the line which gives the best estimate of the value of x for any given value of
y. The regression equation of x on y is given by

$$ x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y}) $$

It is also written as

$$ x = a + by $$

where $\bar{x}$ and $\bar{y}$ are the means of the x series and y series respectively, $\sigma_x$ and
$\sigma_y$ are the standard deviations of the x series and y series respectively, and r
is the correlation coefficient between x and y. The constants a and b differ between the two lines.
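
Both lines can be computed directly from data. The sketch below assumes the slope is obtained as $\sum (x-\bar{x})(y-\bar{y}) / \sum (x-\bar{x})^2$, which equals $r\,\sigma_y/\sigma_x$; the function name `least_squares_line` is hypothetical, and the five data points are the ones used in one of the worked regression examples in these notes.

```python
def least_squares_line(us, vs):
    """Return (a, b) of the least-squares line v = a + b*u.

    The slope b is sum((u - mean_u)(v - mean_v)) / sum((u - mean_u)^2),
    and a is chosen so that the line passes through the point of means.
    """
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    suu = sum((u - mean_u) ** 2 for u in us)
    b = suv / suu
    a = mean_v - b * mean_u
    return a, b

x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]

a_yx, b_yx = least_squares_line(x, y)  # line of regression of y on x
a_xy, b_xy = least_squares_line(y, x)  # line of regression of x on y
print(f"y = {a_yx:.2f} + ({b_yx:.2f}) x")  # y = 11.90 + (-0.65) x
print(f"x = {a_xy:.2f} + ({b_xy:.2f}) y")  # x = 16.40 + (-1.30) y
```

Swapping the roles of the two variables gives the second line; it is not obtained by algebraically inverting the first, because each line minimises squared errors in a different direction.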

Properties of Lines of Regression

1. The two regression lines, x on y and y on x, always intersect at the point of means $(\bar{x}, \bar{y})$.
2. Since $r^2 = b_{yx} b_{xy}$, i.e., $r = \pm\sqrt{b_{yx} b_{xy}}$, the quantities $r$, $b_{yx}$ and $b_{xy}$ all have the same sign.
3. If $r = 0$, both regression coefficients are zero.
4. The regression lines become identical if $r = \pm 1$. If $r = 0$, the lines are
perpendicular to each other.

Regression Coefficients

The slope of the line of regression of y on x is called the coefficient of regression of
y on x.

$$ b_{yx} = \text{regression coefficient of } y \text{ on } x = r \frac{\sigma_y}{\sigma_x} $$

Similarly, the slope of the line of regression of x on y is called the coefficient of
regression of x on y.

$$ b_{xy} = \text{regression coefficient of } x \text{ on } y = r \frac{\sigma_x}{\sigma_y} $$

Properties of Regression Coefficients


1. The coefficient of correlation is the geometric mean of the coefficients of regression,
i.e., $r = \pm\sqrt{b_{yx} b_{xy}}$.
2. If one of the regression coefficients is numerically greater than one, the other must be
numerically less than one.
3. The arithmetic mean of the regression coefficients is greater than or equal to the
coefficient of correlation.
4. Regression coefficients are independent of change of origin but not of scale.
5. Both regression coefficients have the same sign, i.e., either both are positive or
both are negative.
6. The sign of the correlation coefficient is the same as that of the regression coefficients,
i.e., $r > 0$ if $b_{xy} > 0$ and $b_{yx} > 0$, and $r < 0$ if $b_{xy} < 0$ and $b_{yx} < 0$.
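
The first three properties follow from the definitions $b_{yx} = r\,\sigma_y/\sigma_x$ and $b_{xy} = r\,\sigma_x/\sigma_y$. A short derivation, offered as a sketch (property 3 is shown for the case of positive coefficients):

```latex
b_{yx}\, b_{xy}
  = \left( r \frac{\sigma_y}{\sigma_x} \right) \left( r \frac{\sigma_x}{\sigma_y} \right)
  = r^2
\quad\Longrightarrow\quad |r| = \sqrt{b_{yx}\, b_{xy}}

|b_{yx}| > 1
\quad\Longrightarrow\quad
|b_{xy}| = \frac{r^2}{|b_{yx}|} \le \frac{1}{|b_{yx}|} < 1
\qquad (\text{since } r^2 \le 1)

\left( \sqrt{b_{yx}} - \sqrt{b_{xy}} \right)^2 \ge 0
\quad\Longrightarrow\quad
\frac{b_{yx} + b_{xy}}{2} \ge \sqrt{b_{yx}\, b_{xy}} = r
\qquad (b_{yx},\, b_{xy} > 0)
```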

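Expressions for Regression Coefficients

(i)
$$ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}
\quad\text{and}\quad
b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} $$

(ii)
$$ b_{yx} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}
\quad\text{and}\quad
b_{xy} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum y^2 - \dfrac{(\sum y)^2}{n}} $$

(iii)
$$ b_{yx} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}}
\quad\text{and}\quad
b_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}} $$

where $d_x = x - a$ and $d_y = y - b$ are the deviations of x and y from assumed means a and b.
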
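
Expressions (ii) and (iii) give the same coefficients, whatever assumed means are chosen. The sketch below checks this on the sales and purchase figures that appear in one of the worked examples in these notes; the function names and the choice of assumed means (93 and 81) are illustrative assumptions.

```python
import math

def coefficients_from_sums(xs, ys):
    """Expression (ii): b_yx and b_xy from the raw sums."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return sxy / sxx, sxy / syy

def coefficients_from_deviations(xs, ys, a, b):
    """Expression (iii): the same sums applied to deviations from assumed means a, b."""
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    return coefficients_from_sums(dx, dy)

def correlation_from_coefficients(byx, bxy):
    """r is the geometric mean of the coefficients, carrying their common sign."""
    return math.copysign(math.sqrt(byx * bxy), byx)

sales = [100, 98, 78, 85, 110, 93, 80]
purchase = [85, 90, 70, 72, 95, 81, 74]

byx, bxy = coefficients_from_sums(sales, purchase)
byx_d, bxy_d = coefficients_from_deviations(sales, purchase, a=93, b=81)
print(round(byx, 3), round(bxy, 4))      # 0.785 1.1746
print(round(byx_d, 3), round(bxy_d, 4))  # 0.785 1.1746
print(round(correlation_from_coefficients(byx, bxy), 2))  # 0.96
```

Working with deviations from assumed means keeps the numbers small for hand calculation, which is the main reason expression (iii) is used.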
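Example 1: Find the regression coefficients $b_{yx}$ and $b_{xy}$, and hence find the correlation
coefficient between x and y for the following data:

x    4    2    3    4    2
y    2    3    2    4    4

Solution: n = 5

   x      y     x²     y²     xy
   4      2     16      4      8
   2      3      4      9      6
   3      2      9      4      6
   4      4     16     16     16
   2      4      4     16      8

$\sum x = 15$, $\sum y = 15$, $\sum x^2 = 49$, $\sum y^2 = 49$, $\sum xy = 44$

$$ b_{yx} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} = \frac{44 - 45}{49 - 45} = -0.25 $$

and

$$ b_{xy} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum y^2 - \dfrac{(\sum y)^2}{n}} = \frac{44 - 45}{49 - 45} = -0.25 $$

$$ |r| = \sqrt{b_{yx} b_{xy}} = \sqrt{(-0.25)(-0.25)} = 0.25 $$

Since $b_{yx}$ and $b_{xy}$ are negative, r is negative:
$r = -0.25$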

Example 2: Find the two lines of regression from the following data and hence find the
correlation coefficient.

x    6    2   10    4    8
y    9   11    5    8    7

Solution: n = 5

$$ \bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8 $$

  x     y    x − x̄    y − ȳ    (x − x̄)²    (y − ȳ)²    (x − x̄)(y − ȳ)
  6     9      0        1         0           1              0
  2    11     -4        3        16           9            -12
 10     5      4       -3        16           9            -12
  4     8     -2        0         4           0              0
  8     7      2       -1         4           1             -2

$\sum x = 30$, $\sum y = 40$, $\sum (x - \bar{x}) = 0$, $\sum (y - \bar{y}) = 0$,
$\sum (x - \bar{x})^2 = 40$, $\sum (y - \bar{y})^2 = 20$, $\sum (x - \bar{x})(y - \bar{y}) = -26$

$$ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{-26}{40} = -0.65 $$

and

$$ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{-26}{20} = -1.3 $$

The equation of the regression line of x on y is

$$ x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y}) $$
$$ x - 6 = -1.3 (y - 8) $$
$$ x = -1.3y + 16.4 $$

The equation of the regression line of y on x is

$$ y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x}) $$
$$ y - 8 = -0.65 (x - 6) $$
$$ y = -0.65x + 11.9 $$

$$ |r| = \sqrt{b_{yx} b_{xy}} = \sqrt{(-0.65)(-1.3)} = 0.9192 $$

Since $b_{yx}$ and $b_{xy}$ are negative, r is negative:
$r = -0.9192$.

Example 3: Find the two lines of regression from the following data and hence find the
correlation coefficient.

Sales (x)       100   98   78   85   110   93   80
Purchase (y)     85   90   70   72    95   81   74

Solution: Let a = 93 and b = 81 be the assumed means of x and y respectively.

$d_x = x - a = x - 93$,

$d_y = y - b = y - 81$,

n = 7

   x     y    d_x    d_y    d_x²    d_y²    d_x d_y
 100    85      7      4      49      16       28
  98    90      5      9      25      81       45
  78    70    -15    -11     225     121      165
  85    72     -8     -9      64      81       72
 110    95     17     14     289     196      238
  93    81      0      0       0       0        0
  80    74    -13     -7     169      49       91

$\sum x = 644$, $\sum y = 567$, $\sum d_x = -7$, $\sum d_y = 0$,
$\sum d_x^2 = 821$, $\sum d_y^2 = 544$, $\sum d_x d_y = 639$

$$ \bar{x} = \frac{\sum x}{n} = \frac{644}{7} = 92, \qquad \bar{y} = \frac{\sum y}{n} = \frac{567}{7} = 81 $$

$$ b_{yx} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}} = \frac{639 - 0}{821 - 7} = 0.785 $$

and

$$ b_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}} = \frac{639 - 0}{544 - 0} = 1.1746 $$

The equation of the regression line of x on y is

$$ x - \bar{x} = r \frac{\sigma_x}{\sigma_y} (y - \bar{y}) $$
$$ x - 92 = 1.1746 (y - 81) $$
$$ x = 1.1746y - 3.1426 $$

The equation of the regression line of y on x is

$$ y - \bar{y} = r \frac{\sigma_y}{\sigma_x} (x - \bar{x}) $$
$$ y - 81 = 0.785 (x - 92) $$
$$ y = 0.785x + 8.78 $$

$$ |r| = \sqrt{b_{yx} b_{xy}} = \sqrt{(0.785)(1.1746)} = 0.9602 $$

Since $b_{yx}$ and $b_{xy}$ are positive, r is positive: $r = 0.9602$.

Example 1: Find the regression coefficients $b_{yx}$ and $b_{xy}$, and find the correlation
coefficient between x and y for the following data:

x    7    4    8    6    5
y    6    5    9    8    2

Ans: $b_{yx} = 1.2$, $b_{xy} = 0.4$, $r = 0.693$

Example 2: Find the two lines of regression from the following data and hence find the
correlation coefficient.

x   25   22   28   26   35   20   22   40   20   18
y   18   15   20   17   22   14   16   21   15   14

Ans: $y = 0.385x + 7.344$, $x = 2.227y - 12.704$

Example 3: Find the two lines of regression from the following data and hence find the
correlation coefficient and estimate y for x = 73.

x    70    72    74    76    78    80
y   163   170   179   188   196   220

Ans: $y = 5.31x - 212.57$, $y = 175.37$

Example 4: Find the regression coefficients $b_{yx}$ and $b_{xy}$, and find the line of regression
of y on x for the following data:

x    2    4    6    8
y    1    2    2.5  3

Ans: $y = 0.325x + 0.5$
