Module IV
Introduction to Correlation and Regression
• Correlation – Meaning; positive, negative and zero correlation; correlation through scatter diagrams; interpretation of the correlation coefficient; simple and multiple correlation; Regression
Correlation
Introduction to Correlation Analysis
• In previous studies, the focus was on univariate analysis, which deals
with one variable at a time. This includes measures like central
tendency, dispersion, skewness, and kurtosis.
• However, there are situations where two variables need to be studied
together, such as height and weight of individuals or sales revenue and
advertising expenditure of a company. This type of analysis, where two
variables are studied together, is known as bivariate analysis. If more
than two variables are studied, it becomes multivariate analysis.
• The goal of correlation analysis is to find if a relationship exists
between two variables and to measure the strength of this relationship.
Definitions and Uses of Correlation
• Croxton and Cowden define correlation as a statistical tool used
to measure and express the relationship between two variables in
a concise formula.
• A.M. Tuttle refers to correlation as the study of how two variables
co-vary (change together).
• W.A. Neiswanger highlights that correlation helps understand
economic behavior and identify important variables.
• Tippett states that correlation reduces uncertainty in prediction
by showing how one variable influences the other.
Types of Correlation
Positive and Negative Correlation:
➢Positive correlation occurs when two variables move in the same direction.
For example, if an increase in one variable leads to an increase in the other,
such as height and weight, or family income and luxury spending.
➢Negative correlation occurs when the variables move in opposite directions.
For example, an increase in the price of a commodity typically leads to a
decrease in demand, or a rise in temperature reduces the sale of woolen
garments.
Linear and Non-Linear Correlation:
➢Linear correlation is when a constant change in one variable leads to a
constant change in the other. For instance, the example where for every
increase of 1 in variable "x," there is a constant increase of 2 in "y."
➢Non-linear (curvilinear) correlation occurs when the change in one
variable does not result in a constant change in the other. Instead, the change
fluctuates, and the graph of such data would not be a straight line.
Correlation and Causation:
➢Correlation measures the degree to which two variables move together but
does not necessarily imply that one variable causes the other.
➢Causation means that changes in one variable directly cause changes in
another. While causation implies correlation, the reverse is not true. High
correlation might exist due to mutual dependence, influence from external
factors, or pure chance (spurious correlation).
Spurious Correlation:
• Sometimes, two variables may show a high degree of correlation
even though they are not related. For example, the size of a
person’s shoe and their intelligence may show correlation in a
small, randomly selected sample, but the relationship is
nonsensical. This is called spurious correlation or nonsense
correlation.
Methods of Studying Correlation:
Several methods are commonly used to study the relationship between
two variables:
• Scatter Diagram Method: A graphical representation where the relationship
between two variables is plotted as points in a graph.
• Karl Pearson’s Coefficient of Correlation: A numerical method that measures
the strength and direction of a linear relationship between two variables. This
method uses covariance to calculate correlation.
• Two-Way Frequency Table (Bivariate Correlation Method): A table that
records the frequency of different combinations of two variables.
• Rank Correlation Method: This method ranks the data for each variable and
then measures the correlation between the ranks.
• Concurrent Deviations Method: A simplified method that examines the
direction of changes in the two variables to determine whether they are
correlated.
Scatter Diagram Method
What is a Scatter Diagram?
• A scatter diagram is a plot of n pairs of values (x₁, y₁), (x₂, y₂), …, (xₙ, yₙ)
of two variables (e.g., height and weight of individuals).
• Each pair is plotted as a point on a graph, with one variable on the
x-axis and the other on the y-axis.
• The variable plotted on the x-axis is typically the independent
variable, while the variable on the y-axis is the dependent
variable.
How to Interpret a Scatter Diagram?
The scatter diagram helps in visually interpreting the relationship
between two variables:
• Density of Points:
• If the points are close together, it suggests a high correlation between
the two variables.
• If the points are scattered, it indicates a low correlation.
• Trends:
• If the points follow a clear upward trend, it suggests positive correlation,
meaning that as one variable increases, the other also increases.
• If the points show a downward trend, it suggests negative correlation,
meaning that as one variable increases, the other decreases.
• If there is no clear trend, it suggests no correlation between the variables.
• Perfect Correlation:
• Perfect positive correlation: All points lie exactly on a straight line going
from the bottom-left to the top-right.
• Perfect negative correlation: All points lie on a straight line from the
top-left to the bottom-right.
Different Forms of Correlation on
Scatter Diagrams:
Scatter diagrams can take various forms depending on the type of correlation:
• Perfect positive correlation: A straight line from the bottom-left to the top-
right.
• Perfect negative correlation: A straight line from the top-left to the bottom-
right.
• High degree of positive correlation: Points form a dense, upward trend but
may not be perfectly aligned.
• High degree of negative correlation: Points form a dense, downward trend.
• Low degree of positive/negative correlation: Points show an upward or
downward trend but are more spread out.
• No correlation: Points are scattered randomly with no discernible trend.
Advantages and Disadvantages of the
Scatter Diagram Method:
• Advantages:
• Easy to understand: The method is visually intuitive and can give a rough idea
of the nature of the relationship between two variables by simply inspecting the
graph.
• Resistant to extreme observations: Unlike mathematical methods, scatter
diagrams are not significantly influenced by extreme values or outliers.
• Disadvantages:
• Not precise: The scatter diagram only provides a rough idea of whether the
correlation is positive, negative, strong, or weak. It does not give an exact
measure of the correlation.
• Not suitable for large datasets: When there are many data points, the diagram
becomes cluttered and difficult to interpret.
KARL PEARSON’S
COEFFICIENT OF
CORRELATION
(COVARIANCE METHOD)
Karl Pearson's Coefficient of Correlation
• Karl Pearson's Coefficient of Correlation is a widely used
mathematical method to measure the strength and direction of the
linear relationship between two variables, denoted as r. This
coefficient ranges from -1 to +1, where:
• +1 indicates a perfect positive linear correlation,
• -1 indicates a perfect negative linear correlation, and
• 0 indicates no linear relationship between the variables.
Formula Overview
Step 1 – Covariance:
Cov(X, Y) = Σ(X − X̄)(Y − Ȳ) / n
OR
Cov(X, Y) = (ΣXY / n) − X̄Ȳ
In statistics, the variance is the spread of a data set around its mean value, while the covariance is a measure of the directional relationship between two random variables.
Step 2 – Standard deviation:
σx = √[Σ(X − X̄)² / n],  σy = √[Σ(Y − Ȳ)² / n]
Correlation Coefficient
• Pearson’s Correlation Coefficient (r): After calculating the
covariance and the standard deviations, Pearson’s coefficient is:
r = Cov(X, Y) / (σx σy)
• Simplified Formula:
r = [nΣXY − (ΣX)(ΣY)] / √{[nΣX² − (ΣX)²] [nΣY² − (ΣY)²]}
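The two steps and the final ratio can be sketched in Python; the data below is hypothetical, chosen only to illustrate the computation:

```python
from math import sqrt

def pearson_r(x, y):
    """Karl Pearson's r via the covariance method: r = Cov(X, Y) / (sd_x * sd_y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    sd_x = sqrt(sum((xi - mx) ** 2 for xi in x) / n)
    sd_y = sqrt(sum((yi - my) ** 2 for yi in y) / n)
    return cov / (sd_x * sd_y)

# Hypothetical data: Y rises by exactly 2 for every unit rise in X,
# so r should come out as (approximately) +1.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

A perfectly linear upward series gives r close to +1; a perfectly reversed series gives r close to −1.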
Illustration 1
Illustration 2
Determine the coefficient of correlation for the following data
Illustration 3
Determine the coefficient of correlation for the following data:
Illustration 4
Determine the coefficient of correlation for the following data:
X: 3  2  1  5  4
Y: 8  4  10  2  6
Illustration 5
• Find Karl Pearson's coefficient of correlation from the following
index numbers and interpret it.
Wages (₹) 100 101 103 102 100 99 97 98 96 95
Cost of living 98 99 99 97 95 92 95 94 90 91
Spearman’s Rank
Correlation Coefficient (ρ)
Spearman’s Rank Correlation Coefficient (ρ)
• Rank Correlation Method is used when the variables under
study cannot be measured quantitatively but can be ranked based
on qualitative attributes like intelligence, beauty, or honesty.
• This method is especially helpful when we want to see if two sets
of rankings are related.
• Spearman’s Rank Correlation Coefficient (ρ)
• Spearman's Rank Correlation Coefficient, denoted by ρ(rho),
measures the correlation between the ranks of two variables.
Formula
ρ = 1 − [6Σd²] / [n(n² − 1)]
where d is the difference between the ranks of corresponding values of the two variables, and n is the number of pairs of observations.
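A minimal Python sketch of Spearman's formula, assuming no tied ranks (the convention of giving rank 1 to the largest value is an assumption made here for illustration):

```python
def spearman_rho(x, y):
    """Spearman's rank correlation: rho = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)

    def ranks(values):
        # Rank 1 for the largest value; assumes no ties.
        ordered = sorted(values, reverse=True)
        return [ordered.index(v) + 1 for v in values]

    rx, ry = ranks(x), ranks(y)
    d_sq = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d_sq) / (n * (n ** 2 - 1))

# Identical orderings give rho = 1; exactly opposite orderings give rho = -1.
print(spearman_rho([10, 20, 30, 40], [1, 2, 3, 4]))
```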
Illustration 6
Illustration 7
• Solution. Let X denote the advertisement cost (’000 Rs.) and Y denote
the sales (lakhs Rs.)
Illustration 8
Linear Regression Analysis
Meaning
• The term regression in statistics refers to a method used to study
the relationship between two or more variables.
• It allows us to predict the value of one variable (the dependent
variable) based on the known value of another variable (the
independent variable). Essentially, it helps estimate how one
variable changes as another variable changes.
Regression answers questions like:
• How does sales change with an increase in advertising spending?
• How does income change with years of education?
Types of Regression
• Simple regression: Involves two variables — one dependent and
one independent variable.
• Multiple regression: Studies the impact of multiple independent
variables on a dependent variable.
More about regression:
• Dependent variable (Y): The variable you are trying to predict.
• Independent variable (X): The variable used to make
predictions.
• For example, in linear regression, the relationship between the
variables is represented by a straight line. The formula typically
used is:
Y = a + bX
Linear and Non-Linear Regression
• Linear regression occurs when the relationship between two
variables is represented by a straight line. The equation of this line
is given by Y = a + bX, where a is the intercept and b is the slope.
• Non-linear regression involves more complex relationships
between variables, where the equation includes higher-degree
terms like Y = a + bX + cX².
Lines of Regression
Regression Equation of Y on X
Y − Ȳ = byx (X − X̄), where byx = r (σy / σx)
OR
byx = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²
Regression Equation of X on Y
X − X̄ = bxy (Y − Ȳ), where bxy = r (σx / σy)
OR
bxy = Σ(X − X̄)(Y − Ȳ) / Σ(Y − Ȳ)²
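The two regression coefficients can be computed directly from the deviation sums; a Python sketch with hypothetical data:

```python
def regression_coefficients(x, y):
    """Return (byx, bxy): slopes of the Y-on-X and X-on-Y regression lines."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    s_xx = sum((xi - mx) ** 2 for xi in x)
    s_yy = sum((yi - my) ** 2 for yi in y)
    byx = s_xy / s_xx  # slope for predicting Y from X
    bxy = s_xy / s_yy  # slope for predicting X from Y
    return byx, bxy

# Hypothetical data with Y = 2X exactly: byx = 2.0 and bxy = 0.5,
# and the product byx * bxy equals r^2 (here 1).
byx, bxy = regression_coefficients([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(byx, bxy)
```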
Example Scenario
• Imagine you want to analyze the relationship between hours studied
and scores obtained in an exam. You collect the following data:
Regression Line
The regression line is a mathematical model that predicts the
dependent variable (exam score) based on the independent
variable (hours studied). The equation of the regression line can
be expressed as:
Score = a + b × (Hours studied)
Illustration 9
• From the following data find the two regression equations
Illustration 10
Coefficient of determination
• The coefficient of determination, denoted as R², is a statistical
measure that indicates how well a regression model fits the data.
• It represents the proportion of the variance in the dependent
variable that is predictable from the independent variable(s).
• In simpler terms, it tells us how well the independent variables
explain the variability of the dependent variable.
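R² can be computed from the residual and total sums of squares; a minimal sketch with hypothetical values:

```python
def r_squared(y_actual, y_predicted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_actual) / len(y_actual)
    ss_tot = sum((y - mean_y) ** 2 for y in y_actual)  # total variation
    ss_res = sum((y - yp) ** 2 for y, yp in zip(y_actual, y_predicted))  # unexplained
    return 1 - ss_res / ss_tot

# Perfect predictions explain all the variation (R^2 = 1);
# predicting the mean every time explains none of it (R^2 = 0).
print(r_squared([1, 2, 3], [1, 2, 3]), r_squared([1, 2, 3], [2, 2, 2]))
```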
Illustration 11
• Compute the appropriate regression equation for the following data.
Illustration 12
Using Regression for Prediction
With these models estimate:
i) the value of sales when the company decides to spend Rs. 2,50,000 on advertising, and
ii) the cost of advertisement when the company desires to reach the target of Rs. 50 lakhs during the next quarter.
Answer:
1) If the company’s marketing department decides to spend Rs. 2,50,000 (X = 2.5) on advertisement during the next quarter, the most likely estimate of sales
= 16.15 + 5.8 (2.5) = 30.65, i.e. Rs. 30,65,000.
2) If the company desires to reach the target of Rs. 50 lakhs of sales during the next quarter, the most likely estimate of advertisement cost
= −0.25 + 0.093 (50) = −0.25 + 4.65 = 4.4, i.e. Rs. 4,40,000.
Also called Least Squares Regression
• Least Squares Regression is a statistical technique that aims to find the
line of best fit through a set of data points by minimizing the sum of the
squares of the vertical distances (the residuals) between the observed
data points and the points on the fitted line.
How It Works:
• The formulas for byx and bxy (the regression coefficients) minimize the
error (the residuals) in the vertical direction for one variable (like Y)
when trying to predict it from another variable (like X).
• The idea behind least squares is to ensure that the total squared
difference between the actual values and the predicted values (on the
regression line) is as small as possible.
• Y on X regression finds the best-fitting line to predict Y from X.
• X on Y regression finds the best-fitting line to predict X from Y.
• In both cases, we are using least squares to minimize the sum of the
squared errors between the observed values and the values
predicted by the regression line.
• In each case the goal is to minimize the difference between the actual
data points and the line of best fit.
Standard Error (SE)
• The standard error (SE) is a measure that describes how much
sample means differ from the actual population mean.
• It provides insight into the precision of the sample mean as an
estimate of the population mean.
• In other words, it tells us how much we might expect the sample
mean to vary if we were to take multiple samples from the same
population.
Formula
• The formula for the standard error of the mean (SEM) is:
SE = σ / √N
where:
• σ is the standard deviation of the population,
• N is the sample size.
• If the population standard deviation (σ) is unknown, the sample
standard deviation (s) is often used instead, especially in small
samples.
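The SEM formula above amounts to a one-line computation; a Python sketch with illustrative numbers:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: SE = sigma / sqrt(N)."""
    return sigma / sqrt(n)

# With a population standard deviation of 10 and a sample of 25,
# SE = 10 / 5 = 2.0.
print(standard_error(10, 25))
```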
Index numbers
Index numbers
• Index numbers are tools that help us understand how certain
things (like prices or production levels) change over time.
• They allow us to compare changes in a specific period (called the
"current period") with a past period (called the "base period").
• The things we measure can be prices of items (like gold, steel, or
milk), production levels (like how much a factory produces), or
even broader topics like national income or the cost of living.
• The main types of index numbers include:
1.Price Index: Measures changes in the price level of goods and
services over time. Examples include the Consumer Price Index
(CPI) and the Producer Price Index (PPI).
2.Quantity Index: Tracks changes in the quantity or output levels,
commonly used to measure production volume in industries.
3.Value Index: Measures changes in the total value of transactions,
combining both price and quantity indices, which can be
particularly useful for analyzing economic growth.
• Key Points:
1.Relative Changes: Index numbers show how much something has
increased or decreased compared to a past time.
2.Different Variables: They can measure changes in prices,
production, wages, or economic factors over time.
3.Simplifying Complex Changes: Since different items (like rice,
milk, or fuel) have prices measured in different units (kilograms,
liters, etc.), it’s hard to compare them directly. Index numbers
provide a single number that summarizes the overall change,
making it easier to understand.
Example
• If you want to know how prices of everyday goods (like food, fuel,
and clothing) have changed over the last five years for low-income
families, you can't just look at one or two prices because some may
have gone up while others went down. An index number helps you
get a general idea of how prices have changed overall.
• In simpler terms, index numbers give us a snapshot of change in
various things over time, helping us track and compare these
changes in a straightforward way.
METHODS OF CONSTRUCTING INDEX
NUMBERS
Simple Aggregate Method
Under the simple aggregate method, the price index is the total of current-period prices expressed as a percentage of the total of base-period prices:
P01 = (Σp1 / Σp0) × 100
Fisher's Ideal Index
• Fisher's Index, also known as Fisher's Ideal Index, is a type of
index number that combines two other popular indices: the
Laspeyres Index and the Paasche Index.
• It provides a balanced or "ideal" measurement of price changes by
taking the geometric mean of these two indices.
Formula:
F01 = √(L × P) = √[ (Σp1q0 / Σp0q0) × (Σp1q1 / Σp0q1) ] × 100
where L is the Laspeyres index and P is the Paasche index.
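The geometric mean of the Laspeyres and Paasche indices can be sketched in Python; the price and quantity lists below are hypothetical:

```python
from math import sqrt

def fisher_index(p0, q0, p1, q1):
    """Fisher's Ideal Index: geometric mean of Laspeyres and Paasche (base = 100)."""
    laspeyres = 100 * sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))
    paasche = 100 * sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))
    return sqrt(laspeyres * paasche)

# If every price doubles while quantities stay put, Laspeyres, Paasche
# and Fisher all agree at 200.
print(fisher_index([1, 2], [1, 1], [2, 4], [1, 1]))
```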
Illustration 12
On the basis of the following information, calculate Fisher's
index number:
Solution
Illustration 13
Construct Fisher’s Ideal Index Number using the data
given below.
Time Reversal Test and Factor Reversal
Test
Time Reversal Test and Factor Reversal Test are two important consistency checks for
index numbers, particularly when measuring price and quantity changes over time.
Time Reversal Test
• The Time Reversal Test checks whether the index number remains consistent if we
reverse the time periods.
• In simple terms, if you calculate an index from the base period to the current period
and then reverse it (from the current back to the base), the product of these two
indexes should equal 1 (or 100 if we express it as a percentage).
For example:
• If the price index from period 0 to period 1 is 120, the index from period 1 back to
period 0 should ideally be 100 / 120 = 0.8333 (or 83.33).
• The Time Reversal Test is passed if the multiplication of these two indexes results in
1.
• This test helps confirm that the index number does not depend on the direction of
time.
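Fisher's Ideal Index is known to satisfy the Time Reversal Test, so the check can be verified numerically; the data values in this Python sketch are arbitrary, chosen only for illustration:

```python
from math import sqrt

def fisher_ratio(p0, q0, p1, q1):
    """Fisher's index as a pure ratio (no x100 scaling)."""
    laspeyres = sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))
    paasche = sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))
    return sqrt(laspeyres * paasche)

def passes_time_reversal(p0, q0, p1, q1):
    """P01 x P10 should equal 1 when the two time periods are interchanged."""
    forward = fisher_ratio(p0, q0, p1, q1)
    backward = fisher_ratio(p1, q1, p0, q0)
    return abs(forward * backward - 1.0) < 1e-9

print(passes_time_reversal([2, 3], [4, 5], [3, 4], [5, 6]))  # True
```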
Factor Reversal Test
• The Factor Reversal Test checks whether an index number
gives consistent results when we swap the roles of prices and
quantities (factors) while calculating value changes.
• In other words, the product of a price index and a quantity
index should equal the value index (which measures total
monetary value).
• According to the Factor Reversal Test, P01 × Q01 should give the value
index V01 = Σp1q1 / Σp0q0, which reflects the overall change in value
over time.
• This test ensures that both price and quantity changes are
accurately captured together in a combined value measure.
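The factor reversal property can be checked numerically the same way: swapping the roles of prices and quantities in Fisher's formula yields the quantity index, and price × quantity should reproduce the value ratio. The data here is again hypothetical:

```python
from math import sqrt

def fisher_ratio(p0, q0, p1, q1):
    """Fisher's price index as a pure ratio (no x100 scaling)."""
    laspeyres = sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))
    paasche = sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))
    return sqrt(laspeyres * paasche)

def passes_factor_reversal(p0, q0, p1, q1):
    """P01 x Q01 should equal the value index V01 = sum(p1*q1) / sum(p0*q0)."""
    price_index = fisher_ratio(p0, q0, p1, q1)
    quantity_index = fisher_ratio(q0, p0, q1, p1)  # roles of p and q swapped
    value_index = sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q0))
    return abs(price_index * quantity_index - value_index) < 1e-9

print(passes_factor_reversal([2, 3], [4, 5], [3, 4], [5, 6]))  # True
```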
University questions
Rank correlation
Correlation versus covariance
Coefficient of determination
Standard error
What are index numbers? Explain the concept and uses.
Explain the regression analysis. What is the basic form of a regression
equation? Interpret the terms in the equation.
What do the dependent and independent variables indicate in simple
regression? Explain with the help of an example.