Tentative schedule
Sessions       Topic coverage
1              Course preliminaries
2              Measures of central tendency
3 & 4          Probability concepts
5 & 6          Random variables
7, 8 & 9       Discrete and continuous distributions
10, 11 & 12    Correlation and regression
13             Sampling
14             Hypothesis testing
15 & 16        Statistical process control
17 & 18        Decision analysis
19 & 20        Time series methods
Evaluation components
Component                                      Weight   Remarks
Quizzes (best 3 out of 4)                      30%      Will be held at the end of the 5th, 10th, 15th and 19th sessions
Case analysis (2 cases; submit as assignment)  10%      1. Individual submission; 2. Structure will be discussed at a later stage
Mid-term examination                           25%
End-term examination                           35%
Data
Mid term marks obtained by PGP 2017-19
batch in QM-1 mid term
Data: Raw and unorganized form
Information: Inferences obtained from data
Variable – (Vary + able) - A range of observed values
✓ Gender
✓ Height
✓ Weight
✓ Age
Value: Female Value: Male
Competitiveness of a financial product
Potential variables
• Rate of return (Negative, positive)
• Level of risk (Low, Medium, High)
• Planning horizon (Short term, Intermediate term, Long term)
• Tax implications (Tax exempt, Tax free)
• Sector (Manufacturing, Services, Pharma, Aviation)
Data are the observed values of a VARIABLE
Quantitative/Numerical data: Real numbers (height, age etc.)
Qualitative/Categorical data: Single, Married, Divorced
Ordinal data: “Poor”; “Fair”, “Good”; “Very good”; “Excellent”
Nominal data: numbers used only as labels, e.g. 0657 2424387 / 011 2539009,
831004 / 492015
Population: All possible observations together.
Sample: Subset of the population
Population Sample
Measures of central tendency
Mean (average): (Sum of all the data values) / (Number of data values)
µ = (Σ xi) / N (Population)
x̄ = (Σ xi) / n (Sample)
Sample mean vs. population mean
Salary hike conundrum
Average industry salary (per month): 25,000/-
Programmer 1: 30,000/-
Programmer 2: 28,000/-
Programmer 3: 32,000/-
Programmer 4: 5,000/-
Programmer 5: 5,000/-
Boss says: "I will give a raise only if our programmers' average
salary is lower than the industry average; otherwise not."
Mean
Merits: Considers all data points
Demerits: Vulnerable to outliers
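The salary example can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names are illustrative and not from the slides.

```python
# Salaries (per month) of the five programmers in the example.
salaries = [30_000, 28_000, 32_000, 5_000, 5_000]
industry_average = 25_000

mean_salary = sum(salaries) / len(salaries)
print(mean_salary)                      # 20000.0
print(mean_salary < industry_average)   # True
```

The two salaries of 5,000 pull the mean below the industry average even though three of the five programmers earn more than it, which is the outlier sensitivity listed as a demerit.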
Median: the middle value
Arrange in increasing order
Programmers     Salary (per month)   Salary (per month), sorted
Programmer 1    30,000/-             5,000/-
Programmer 2    28,000/-             5,000/-
Programmer 3    32,000/-             28,000/-
Programmer 4    5,000/-              30,000/-
Programmer 5    5,000/-              32,000/-
When n (number of observations) is odd, then
Median = ((n+1)/2)th observation
When n (number of observations) is even, then
Median = Average of the (n/2)th and ((n+2)/2)th observations
Programmers     Salary (per month)
Programmer 1    30,000/-
Programmer 2    28,000/-
Programmer 3    32,000/-
Programmer 4    5,000/-
Programmer 5    5,000/-
Programmer 6    37,000/-
Median ???
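The odd and even rules can be sketched as one small function. The function below is an illustrative helper written for this note, applied to the five and six programmer salary lists.

```python
def median(values):
    """Median via the (n+1)/2 rule (odd n) or the average of the
    n/2-th and (n+2)/2-th sorted observations (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]  # the rule counts positions from 1
    return (ordered[n // 2 - 1] + ordered[(n + 2) // 2 - 1]) / 2

five = [30_000, 28_000, 32_000, 5_000, 5_000]
six = five + [37_000]
print(median(five))  # 28000
print(median(six))   # 29000.0
```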
Median
Merits: Not affected by outliers
Demerits: Cannot be used to draw further inferences
Mode
The most commonly occurring value(s)
✓ Can have more than one mode
✓ It is also possible to have no mode at all
Car Crash test
In a crash test, 11 cars were tested to determine what impact
speed was required to obtain minimal bumper damage. Find the
mode of the speeds given in miles per hour below.
24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24
A marathon race was completed by 5 participants. What is the
mode of these times given in hours?
2.7 hr, 8.3 hr, 3.5 hr, 5.1 hr, 4.9 hr
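Both exercises can be worked with a frequency count. The helper below is a sketch; treating "every value occurs once" as "no mode" follows the slide's statement that a data set may have no mode at all.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values; an empty list if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs exactly once: no mode
    return sorted(v for v, c in counts.items() if c == top)

speeds = [24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24]
times = [2.7, 8.3, 3.5, 5.1, 4.9]
print(modes(speeds))  # [18, 24]
print(modes(times))   # []
```

The crash-test data has two modes (18 and 24 each occur three times), while the marathon times have none.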
Grouping
{4, 7, 11, 16, 20, 22, 25, 26, 33}
0-9: 2 values (4 and 7)
10-19: 2 values (11 and 16)
20-29: 4 values (20, 22, 25 and 26)
30-39: 1 value (33)
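When no individual value repeats, grouping into classes still shows where the data concentrate. A minimal sketch of the class counts, assuming a class width of 10 as in the listing:

```python
from collections import Counter

data = [4, 7, 11, 16, 20, 22, 25, 26, 33]
width = 10  # class width used in the grouping

# Map each value to the lower limit of its class, then count per class.
counts = Counter((x // width) * width for x in data)
for lower in sorted(counts):
    print(f"{lower}-{lower + width - 1}: {counts[lower]}")

modal_lower = max(counts, key=counts.get)
print(f"modal class: {modal_lower}-{modal_lower + width - 1}")  # modal class: 20-29
```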
Mode
Merits
- It is not affected by extreme value
- Lists out the most frequent values
Demerits
- Subject to sample fluctuations
Shoe making industry
The number of sick days due to cold and flu last year was
recorded for a sample of 15 adults. The data are as follows:
5 7 0 3 15 6 5 9 3 8 10 6 2 0 12
Compute the mean, median, and mode.
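One way to check the three measures is Python's standard `statistics` module (`multimode` requires Python 3.8 or later). This is a sketch of the computation, not a prescribed method.

```python
from statistics import mean, median, multimode

sick_days = [5, 7, 0, 3, 15, 6, 5, 9, 3, 8, 10, 6, 2, 0, 12]

print(round(mean(sick_days), 2))      # 6.07
print(median(sick_days))              # 6
print(sorted(multimode(sick_days)))   # [0, 3, 5, 6]
```

Four values (0, 3, 5 and 6) each occur twice, so the data set has more than one mode.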
Exercise
An analysis was made of the monthly incentives received by 5 salesmen.
The mean and the median of the incentives are both $7,000. The only mode
among the observations is $12,000. Incentives paid to each
salesman were in full thousands. What is the difference between
the highest and the lowest incentive received by the 5 salesmen
in the month?
Returns in a quarter (population)
Quarter   Fund 1 (Rs.)   Fund 2 (Rs.)
1         1000           700
2         1100           1300
3         900            1600
4         1000           400
Total     4000/-         4000/-
[Chart: Fund 1 vs Fund 2, returns (Rs.) by quarter]
Range
Range = (Largest observation – Smallest observation)
Returns in a quarter (population)
Quarter   Fund 1 (Rs.)   Fund 2 (Rs.)
1         1000           700
2         1100           1300
3         900            1600
4         1000           400
Total     4000/-         4000/-
Range for fund 1??
Range for fund 2??
Range
Data set 1
(200, 300, 150, 450, 500)
Data set 2
(200, 300, 150, 450, 500, 10 )
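The range of the two funds and of the two data sets can be computed directly. The helper name `value_range` is illustrative.

```python
def value_range(values):
    """Range = largest observation - smallest observation."""
    return max(values) - min(values)

fund_1 = [1000, 1100, 900, 1000]
fund_2 = [700, 1300, 1600, 400]
data_set_1 = [200, 300, 150, 450, 500]
data_set_2 = data_set_1 + [10]

print(value_range(fund_1), value_range(fund_2))          # 200 1200
print(value_range(data_set_1), value_range(data_set_2))  # 350 490
```

Both funds have the same total return, yet their ranges differ by a factor of six. Adding the single value 10 to data set 1 changes the range from 350 to 490, which shows how sensitive the range is to one extreme observation.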
Variance
Population variance: σ² = Σ (xi - µ)² / N
Sample variance: s² = Σ (xi - x̄)² / (n - 1)
Returns in a quarter (population)
Quarter   Fund 1 (Rs.)   Fund 2 (Rs.)
1         1000           700
2         1100           1300
3         900            1600
4         1000           400
Total     4000/-         4000/-
Worked example for fund 1:
      xi      µ       (xi - µ)   (xi - µ)²
x1    1000    1000    0          0
x2    1100    1000    100        10000
x3    900     1000    -100       10000
x4    1000    1000    0          0
Sum                              20000
Variance (fund 1) = 20000/4 = 5000
Exercise
Calculate variance of fund 2.
Returns in a quarter (population)
Quarter   Fund 2 (Rs.)
1         700
2         1300
3         1600
4         400
Answer: 225,000
Std. Dev. (Population): σ = √Variance = √σ²
Std. Dev. (Sample): s = √Variance = √s²
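The variance and standard deviation of both funds can be checked as follows. This sketch treats the four quarters as the whole population (division by N), as the slides do; `population_variance` is an illustrative helper.

```python
from math import sqrt

def population_variance(values):
    """Population variance: average squared deviation from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

fund_1 = [1000, 1100, 900, 1000]
fund_2 = [700, 1300, 1600, 400]
for name, fund in (("Fund 1", fund_1), ("Fund 2", fund_2)):
    var = population_variance(fund)
    print(name, var, round(sqrt(var), 2))
# Fund 1 5000.0 70.71
# Fund 2 225000.0 474.34
```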
Covariance
Economic growth rate vs. stock market appreciation rate
Sample covariance: sxy = Σ (xi - x̄)(yi - ȳ) / (n - 1)
Population covariance: σxy = Σ (xi - µx)(yi - µy) / N
Sample data
X: Economic growth rate (%)   Y: Stock market appreciation (%)
3                             7
5                             12
1                             5
Sx = ?? Sy = ?? Sxy = ??
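The three sample quantities can be computed from the definitions, dividing by n - 1 because the data are a sample. A minimal sketch with illustrative variable names:

```python
from math import sqrt

x = [3, 5, 1]    # economic growth rate (%)
y = [7, 12, 5]   # stock market appreciation (%)
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
s_x = sqrt(sum((a - x_bar) ** 2 for a in x) / (n - 1))
s_y = sqrt(sum((b - y_bar) ** 2 for b in y) / (n - 1))

print(s_xy)            # 7.0
print(s_x)             # 2.0
print(round(s_y, 3))   # 3.606
```

The positive covariance says the two rates move in the same direction in this sample, but its magnitude alone does not say how strongly.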
Covariance
✓ Indicates only type of linear relationship i.e. positive or
negative
✓ Does not indicate the extent or strength of relationship
The correlation between two variables X and Y is a measure of
the degree of linear association between the two
▪ r = Coefficient of correlation (sample)
▪ ρ = Coefficient of correlation (population)
▪ r = sxy / (sx · sy); ρ = σxy / (σx · σy)
▪ -1 ≤ r ≤ 1; -1 ≤ ρ ≤ 1
Interpretation of r or ρ
✓ Value is 0 (No correlation)
✓ Value is 1 (Perfect positive correlation)
✓ Value is -1 (Perfect negative correlation)
Exercise
Variable X                        Variable Y                                                           Correlation
Marks obtained in QM-1 course     Number of quality hours devoted                                      Positive or negative ??
Savings at the end of the month   Expenses incurred during the month                                   Positive or negative ??
Net weight reduced in a month     Number of quality hours devoted for physical exercise in a month     Positive or negative ??
Fuel prices in a year             Your monthly fuel expense                                            Positive or negative ??
Consider the following population data.
Find the value of ρ and the covariance for the data set. Comment
on the results.
X    Y
8    25
9    20
7    30
4    70
2    100
1. Coefficient of correlation (-0.988)
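The stated value of ρ can be reproduced from the population formulas (division by N). This is a sketch of that calculation; variable names are illustrative.

```python
from math import sqrt

x = [8, 9, 7, 4, 2]
y = [25, 20, 30, 70, 100]
N = len(x)
mu_x, mu_y = sum(x) / N, sum(y) / N

cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / N
sigma_x = sqrt(sum((a - mu_x) ** 2 for a in x) / N)
sigma_y = sqrt(sum((b - mu_y) ** 2 for b in y) / N)
rho = cov_xy / (sigma_x * sigma_y)

print(cov_xy)          # -80.0
print(round(rho, 3))   # -0.988
```

The covariance of -80 only signals a negative relationship, while ρ close to -1 shows that the relationship is almost perfectly linear.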
Random experiment
- An action that leads to a number of possible outcomes
Random experiment                  Possible outcomes                            Total number of possible outcomes
Flipping a fair coin               Head and Tail                                2
Rolling a die                      1, 2, 3, 4, 5, 6                             6
Grade in a course at IIM-Raipur    A+; A; A-; B+; B; B-; C+; C; C-; D; F; I     12
In order to calculate the probability
1. The listed outcomes must be exhaustive (all possible outcomes must
be counted)
2. The outcomes must be mutually exclusive (no two or more outcomes
can occur at the same time)
[Figure: Venn diagrams of events A, B and C]
Flipping a coin / Rolling a die: the outcomes are
1. Mutually exclusive
2. Exhaustive
Sample space: List of all possible outcomes of the experiment
Sample space
(1, 2, 3, 5, 6)
Is this sample space correct??
Probability: Likelihood of a favorable event from the sample
space.
p (Head) = Number of heads obtained /
Total number of exhaustive outcomes
For a single flip = 1/2
✓ Suppose there are N different events within a sample space S.
✓ Event Ei is the ith event. The events are mutually exclusive.
✓ S = {E1, E2, …, Ei, …, EN}
✓ 0 ≤ p(Ei) ≤ 1
✓ p(E1) + p(E2) + … + p(Ei) + … + p(EN) = 1
For a single roll:
p (getting 2) = Number of outcomes showing 2 /
Total number of exhaustive outcomes
= ???
For a single game for a given team (winning, losing and a tie are equally likely):
p (winning) = ???
Multiple random experiment
All possible outcomes (First coin, Second coin):
(H, H)
(H, T)
(T, H)
(T, T)
Find ???
1. p (Exactly one head)
2. p (Exactly one tail)
3. p (At least one head)
4. p (Exactly two tails)
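The four probabilities can be found by listing the sample space and counting favourable outcomes. The sketch below assumes two fair coins, so that all four outcomes are equally likely; `prob` is an illustrative helper.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips: (H,H), (H,T), (T,H), (T,T)
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Favourable outcomes / total outcomes, for equally likely outcomes."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

print(prob(lambda o: o.count("H") == 1))  # 1/2
print(prob(lambda o: o.count("T") == 1))  # 1/2
print(prob(lambda o: "H" in o))           # 3/4
print(prob(lambda o: o.count("T") == 2))  # 1/4
```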
Some basic relationships of probability
Complementary relationship
[Venn diagram: the sample space is split into Event A and its complement Ac]
p(A) + p(Ac) = 1
Sample space on a given day: (Rain, No rain)
p(Rain) + p(No Rain) = 1
On a given day, the probability of rain is 0.4.
Find the probability of no rain on that day??
Sample space, assuming the game takes place: (Winning, Tie, Losing)
The combined probability of a team winning or tying a game is
0.87. Find the probability of the team losing???
Addition Law
[Venn diagram: sample space with overlapping Event A and Event B; the overlap is "Event A and B"]
p(A U B) = p(A) + p(B) – p(A ∩ B)
Consider the case of a small assembly plant with 50 employees.
Each worker is expected to complete the work assignments on
time and in such a way that the assembled product will pass a
final inspection. On occasion, some of the workers fail to
meet the performance standards by completing work late or
assembling a defective product. At the end of a performance
evaluation period, the production manager found that 5 out of 50
workers completed the work late, 6 out of 50 workers
assembled a defective product, and 2 out of 50 workers both
completed work late and assembled a defective product.
Find the probability that the production manager
assigned a poor rating to a worker, i.e. that the worker
completed work late or assembled a defective product??
Among all the employees who left an organization, a study
showed that 30% left the firm because
they were unsatisfied with the salary. 20% left because they
were not satisfied with the job role. 12% were dissatisfied with both
job role and salary.
Find the probability that an employee left for reasons other than
salary and job role???
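Both exercises apply the addition law. The sketch below uses exact fractions and assumes, for the assembly plant, that a poor rating means "late or defective", and, for the attrition study, that "other reasons" is the complement of "salary or job role".

```python
from fractions import Fraction

def p_union(p_a, p_b, p_a_and_b):
    """Addition law: p(A U B) = p(A) + p(B) - p(A ∩ B)."""
    return p_a + p_b - p_a_and_b

# Assembly plant: late (5/50), defective (6/50), both (2/50).
p_poor = p_union(Fraction(5, 50), Fraction(6, 50), Fraction(2, 50))
print(p_poor, float(p_poor))    # 9/50 0.18

# Attrition study: salary (30%), job role (20%), both (12%).
p_salary_or_role = p_union(Fraction(30, 100), Fraction(20, 100), Fraction(12, 100))
p_other = 1 - p_salary_or_role  # complement of "salary or job role"
print(p_other, float(p_other))  # 31/50 0.62
```

Subtracting the intersection avoids counting twice the workers (or leavers) who fall into both categories.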
[Venn diagram: sample space with non-overlapping Event A and Event B (no intersection)]
p(A ∩ B) = 0
p(A U B) = p(A) + p(B) - p(A ∩ B)
p(A U B) = p(A) + p(B)
For its new car prototype, Tata Motors has to procure engines
from one of three engine manufacturers. The three manufacturers are
Cummins, Detroit Diesel, and Caterpillar. Tata Motors can
purchase only from a single source.
The probabilities of purchase from Cummins and Detroit Diesel are
0.2 and 0.45 respectively.
Find the probability that Tata Motors purchases the engine
from Caterpillar????
Conditional probability
Large fleet owner
- Tata Motors trucks
- Non Tata Motors trucks
p (Tata Motors trucks) = p (TM)
p (Non Tata Motors trucks) = p (NTM)
Contingency table
                   Cummins Engine   Non Cummins Engine
Tata Motors        p (TM∩C)         p (TM∩NC)
Non Tata Motors    p (NTM∩C)        p (NTM∩NC)
                   Cummins Engine   Non Cummins Engine   Total
Tata Motors        288              36                   324
Non Tata Motors    672              204                  876
Total              960              240                  1200
p(C) = 960/1200 = 0.8
p(NC) = 240/1200 = 0.2
p(TM) = 324/1200 = 0.27
p(NTM) = 876/1200 = 0.73
                   Cummins Engine     Non Cummins Engine   Total
Tata Motors        288/1200 = 0.24    36/1200 = 0.03       0.27
Non Tata Motors    672/1200 = 0.56    204/1200 = 0.17      0.73
Total              0.8                0.2                  1
Joint probabilities (inner cells): p (TM∩C), p (TM∩NC), p (NTM∩C), p (NTM∩NC)
Marginal probabilities: the row and column totals
                   Cummins Engine       Non Cummins Engine
Tata Motors        p (TM∩C) = 0.24      p (TM∩NC) = 0.03
Non Tata Motors    p (NTM∩C) = 0.56     p (NTM∩NC) = 0.17
p(C) = 0.8; p(NC) = 0.2; p(TM) = 0.27; p(NTM) = 0.73
p (TM/C) = p (TM∩C)/ p (C) = ???
p (NTM/C) = p (NTM∩C)/ p (C) = ???
p (TM/NC) = p (TM∩NC)/ p (NC) = ???
p (NTM/NC) = p (NTM∩NC)/ p (NC) = ???
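The four conditional probabilities follow from the truck counts by dividing each joint probability by the matching marginal. The sketch below works from the counts of 1200 trucks; the dictionary layout and helper names are illustrative.

```python
from fractions import Fraction

counts = {
    ("TM", "C"): 288, ("TM", "NC"): 36,
    ("NTM", "C"): 672, ("NTM", "NC"): 204,
}
total = sum(counts.values())  # 1200

def p_joint(make, engine):
    return Fraction(counts[(make, engine)], total)

def p_engine(engine):
    # Marginal probability: sum the joint probabilities over both makes.
    return sum(p_joint(make, engine) for make in ("TM", "NTM"))

def p_make_given_engine(make, engine):
    return p_joint(make, engine) / p_engine(engine)

for engine in ("C", "NC"):
    for make in ("TM", "NTM"):
        print(f"p({make}/{engine}) =", float(p_make_given_engine(make, engine)))
# p(TM/C) = 0.3
# p(NTM/C) = 0.7
# p(TM/NC) = 0.15
# p(NTM/NC) = 0.85
```

For a fixed engine type the two conditional probabilities sum to 1, which is a quick check on the arithmetic.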
Annual Performance Appraisal
p (A / B) = p (A ∩ B)/ p(B)
p (B / A) = p (A ∩ B)/ p(A)
                Men    Women   Total
Promoted        500    300     800
Not promoted    -      500     -
Total           700    -       1500
a. Find p (Promoted/Women) ???
b. Find p (Not promoted/Men) ???
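The blank cells of the appraisal table can be filled from the totals before applying the conditional probability formula. A minimal sketch, reading the table as 500 men and 300 women promoted, 700 men in total and 1500 employees overall:

```python
from fractions import Fraction

promoted = {"Men": 500, "Women": 300}
total_men, grand_total = 700, 1500

total_women = grand_total - total_men            # 800
not_promoted_men = total_men - promoted["Men"]   # 200

p_promoted_given_women = Fraction(promoted["Women"], total_women)
p_not_promoted_given_men = Fraction(not_promoted_men, total_men)
print(p_promoted_given_women)     # 3/8
print(p_not_promoted_given_men)   # 2/7
```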
A certain portfolio of a financial firm has a number of mutual
fund and non-mutual fund offerings. Both of these offerings
are of two types, i.e. tax exempt and non-tax exempt. The
numbers of mutual fund and non-mutual fund offerings are
50 and 150 respectively. The numbers of tax exempt and non-tax
exempt offerings are 50 and 150 respectively. The number of mutual
fund offerings that are also tax exempt is 20. Find the following:
a) Probability that a certain customer chooses a mutual fund
given that it is tax exempt.
b) Probability that a certain customer chooses a non-mutual
fund given that it is non-tax exempt.
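A sketch of the fund exercise, assuming the customer is equally likely to choose any of the 200 offerings (the slide does not state this explicitly). The missing cells are derived from the totals.

```python
from fractions import Fraction

mutual, non_mutual = 50, 150
tax_exempt, non_tax_exempt = 50, 150
mutual_and_exempt = 20

# Fill the remaining cells of the contingency table from the totals.
mutual_and_non_exempt = mutual - mutual_and_exempt                  # 30
non_mutual_and_non_exempt = non_tax_exempt - mutual_and_non_exempt  # 120

p_mutual_given_exempt = Fraction(mutual_and_exempt, tax_exempt)
p_non_mutual_given_non_exempt = Fraction(non_mutual_and_non_exempt, non_tax_exempt)
print(p_mutual_given_exempt)            # 2/5
print(p_non_mutual_given_non_exempt)    # 4/5
```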
A forging company makes only two types of axles, namely Rear
Axle and Front Axle. In the last financial year, the number of
Rear Axles manufactured was 8,000 out of the total 20,000
units manufactured. The company deploys two types of
manufacturing processes, namely Forging and Austenitic Ductile
Iron (ADI) Casting, for producing the two types of axles. In the last
financial year, the company had a production capacity of 20,000
axles from the Forging process. However, the company could utilize
only 80% of the given capacity. Based on the inventory reconciliation
exercise, it was found that the probability of deploying Forging as the
manufacturing process, given that a Rear Axle was produced, was 0.5.
Find the number of Front Axles that were produced using ADI.
Each year, ratings are compiled concerning the performance of new
cars during the first 90 days of use. Cars are categorized according to
- whether the car needs warranty-related repair
- whether the car is manufactured in the USA
Based on the data collected, it was found that
P (Car needs warranty repair) = 0.04
P (Car manufactured in USA) = 0.60
P (Needs warranty repair and manufactured in USA) = 0.025
1. Make a contingency table
2. Obtain the probability that
a. A new car needs warranty repair or is manufactured in the USA
b. A new car needs warranty repair given it is not manufactured in the USA
c. A new car does not need repair given it is manufactured in the USA
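The three requested probabilities can be derived from the two marginals and the joint probability. The sketch below uses exact fractions; W and U are illustrative labels for "needs warranty repair" and "manufactured in USA".

```python
from fractions import Fraction

p_w = Fraction("0.04")         # needs warranty repair
p_u = Fraction("0.60")         # manufactured in USA
p_w_and_u = Fraction("0.025")  # both

p_w_or_u = p_w + p_u - p_w_and_u                  # addition law
p_w_given_not_u = (p_w - p_w_and_u) / (1 - p_u)   # p(W ∩ not U) / p(not U)
p_not_w_given_u = (p_u - p_w_and_u) / p_u         # p(not W ∩ U) / p(U)

print(float(p_w_or_u))                    # 0.615
print(float(p_w_given_not_u))             # 0.0375
print(round(float(p_not_w_given_u), 4))   # 0.9583
```

The numerators in (b) and (c) are the off-diagonal cells of the contingency table, obtained by subtracting the joint probability from the relevant marginal.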
In a lot of used cars
70% have air conditioning (AC)
40% have a CD player (CD)
20% of the cars have both.
1. Make a contingency table.
2. What is the probability that a car has a CD player,
given that it has AC?
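A sketch of the used-car exercise: the contingency table is built from the three given percentages, and the conditional probability is read off it. The dictionary keys are illustrative labels.

```python
from fractions import Fraction

p_ac, p_cd, p_both = Fraction(70, 100), Fraction(40, 100), Fraction(20, 100)

# Joint probabilities for the four cells of the contingency table.
table = {
    ("AC", "CD"): p_both,
    ("AC", "no CD"): p_ac - p_both,
    ("no AC", "CD"): p_cd - p_both,
    ("no AC", "no CD"): 1 - p_ac - p_cd + p_both,
}
p_cd_given_ac = table[("AC", "CD")] / p_ac
print(p_cd_given_ac)   # 2/7
```

The four cells sum to 1, and 2/7 (about 0.29) is lower than the unconditional 40% share of cars with a CD player.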