(Important)
PROBABILITY AND STATISTICS
Probability is a branch of mathematics that quantifies the likelihood of an event
occurring. It ranges from 0 (an impossible event) to 1 (a certain event). It helps in
making predictions based on known information.
Important Concepts
1. Sample Space (S): The set of all possible outcomes.
2. Types of Events:
o Independent Events: Events where the occurrence of one does not
affect the occurrence of another.
o Dependent Events: Events where the occurrence of one affects the
occurrence of another.
o Disjoint Events: Events that have no common outcomes. If A and
B are disjoint, then $P(A \cap B) = 0$.
o Mutually Exclusive Events: Events that cannot happen
simultaneously.
o Complementary Events: If event A occurs, its complement A' does
not occur, and vice versa.
o Equally Likely Events: All outcomes have the same probability.
o Exhaustive Events: A set of events that covers all possible
outcomes.
3. Conditional Probability:
o The probability of an event occurring given that another event has
already occurred.
4. Probability with and without Replacement:
o With Replacement: Events remain independent.
o Without Replacement: Events become dependent.
Important Formulae
1. Probability of an Event: $P(A) = \dfrac{n(A)}{n(S)}$.
2. For any event A, $0 \le P(A) \le 1$.
3. For a sample space S, $P(S) = 1$.
4. Addition Rule:
o For 2 events: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
o For 3 events:
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
5. Multiplication Rule:
o Independent Events: $P(A \cap B) = P(A)\,P(B)$
o Dependent Events: $P(A \cap B) = P(A)\,P(B \mid A)$
6. Bayes’ Theorem:
o For 2 events: $P(A \mid B) = \dfrac{P(B \mid A)\,P(A)}{P(B)}$
o For n events: $P(A_k \mid B) = \dfrac{P(B \mid A_k)\,P(A_k)}{\sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)}$
7. Total Probability Theorem: $P(B) = \sum_{i=1}^{n} P(B \mid A_i)\,P(A_i)$.
8. At Least and Exactly Probabilities:
o $P(\text{at least 2 events}) = P(A \cap B) + P(A \cap C) + P(B \cap C) - 2P(A \cap B \cap C)$
o $P(\text{exactly 1 event}) = P(A) + P(B) + P(C) - 2P(A \cap B) - 2P(A \cap C) - 2P(B \cap C) + 3P(A \cap B \cap C)$
9. Probability of exactly k successes in n trials: $P(k) = \dbinom{n}{k} p^{k} (1 - p)^{n-k}$.
10. Expectation & Variance: $E(X) = \sum x\,P(X = x)$, $\operatorname{Var}(X) = E(X^2) - (E(X))^2$.
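Formulas 9 and 10 can be checked numerically. The following is a minimal Python sketch, assuming a hypothetical experiment (the number of sixes in 4 rolls of a fair die); the function names are illustrative and not part of these notes.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials), as in formula 9."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def expectation(dist):
    """E(X) = sum of x * P(X = x) over a {value: probability} mapping."""
    return sum(x * px for x, px in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - (E(X))^2, as in formula 10."""
    mean = expectation(dist)
    return sum(x * x * px for x, px in dist.items()) - mean**2

# Hypothetical example: number of sixes in n = 4 rolls of a fair die.
n, p = 4, Fraction(1, 6)
dist = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

print(sum(dist.values()))                 # total probability of the distribution
print(expectation(dist), variance(dist))  # compare with n*p and n*p*(1-p)
```

Using `Fraction` keeps every probability exact, so the results can be compared with $np$ and $np(1-p)$ without rounding error.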
Statistics:
The word statistics is derived from the Latin word ‘status’, which means a state.
Statistics refers to the process of collecting, analysing, interpreting, presenting,
and organising data. Statistics is used to analyse data by measuring central
tendencies (mean, median, and mode) and dispersion (range, variance, and
standard deviation). It plays a crucial role in various fields, including science,
economics, and engineering.
Types of Averages
Averages refer to different measures of central tendency, including:
Mean:
The arithmetic average of a data set.
• Discrete Data: $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$
• Grouped Data: $\bar{x} = \dfrac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$, where $f_i$ is the frequency and $x_i$ is the mid-point.
Median:
The middle value when the data set is arranged in ascending order.
For Ungrouped Data:
• If the number of terms n is odd: Median = $\left(\dfrac{n+1}{2}\right)$th term.
• If the number of terms n is even: Median = Average of the $\left(\dfrac{n}{2}\right)$th and $\left(\dfrac{n}{2} + 1\right)$th terms.
For Grouped Data:
Median = $l + \dfrac{\frac{N}{2} - c}{f} \times h$, where:
• l = lower limit of the median class
• N = total frequency
• f = frequency of the median class
• h = width of the median class
• c = cumulative frequency of the class preceding the median class
Mode:
Mode represents the most frequently occurring data point. For grouped data:
Mode = $l + \dfrac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$, where:
• l = lower boundary of the modal class
• $f_1$ = frequency of the modal class
• $f_0$ = frequency of the class before the modal class
• $f_2$ = frequency of the class after the modal class
• h = class width
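The grouped-data formulas for the median and the mode can be sketched in Python as follows. This is an illustration under stated assumptions: the classes have equal width, the frequency table is hypothetical, and the function names are not part of these notes.

```python
from fractions import Fraction

def grouped_median(lower_limits, width, freqs):
    """Median = l + ((N/2 - c) / f) * h for equal-width classes."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("empty frequency table")
    half = Fraction(total, 2)
    cumulative = 0  # c: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= half:  # first class whose cumulative count reaches N/2
            return lower + (half - cumulative) / f * width
        cumulative += f

def grouped_mode(lower_limits, width, freqs):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * h for equal-width classes.

    Raises ZeroDivisionError when f0 = f1 = f2, where the formula is undefined.
    """
    m = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0                  # no class before: treat as 0
    f2 = freqs[m + 1] if m + 1 < len(freqs) else 0     # no class after: treat as 0
    return lower_limits[m] + Fraction(f1 - f0, 2 * f1 - f0 - f2) * width

# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 9, 6]
print(grouped_median(lower_limits, 10, freqs))  # 155/6
print(grouped_mode(lower_limits, 10, freqs))    # 180/7
```

Here N = 40, so the median class is 20-30 with c = 13 and f = 12, and the modal class is also 20-30 with $f_0 = 8$ and $f_2 = 9$.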
Dispersion in Statistics
Dispersion measures the spread of data points from the central value.
Absolute Measures of Dispersion
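1. Range: $R = X_{\max} - X_{\min}$
2. Mean Deviation: $MD = \dfrac{\sum |x_i - \bar{x}|}{n}$
3. Variance: $\sigma^2 = \dfrac{\sum (x_i - \bar{x})^2}{n}$ or $\sigma^2 = \dfrac{\sum x_i^2}{n} - (\bar{x})^2$.
4. Standard Deviation: $\sigma = \sqrt{\sigma^2}$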
Relative Measures of Dispersion
Used for comparing different data sets.
1. Coefficient of Range: $\dfrac{X_{\max} - X_{\min}}{X_{\max} + X_{\min}}$
2. Coefficient of Mean Deviation: $\dfrac{MD}{\text{Mean}}$ or $\dfrac{MD}{\text{Median}}$.
3. Coefficient of Variation (CV): $CV = \dfrac{\sigma}{\bar{x}} \times 100$.
Combined formulas:
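1. Combined mean: $\bar{x} = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
2. Combined variance: $\sigma^2 = \dfrac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2} + \dfrac{n_1 n_2 \left( \bar{x}_1 - \bar{x}_2 \right)^2}{(n_1 + n_2)^2}$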
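The combined formulas can be checked against a direct computation on the pooled data. The sketch below is a minimal Python check, assuming two small hypothetical groups; the helper names are illustrative.

```python
from fractions import Fraction

def mean(xs):
    return Fraction(sum(xs), len(xs))

def pvariance(xs):
    """Population variance: sum of squared deviations divided by n."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def combined_mean(n1, m1, n2, m2):
    return (n1 * m1 + n2 * m2) / (n1 + n2)

def combined_variance(n1, m1, v1, n2, m2, v2):
    n = n1 + n2
    return (n1 * v1 + n2 * v2) / n + n1 * n2 * (m1 - m2) ** 2 / n**2

# Hypothetical groups.
g1 = [2, 4, 6, 8]
g2 = [1, 3, 5, 7, 9, 11]
cm = combined_mean(len(g1), mean(g1), len(g2), mean(g2))
cv = combined_variance(len(g1), mean(g1), pvariance(g1),
                       len(g2), mean(g2), pvariance(g2))
print(cm, mean(g1 + g2))       # both equal 28/5
print(cv, pvariance(g1 + g2))  # both equal 231/25
```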
Addition and Multiplication Rules
Addition of a Constant
• Mean: If a constant c is added to each data point, the new mean is $\bar{x} + c$.
• Median: The new median is the original median plus c.
• Mode: The mode increases by c.
• Variance & Standard Deviation: These remain unchanged as dispersion
does not change.
• Coefficient of Range: Changes under addition, because the range is unchanged while the denominator becomes $X_{\max} + X_{\min} + 2c$.
• Mean Deviation: Unchanged under addition.
• Coefficient of Variation (CV): Changes under addition, because σ is unchanged while the mean becomes $\bar{x} + c$.
Multiplication by a Constant
• Mean: If each data point is multiplied by a constant k, the new mean is $k\bar{x}$.
• Median: The new median will be k × (original median).
• Mode: The new mode will be k × (original mode).
• Variance: The new variance will be $k^2 \sigma^2$.
• Standard Deviation: The new standard deviation will be $|k| \sigma$.
• Coefficient of Range: Unchanged under multiplication by a positive constant.
• Mean Deviation: Scales by $|k|$ under multiplication.
• Coefficient of Variation (CV): Unchanged under multiplication by a positive constant.
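A quick numerical check shows how each measure responds to adding a constant or multiplying by one. The sketch below is a minimal Python illustration, assuming a small hypothetical data set with c = 3 and k = 2; the helper names are illustrative.

```python
from fractions import Fraction

def mean(xs):
    return Fraction(sum(xs), len(xs))

def pvariance(xs):
    """Population variance: sum of squared deviations divided by n."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def coeff_of_range(xs):
    return Fraction(max(xs) - min(xs), max(xs) + min(xs))

# Hypothetical data, shift c and scale k.
data = [2, 4, 4, 5, 10]
c, k = 3, 2
shifted = [x + c for x in data]
scaled = [k * x for x in data]

for name, xs in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    # Prints mean, variance and coefficient of range for each version of the data.
    print(name, mean(xs), pvariance(xs), coeff_of_range(xs))
```

For this data the variance is the same before and after the shift, the variance of the scaled data is $k^2$ times the original, and the coefficient of range moves from 2/3 to 4/9 under the shift while staying at 2/3 under the scaling.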
Example 1: A company has HR, IT, and Finance teams. HR has 15 males and 10
females, IT has 25 males and 15 females, and Finance has 30 males and 25
females. If a randomly chosen employee is female, what is the probability that they
are from Finance?
Solution:
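We need to find $P(F \mid Fe)$, where H, I and F denote the HR, IT and Finance teams and Fe denotes the event that the chosen employee is female. The solution assumes that each team is equally likely to be the one the employee comes from.
By Bayes’ theorem,
$P(F \mid Fe) = \dfrac{P(F)\,P(Fe \mid F)}{P(H)\,P(Fe \mid H) + P(I)\,P(Fe \mid I) + P(F)\,P(Fe \mid F)}$
Since there are 3 teams, we have
$P(H) = \dfrac{1}{3}$, $P(I) = \dfrac{1}{3}$, $P(F) = \dfrac{1}{3}$
From the given data,
$P(Fe \mid H) = \dfrac{10}{25}$, $P(Fe \mid I) = \dfrac{15}{40}$, $P(Fe \mid F) = \dfrac{25}{55}$
So, we have
$P(F \mid Fe) = \dfrac{\frac{1}{3} \cdot \frac{25}{55}}{\frac{1}{3} \cdot \frac{10}{25} + \frac{1}{3} \cdot \frac{15}{40} + \frac{1}{3} \cdot \frac{25}{55}}$
$= \dfrac{\frac{25}{55}}{\frac{880 + 825 + 1000}{2200}}$
$= \dfrac{1000}{2705}$
$= \dfrac{200}{541}$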
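The calculation in Example 1 can be reproduced with exact fractions. The sketch below is a minimal Python illustration of Bayes’ theorem over a partition; it keeps the solution's assumption that each team has prior probability 1/3, and the function and variable names are illustrative.

```python
from fractions import Fraction

def posterior(priors, likelihoods, target):
    """Bayes' theorem over a partition: P(target | evidence)."""
    # Denominator: total probability of the evidence.
    total = sum(priors[h] * likelihoods[h] for h in priors)
    return priors[target] * likelihoods[target] / total

# Assumption carried over from the solution: each team is equally likely.
priors = {"HR": Fraction(1, 3), "IT": Fraction(1, 3), "Finance": Fraction(1, 3)}
female_given_team = {
    "HR": Fraction(10, 25),
    "IT": Fraction(15, 40),
    "Finance": Fraction(25, 55),
}
print(posterior(priors, female_given_team, "Finance"))  # 200/541
```

The same function applies to any finite partition, so changing the priors shows directly how the answer depends on that assumption.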
Example 2: A box contains 3 fair coins and 2 biased coins (each biased coin lands
on heads with probability 0.75). A coin is picked at random and flipped twice. If it
lands heads both times, what is the probability that it was a biased coin?
Solution:
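We need to find $P(B \mid HH)$, where B and F denote the events that the chosen coin is biased and fair, respectively.
From the given data, we have,
$P(HH \mid B) = (0.75)^2$
$P(B) = \dfrac{2}{5}$
$P(HH) = P(HH \mid B)\,P(B) + P(HH \mid F)\,P(F)$
$= (0.75)^2 \cdot \dfrac{2}{5} + \left(\dfrac{1}{2}\right)^2 \cdot \dfrac{3}{5}$
So, by Bayes’ theorem, we have,
$P(B \mid HH) = \dfrac{(0.75)^2 \cdot \frac{2}{5}}{(0.75)^2 \cdot \frac{2}{5} + \left(\frac{1}{2}\right)^2 \cdot \frac{3}{5}} = \dfrac{0.225}{0.375} = \dfrac{3}{5}$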
Example 3: For an odd prime p, let $S_p$ be the set of all 2 × 2 matrices with
entries from the set {0, 1, 2, ..., p − 1}. What is the probability that a
randomly chosen matrix in $S_p$ has determinant 0, given that its trace is divisible by
p?
Solution:
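Since the trace is divisible by p and both diagonal entries lie in {0, 1, 2, ..., p − 1}, the matrix is of the form
$\begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix}$ or $\begin{pmatrix} a & b \\ c & p - a \end{pmatrix}$ with a ≠ 0. The number of possible non-zero values for a is p − 1.
For b and c, the number of possibilities in total is $p^2$. Counting a = 0 as well, there are p choices for the diagonal, so the total number of
matrices whose trace is divisible by p is $p^3$.
For the matrix to have determinant 0, the solution considers matrices of the form
$\begin{pmatrix} a & a \\ p - a & p - a \end{pmatrix}$, $\begin{pmatrix} a & p - a \\ a & p - a \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ c & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & b \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.
The number of such matrices is
$((p - 1) \times 2) + 2(p - 1) + 1 = 2p - 2 + 2p - 2 + 1$
$= 4p - 3$
So, the probability that a randomly chosen matrix in $S_p$ has determinant 0 given
that the trace is divisible by p is $\dfrac{4p - 3}{p^3}$.
Note that this list is complete only for p = 3. For p ≥ 5 the condition bc = a(p − a) has further solutions; for example, with p = 5 the matrix $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ has trace 5 and determinant 0. In that case 4p − 3 undercounts, and $\dfrac{4p - 3}{p^3}$ is a lower bound.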
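Counts of this kind are easy to check by brute force for small primes. The sketch below is a minimal Python enumeration, assuming that "determinant 0" means the ordinary integer determinant ad − bc equals 0 (not merely divisible by p); the function name is illustrative.

```python
from itertools import product

def count_matrices(p):
    """Return (favourable, total) for matrices [[a, b], [c, d]] with entries
    in 0..p-1: total counts those whose trace is divisible by p, favourable
    counts those among them whose integer determinant is 0."""
    total = favourable = 0
    for a, b, c, d in product(range(p), repeat=4):
        if (a + d) % p == 0:
            total += 1
            if a * d - b * c == 0:
                favourable += 1
    return favourable, total

print(count_matrices(3))  # (9, 27)
```

For p = 3 this agrees with 4p − 3 = 9 and $p^3$ = 27; calling `count_matrices` with larger primes tests the formula in the same way.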
Example 4: A four-digit number is chosen at random. What is the probability that
its digits are in strictly increasing order?
Solution:
The total number of four-digit numbers is 9999 − 999 = 9000 .
Since the digits are in strictly increasing order, the digit 0 cannot appear: it cannot be the leading digit, and it cannot follow a larger digit.
So, the digits can only come from {1,2,3,4,5,6,7,8,9}.
No digit can repeat, and each set of 4 distinct digits can be arranged in increasing
order in exactly one way, so the number of four-digit numbers with digits in strictly
increasing order is
$\binom{9}{4} = \dfrac{9 \times 8 \times 7 \times 6}{1 \times 2 \times 3 \times 4} = 126$
The probability is $\dfrac{126}{9000} = \dfrac{7}{500}$.
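The count in Example 4 can be confirmed by checking every four-digit number directly. The following is a minimal Python sketch; the helper name is illustrative.

```python
from fractions import Fraction
from math import comb

def strictly_increasing(n):
    """True when each digit of n is smaller than the digit after it."""
    digits = str(n)
    return all(x < y for x, y in zip(digits, digits[1:]))

count = sum(strictly_increasing(n) for n in range(1000, 10000))
print(count, comb(9, 4))      # 126 126
print(Fraction(count, 9000))  # 7/500
```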
Example 5: Consider the set S = {2,3,5,7} . Let Q be the set of all 5-digit numbers
formed using the elements from the set S . Then, what is the probability that a
randomly chosen element from the set Q is divisible by 24 given that it is divisible
by 4 ?
Solution:
For a number to be divisible by 4, its last 2 digits must form a number divisible by 4.
The possibilities for the last 2 digits are 32, 52, 72. The number of possibilities for the first
3 digits is $4^3 = 64$.
The total number of elements of Q that are divisible by 4 is
64 × 3 = 192
For a number to be divisible by 24, it must be divisible by both 3 and 8,
since gcd(3, 8) = 1 and 3 × 8 = 24.
For a number to be divisible by 8 , the last 3 digits should be divisible by 8 .
The possibilities for the digits are 232,272,352,552,752 .
For a number to be divisible by 3, the sum of its digits must be divisible by 3.
The digit sums of 232, 272, 352, 552, 752 are 7, 11, 10, 12, 14. The remainders when
divided by 3 are 1, 2, 1, 0, 2.
The first 2 digits must therefore have a digit sum whose remainder brings the total to a multiple of 3.
Summing each pair of digits from S and finding the remainder modulo 3 gives
( 2,2 ) → 1
( 2,3 ) → 2
( 2,5 ) → 1
( 2,7 ) → 0
( 3,3 ) → 0
( 3,5 ) → 2
( 3,7 ) → 1
( 5,5 ) → 1
( 5,7 ) → 0
( 7,7 ) → 2
Matching each 3-digit ending with the pairs whose remainder completes a multiple of 3, we have
552 → (2,7), (3,3), (5,7)
232, 352 → (2,3), (3,5), (7,7)
272, 752 → (2,2), (2,5), (3,7), (5,5)
A pair of distinct digits can be placed in 2 orders and a repeated pair in 1 order, which gives 5, 5 and 6 ordered choices for the three rows. The number of elements of Q divisible by 24 is
5 + (2 × 5) + (2 × 6) = 5 + 10 + 12
= 27
The probability is $\dfrac{27}{192} = \dfrac{9}{64}$.
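Because Q has only $4^5 = 1024$ elements, the counts in Example 5 can be verified by listing them all. The sketch below is a minimal Python enumeration; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 5-digit numbers whose digits come from S = {2, 3, 5, 7}, repetition allowed.
numbers = [int("".join(ds)) for ds in product("2357", repeat=5)]
div4 = [n for n in numbers if n % 4 == 0]
div24 = [n for n in div4 if n % 24 == 0]

print(len(numbers), len(div4), len(div24))  # 1024 192 27
print(Fraction(len(div24), len(div4)))      # 9/64
```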
Example 6: A standard deck of 52 playing cards is shuffled, and one card is drawn
randomly, with replacement, each time. What is the minimum number of draws
required to ensure the probability of getting at least two kings is at least 0.05?
Solution:
Let n represent the number of draws.
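The probability of getting a king is $p = \dfrac{1}{13}$.
The probability of not getting a king is $q = \dfrac{12}{13}$.
The probability of getting no kings in the n trials is $\dbinom{n}{0} \left(\dfrac{1}{13}\right)^{0} \left(\dfrac{12}{13}\right)^{n}$.
The probability of getting one king in the n trials is $\dbinom{n}{1} \left(\dfrac{1}{13}\right)^{1} \left(\dfrac{12}{13}\right)^{n-1}$.
The probability of getting at least two kings in the n trials must satisfy
$1 - \dbinom{n}{0} \left(\dfrac{1}{13}\right)^{0} \left(\dfrac{12}{13}\right)^{n} - \dbinom{n}{1} \left(\dfrac{1}{13}\right)^{1} \left(\dfrac{12}{13}\right)^{n-1} \ge 0.05$
$1 - \dfrac{12^{n}}{13^{n}} - \dfrac{n\,(12)^{n-1}}{13^{n}} \ge 0.05$
$\dfrac{12^{n-1}}{13^{n}} (12 + n) \le 0.95$
$(0.9231)^{n} (12 + n) \le 11.4$
For n = 3, $(0.9231)^{n} (12 + n) = 11.7988$.
For n = 4, $(0.9231)^{n} (12 + n) = 11.6175$.
For n = 5, $(0.9231)^{n} (12 + n) = 11.3944$.
So, the inequality first holds at n = 5, and the minimum value of n is 5.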
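The search for the smallest n can also be done exactly, which avoids the rounding in 0.9231. The sketch below is a minimal Python illustration using exact fractions; the function names are illustrative.

```python
from fractions import Fraction

def p_at_least_two_kings(n):
    """P(at least two kings in n draws with replacement)."""
    p, q = Fraction(1, 13), Fraction(12, 13)
    return 1 - q**n - n * p * q ** (n - 1)

def min_draws(threshold):
    """Smallest n for which the probability reaches the threshold."""
    n = 1
    while p_at_least_two_kings(n) < threshold:
        n += 1
    return n

print(min_draws(Fraction(5, 100)))     # 5
print(float(p_at_least_two_kings(4)))  # just below 0.05
print(float(p_at_least_two_kings(5)))  # just above 0.05
```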
Example 7: Consider the region bounded by the curves $y = x^2 - 7x + 6$ and
$y = 3x + 6$. What is the probability that a point chosen at random in this region lies below the line
y = 0?
Solution:
The intersection of the curves can be calculated as
$x^2 - 7x + 6 = 3x + 6$
$x^2 - 10x = 0$
$x(x - 10) = 0$
$x = 0, 10$
Also, the curve $y = x^2 - 7x + 6$ intersects the x-axis at
$x^2 - 7x + 6 = 0$
$(x - 1)(x - 6) = 0$
$x = 1, 6$
The area bounded by the curves can be split as
$A_1 = \int_{0}^{1} \left[ 3x + 6 - (x^2 - 7x + 6) \right] dx$
$A_2 = \int_{1}^{6} \left[ 0 - (x^2 - 7x + 6) \right] dx$
$A_3 = \int_{1}^{6} \left[ 3x + 6 - 0 \right] dx$
$A_4 = \int_{6}^{10} \left[ 3x + 6 - (x^2 - 7x + 6) \right] dx$
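The regions can be calculated as
1) $A_1 = \left[ -\dfrac{x^3}{3} + 10 \dfrac{x^2}{2} \right]_{0}^{1} = -\dfrac{1}{3} + 5 = \dfrac{14}{3}$
2) $A_2 = \left[ -\dfrac{x^3}{3} + 7 \dfrac{x^2}{2} - 6x \right]_{1}^{6} = -72 + 126 - 36 + \dfrac{1}{3} - \dfrac{7}{2} + 6 = 24 - \dfrac{19}{6} = \dfrac{125}{6}$
3) $A_3 = \left[ 3 \dfrac{x^2}{2} + 6x \right]_{1}^{6} = 54 + 36 - \dfrac{3}{2} - 6 = \dfrac{165}{2}$
4) $A_4 = \left[ -\dfrac{x^3}{3} + 10 \dfrac{x^2}{2} \right]_{6}^{10} = -\dfrac{1000}{3} + 500 + 72 - 180 = \dfrac{176}{3}$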
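Only $A_2$ lies below the line y = 0, so the probability can be calculated as
$\dfrac{A_2}{A_1 + A_2 + A_3 + A_4} = \dfrac{\frac{125}{6}}{\frac{14}{3} + \frac{125}{6} + \frac{165}{2} + \frac{176}{3}} = \dfrac{\frac{125}{6}}{\frac{28}{6} + \frac{125}{6} + \frac{495}{6} + \frac{352}{6}}$
$= \dfrac{\frac{125}{6}}{\frac{1000}{6}}$
$= \dfrac{1}{8}$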
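The areas in Example 7 can be cross-checked by integrating the polynomials exactly. The sketch below is a minimal Python illustration, assuming polynomials are given as coefficient lists in increasing powers of x; the function and variable names are illustrative.

```python
from fractions import Fraction

def integrate(coeffs, lo, hi):
    """Exact integral of sum(coeffs[i] * x**i) from lo to hi."""
    def antiderivative(x):
        return sum(Fraction(c, i + 1) * x ** (i + 1) for i, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

gap = [0, 10, -1]    # (3x + 6) - (x^2 - 7x + 6) = 10x - x^2
below = [-6, 7, -1]  # 0 - (x^2 - 7x + 6)

total_area = integrate(gap, 0, 10)  # whole region between the curves
area_below = integrate(below, 1, 6)  # part of the region under y = 0
print(total_area, area_below, area_below / total_area)  # 500/3 125/6 1/8
```

The total 500/3 equals 1000/6, the sum of the four pieces computed above.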
Example 8: Let a and b be chosen from the natural numbers less than 115.
What is the probability that $3^a + 3^b$ is divisible by 10?
Solution:
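The total number of ordered pairs (a, b), and hence of possibilities for $3^a + 3^b$, is 114 × 114.
The powers of 3 are 3, 9, 27, 81, 243, ..., so their unit digits repeat in the cycle 3, 9, 7, 1.
For a number to be divisible by 10, the unit digit must be 0.
The powers can be classified as $3^{4n-3}, 3^{4n-2}, 3^{4n-1}, 3^{4n}$, with unit digits 3, 9, 7, 1 respectively. The sum has unit digit 0 only when one power ends in 3 and the other in 7, or one ends in 9 and the other in 1.
And, 114 = 4(29) − 2, so among the exponents 1 to 114 there are 29 of the form 4n − 3, 29 of the form 4n − 2, 28 of the form 4n − 1 and 28 of the form 4n.
For each of the 29 values of $3^{4n-3}$, there are 28 matching values of $3^{4n-1}$.
For each of the 29 values of $3^{4n-2}$, there are 28 matching values of $3^{4n}$.
Since either a or b can supply the first power in each pairing, the number of ordered pairs whose sum is divisible by 10 is
2 × [(28 × 29) + (28 × 29)] = 4 × 28 × 29
So, the probability is
$\dfrac{4 \times 28 \times 29}{114 \times 114} = \dfrac{812}{57 \times 57}$
$= \dfrac{812}{3249}$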
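The count in Example 8 can be checked by enumerating every pair of exponents. The sketch below is a minimal Python illustration, assuming a and b are chosen independently so that ordered pairs (a, b) are equally likely; it uses modular exponentiation to get unit digits.

```python
from fractions import Fraction

# pow(3, a, 10) is the unit digit of 3**a.
favourable = sum(
    (pow(3, a, 10) + pow(3, b, 10)) % 10 == 0
    for a in range(1, 115)
    for b in range(1, 115)
)
print(favourable)                       # 3248
print(Fraction(favourable, 114 * 114))  # 812/3249
```

Counting ordered pairs in both the numerator and the denominator keeps the two counts consistent.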
EXERCISE
1. For a discrete random variable X, if
$P(X = x) = \begin{cases} \dfrac{k}{x^2 + 4x + 3}, & \text{for } x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$
find the value of k.
2. Let $11 = x_1 > x_2 > x_3 > x_4 > x_5$ be in an A.P. with a common difference d. If the
standard deviation of the terms is $3\sqrt{2}$, then what is the value of $x_4$?
3. If $\sum_{i=1}^{15} (x_i - 15) = 15$ and $\sum_{i=1}^{15} (x_i - 15)^2 = 255$, then what is the standard deviation
for the items $(x_1, x_2, \ldots, x_{15})$?
SOLUTIONS
1. Using the given data,
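$\sum_{x=0}^{\infty} \dfrac{k}{x^2 + 4x + 3} = 1$
The series sum can be calculated as
$\sum_{x=0}^{\infty} \dfrac{k}{x^2 + 4x + 3} = \sum_{x=0}^{\infty} \dfrac{k}{(x + 1)(x + 3)}$
$= \sum_{x=0}^{\infty} \dfrac{k}{2} \cdot \dfrac{(x + 3) - (x + 1)}{(x + 1)(x + 3)}$
$= \dfrac{k}{2} \sum_{x=0}^{\infty} \left( \dfrac{1}{x + 1} - \dfrac{1}{x + 3} \right)$
$= \dfrac{k}{2} \left( \dfrac{1}{1} - \dfrac{1}{3} + \dfrac{1}{2} - \dfrac{1}{4} + \dfrac{1}{3} - \dfrac{1}{5} + \dfrac{1}{4} - \dfrac{1}{6} + \ldots \right)$
All terms cancel except 1 and $\dfrac{1}{2}$. Solving further, we have,
$\dfrac{k}{2} \left( 1 + \dfrac{1}{2} \right) = 1$
$k = \dfrac{4}{3}$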
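The telescoping step can be checked with exact partial sums. The sketch below is a minimal Python illustration; the closed form $\frac{1}{2}\left(\frac{3}{2} - \frac{1}{m+2} - \frac{1}{m+3}\right)$ for the sum up to x = m follows from the cancellation shown above, and the function names are illustrative.

```python
from fractions import Fraction

def partial_sum(m):
    """Sum of 1 / ((x + 1)(x + 3)) for x = 0..m, computed exactly."""
    return sum(Fraction(1, (x + 1) * (x + 3)) for x in range(m + 1))

def closed_form(m):
    """Telescoped value of the same partial sum."""
    return Fraction(1, 2) * (Fraction(3, 2) - Fraction(1, m + 2) - Fraction(1, m + 3))

print(partial_sum(3), closed_form(3))
print(float(partial_sum(1000)))  # approaches 3/4 as m grows
```

As m grows the two correction terms vanish, so the sum tends to 3/4 and the condition k × 3/4 = 1 gives k = 4/3.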
2. The given data can be rewritten as 11, 11 + d, 11 + 2d, 11 + 3d, 11 + 4d, where d < 0 because the terms are decreasing.
Consider the terms 0, d, 2d, 3d, 4d.
The new mean is
$\dfrac{0 + d + 2d + 3d + 4d}{5} = 2d$
Since variance and standard deviation are unchanged by the addition or
subtraction of a common term, we have
$\dfrac{0^2 + d^2 + (2d)^2 + (3d)^2 + (4d)^2}{5} - (2d)^2 = \left( 3\sqrt{2} \right)^2$
$\dfrac{30d^2}{5} - 4d^2 = 18$
$2d^2 = 18$
$d = \pm 3$, and since the terms are decreasing, $d = -3$.
Then, we have,
$x_4 = 11 + 3(-3) = 2$
3. From the given data,
$\sum_{i=1}^{15} (x_i - 15) = 15$
$\sum_{i=1}^{15} x_i - 15 \times 15 = 15$
$\sum_{i=1}^{15} x_i = 240$
Then, the mean can be calculated as
$\bar{x} = \dfrac{240}{15} = 16$
And,
$\sum_{i=1}^{15} (x_i - \bar{x})^2 = \sum_{i=1}^{15} (x_i - 15 - 1)^2$
$= \sum_{i=1}^{15} (x_i - 15)^2 - 2 \sum_{i=1}^{15} (x_i - 15) + \sum_{i=1}^{15} (1)^2$
$= 255 - 2(15) + 15$
$= 240$
The standard deviation can be calculated as
$\sigma = \sqrt{\dfrac{\sum_{i=1}^{15} (x_i - \bar{x})^2}{15}}$
$= \sqrt{\dfrac{240}{15}}$
$= 4$