MATH F113: Probability & Statistics
Anupama Sharma
Department of Mathematics,
BITS PILANI K K Birla Goa Campus, Goa
Semester I, 2023-24
Module 2: Discrete Distributions
Definition: Let X be a random variable with pmf f and expected value µ. The variance of X, denoted by Var[X] or σ², is given by
Var[X] = E[(X − µ)²].
Definition: Let X be a random variable with variance σ². The standard deviation of X, denoted by σ, is given by
σ = √σ² = √Var[X].
Note that the variance measures variability by considering (X − µ), the difference between the r.v. and its mean. The difference is squared so that negative deviations do not cancel the positive ones.
Theorem: Computational formula for σ²
σ² = E[X²] − (E[X])².
Proof: By definition,
σ² = Var[X] = E[(X − µ)²]
= E[X² − 2µX + µ²]
= E[X²] − 2µ E[X] + µ²
= E[X²] − 2µ · µ + µ²   [since E[X] = µ]
= E[X²] − µ²
= E[X²] − (E[X])².
This completes the proof.
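A minimal numeric sketch (not from the slides; the fair-die pmf is only an illustration) checking that the computational formula agrees with the defining formula:

```python
# Verify Var[X] = E[X^2] - (E[X])^2 for a fair six-sided die,
# where f(x) = 1/6 for x = 1, ..., 6.
values = range(1, 7)
f = {x: 1/6 for x in values}                            # pmf of a fair die

mean = sum(x * f[x] for x in values)                    # E[X]
ex2 = sum(x**2 * f[x] for x in values)                  # E[X^2]
var_def = sum((x - mean)**2 * f[x] for x in values)     # E[(X - mu)^2]
var_comp = ex2 - mean**2                                # computational formula

print(mean, var_def, var_comp)   # 3.5, 2.9167, 2.9167 (the two variances agree)
```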
Rules of Variance:
Let X and Y be random variables and c be any real number. Then,
(i) Var[c] = 0.
(ii) Var[cX] = c² Var[X].
(iii) If X and Y are independent, then
Var(X + Y) = Var[X] + Var[Y].
Proposition: If H(X) = aX + b is a linear function of X, then
V[aX + b] = a² V[X],
where a and b are constants.
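A short sketch (illustrative only; the two fair dice are an assumption, not from the slides) checking the proposition and rule (iii) by exact computation over the pmf:

```python
# Check Var[aX + b] = a^2 Var[X] and Var[X + Y] = Var[X] + Var[Y]
# for two independent fair dice X and Y.
from itertools import product

xs = range(1, 7)
f = 1/6                                   # P(X = x) = P(Y = y) = 1/6

def var(values_probs):
    mean = sum(v * p for v, p in values_probs)
    return sum((v - mean)**2 * p for v, p in values_probs)

var_x = var([(x, f) for x in xs])
var_3x_plus_2 = var([(3*x + 2, f) for x in xs])            # should be 9 Var[X]
var_sum = var([(x + y, f*f) for x, y in product(xs, xs)])  # uses independence

print(var_3x_plus_2, 9*var_x)    # both 26.25
print(var_sum, 2*var_x)          # both 5.8333...
```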
Definition: Let X be a random variable. The k-th ordinary moment of X is defined as E[X^k], where k is a natural number.
Moment generating function – a generating function that encodes the moments of a distribution.
Definition: Let X be a random variable with p.m.f. f. The moment generating function (mgf) of X, denoted by m_X(t), is given by
m_X(t) = E[e^{tX}] = Σ_{all x} e^{tx} f(x),
provided this expectation is finite for all real numbers t in some open interval containing zero.
What does it mean if the mgf exists for a r.v. X?
– The mean of X can be found by evaluating the first derivative of the mgf at t = 0, i.e.
µ = E[X] = m_X′(0).
– The variance of X can be found by evaluating the first and second derivatives of the mgf at t = 0, i.e.
σ² = E[X²] − (E[X])² = m_X″(0) − (m_X′(0))².
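A small sketch (assuming sympy is available; the fair-die mgf is an illustration, not from the slides) that recovers the mean and variance from m_X′(0) and m_X″(0):

```python
# Build the mgf of a fair die, differentiate at t = 0, and read off
# the mean from m'(0) and the variance from m''(0) - (m'(0))^2.
import sympy as sp

t = sp.symbols('t')
m = sum(sp.exp(t*x) * sp.Rational(1, 6) for x in range(1, 7))   # m_X(t) = E[e^{tX}]

m1 = sp.diff(m, t).subs(t, 0)        # m'(0)  = E[X]
m2 = sp.diff(m, t, 2).subs(t, 0)     # m''(0) = E[X^2]

print(m1, m2 - m1**2)                # 7/2 and 35/12, i.e. mean 3.5, variance ~2.9167
```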
Theorem: Let m_X(t) be the moment generating function for a random variable X. Then
d^k m_X(t)/dt^k |_{t=0} = E[X^k].
Proof: Let z = tX. The Maclaurin series expansion of e^z gives
e^{tX} = 1 + tX + (tX)²/2! + (tX)³/3! + · · ·
Taking the expected value of both sides of the above equation, we get
E[e^{tX}] = E[1 + tX + (tX)²/2! + (tX)³/3! + · · ·]
m_X(t) = 1 + t E[X] + t² E[X²]/2! + t³ E[X³]/3! + · · ·
Differentiating the above equation w.r.t. t, we get
dm_X(t)/dt = E[X] + t E[X²] + t² E[X³]/2! + · · ·
Note that, when this derivative is evaluated at t = 0, every term except the first becomes 0. Hence,
dm_X(t)/dt |_{t=0} = E[X].
Taking the second derivative and evaluating at t = 0 gives
d²m_X(t)/dt² |_{t=0} = E[X²].
Continuing in this way, we get
d^k m_X(t)/dt^k |_{t=0} = E[X^k].
This completes the proof.
Some points to remember about mgf
1. The moment generating function is a very powerful tool, but unfortunately the mgf does not always exist, so we cannot use it every time, even when dealing with independent variables.
2. Uniqueness: The moment generating function of a distribution, if it exists, uniquely determines the distribution. This implies that corresponding to a given probability distribution there is one and only one mgf (provided it exists), and corresponding to a given mgf there is only one probability distribution.
3. The variable t is not related to X; it is a dummy variable.
4. m_X(0) = 1 for any valid mgf. So whenever you compute an mgf, plug in 0 and check that you get 1.
Theorem: Let X be a random variable with moment generating function m_X(t). For any constants a and b, the moment generating function of Y = aX + b is
m_Y(t) = e^{bt} m_X(at).
Theorem: Let X₁ and X₂ be independent random variables with moment generating functions m_{X₁}(t) and m_{X₂}(t), respectively. Let Y = X₁ + X₂. The moment generating function of Y is
m_Y(t) = m_{X₁}(t) m_{X₂}(t).
Theorem: Let X and Y be random variables with moment generating functions m_X(t) and m_Y(t), respectively. If m_X(t) = m_Y(t) for all t in some open interval about 0, then X and Y have the same distribution.
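As an illustrative check (assuming sympy; the Bernoulli example is not from the slides), the mgf of a sum of two independent Bern(p) variables equals the product of their mgfs:

```python
# For independent X1, X2 ~ Bern(p), check that m_{X1+X2}(t) = m_{X1}(t) * m_{X2}(t),
# i.e. that the mgf of the sum equals ((1 - p) + p e^t)^2.
import sympy as sp

t, p = sp.symbols('t p')
m_bern = (1 - p) + p*sp.exp(t)                 # mgf of a Bernoulli(p) variable

# mgf of X1 + X2 computed directly from the pmf of the sum (values 0, 1, 2)
m_sum = (1 - p)**2 + 2*p*(1 - p)*sp.exp(t) + p**2*sp.exp(2*t)

print(sp.simplify(m_sum - m_bern**2))          # 0, so the two mgfs agree
```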
Uniform random variable
Uniform distribution
– the most fundamental of all is the discrete uniform distribution.
– a symmetric distribution where a finite number of values are "equally likely" to be observed, i.e. every one of the n values has equal probability 1/n.
– for example, the possible outcomes of rolling a fair die, or drawing a spade, a heart, a club, or a diamond from a deck of cards.
Uniform random variable: A random variable X has a discrete uniform distribution if each of the n values in its range has equal probability. Then
f(x) = P(X = x) = 1/n,   x = 1, 2, ..., n,
where f(x) represents the pmf of X. We write X ∼ Unif(n).
Uniform random variable
Mean, E[X] = Σ_{x=1}^{n} x f(x) = (n + 1)/2.
We have E[X²] = Σ_{x=1}^{n} x² f(x) = (n + 1)(2n + 1)/6.
Variance, V[X] = σ² = E[X²] − (E[X])² = (n² − 1)/12.
Moment generating function,
m_X(t) = (e^t − e^{(n+1)t}) / (n(1 − e^t)),   t ≠ 0,
m_X(t) = 1,   t = 0.
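A quick numeric sketch (not from the slides; n = 6 and t = 0.3 are arbitrary illustrative choices) checking the mean, variance, and mgf formulas above against direct computation from the pmf:

```python
# Check the discrete uniform formulas for n = 6 against the pmf f(x) = 1/n.
import math

n = 6
xs = range(1, n + 1)

mean = sum(x / n for x in xs)
ex2 = sum(x*x / n for x in xs)
var = ex2 - mean**2

print(mean, (n + 1) / 2)                      # 3.5, 3.5
print(var, (n*n - 1) / 12)                    # 2.9167, 2.9167

t = 0.3                                       # any t != 0
mgf_direct = sum(math.exp(t*x) / n for x in xs)
mgf_formula = (math.exp(t) - math.exp((n + 1)*t)) / (n * (1 - math.exp(t)))
print(mgf_direct, mgf_formula)                # the two values agree
```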
Bernoulli random variable
A Bernoulli trial is a random experiment in which there are only
two possible outcomes – success and failure.
Examples
1. Tossing a coin and considering heads as success and tails as
failure.
2. Checking items from a production line: success = not defective,
failure = defective.
3. Phoning a call centre: success = operator free; failure = no
operator free.
4. Simulating the spread of an epidemic: success = disease transmission; failure = no disease transmission.
Bernoulli random variable
A r.v. X is said to be a Bernoulli random variable if its probability mass function is given by
f(1) = P(X = 1) = p
f(0) = P(X = 0) = 1 − p,
where p, 0 ≤ p ≤ 1, is the probability that the trial is a success.
Mean, E[X] = p,
Variance, V[X] = p(1 − p),
Moment generating function, m_X(t) = (1 − p) + p e^t.
We write
X ∼ Bern(p).
The number p is called the parameter of the distribution.
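A brief check (assuming sympy; illustrative only) that differentiating the Bernoulli mgf at t = 0 recovers the stated mean and variance:

```python
# Recover the Bernoulli mean and variance from its mgf m_X(t) = (1 - p) + p e^t.
import sympy as sp

t, p = sp.symbols('t p')
m = (1 - p) + p*sp.exp(t)

mean = sp.diff(m, t).subs(t, 0)                    # m'(0) = p
var = sp.diff(m, t, 2).subs(t, 0) - mean**2        # m''(0) - (m'(0))^2
print(mean, var)                                   # p and p - p**2, i.e. p(1 - p)
```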
Binomial random variable
Consider n independent trials, each of which results in a success with probability p or in a failure with probability (1 − p).
The trials are identical and independent, and the probability p of success remains the same from trial to trial.
If the random variable X represents the number of successes that occur in these n trials, then X is said to be a Binomial random variable and the distribution of X is called the Binomial distribution with parameters n and p. We write X ∼ Bin(n, p).
Note that the Bernoulli distribution is a special case of the Binomial distribution with parameters (1, p).
Examples
Consider the experiment where balls are drawn one at a time without replacement from a box containing 20 red and 30 blue balls, and the number of red balls drawn is recorded. Is this a binomial experiment?
No! The key here is the lack of independence: since the balls are drawn without replacement, the outcome of the first draw affects the probabilities for the later draws.
A fair die is rolled ten times, and the number of 6's is recorded. Is this a binomial experiment?
Yes! There is a fixed number of trials (ten rolls), each roll is independent of the others, there are only two outcomes (either it's a 6 or it isn't), and the probability of rolling a 6 is constant.
Binomial random variable
Theorem: Let a random variable X have a Binomial distribution with parameters n and p. Then its probability mass function is given by
b(x; n, p) = C(n, x) p^x (1 − p)^{n−x},   x = 0, 1, ..., n,
b(x; n, p) = 0,   otherwise,
where C(n, x) = n!/(x!(n − x)!), 0 < p < 1, and n is a positive integer.
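A minimal sketch (not from the slides) of this pmf formula, with a check that the probabilities sum to 1; n = 5 and p = 0.5 are arbitrary illustrative values:

```python
# The binomial pmf b(x; n, p) computed directly from the formula.
from math import comb

def binom_pmf(x, n, p):
    if 0 <= x <= n:
        return comb(n, x) * p**x * (1 - p)**(n - x)
    return 0.0

n, p = 5, 0.5
print([round(binom_pmf(x, n, p), 4) for x in range(n + 1)])
print(sum(binom_pmf(x, n, p) for x in range(n + 1)))   # 1.0
```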
Binomial random variable
Example: Five fair coins are tossed. Assuming the outcomes are independent, find the pmf of the number of heads obtained.
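A quick worked sketch (not on the slide): with X = number of heads, X ∼ Bin(5, 1/2), so f(x) = C(5, x)(1/2)^5, giving probabilities 1/32, 5/32, 10/32, 10/32, 5/32, 1/32 for x = 0, 1, ..., 5; these are the same values (in decimal) that the code sketch above prints for n = 5, p = 0.5.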
Binomial random variable
For X ∼ Bin(n, p), the cdf will be denoted by
B(x; n, p) = P(X ≤ x) = Σ_{y=0}^{x} b(y; n, p),   x = 0, 1, ..., n.
Going back to the previous example,
Find the probability of at least two heads.
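An illustrative sketch (not from the slides) for the "at least two heads" question, computed as 1 − B(1; 5, 0.5):

```python
# P(at least two heads) for five fair coins, using the cdf B(x; n, p)
# built from the binomial pmf.
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):                       # B(x; n, p) = P(X <= x)
    return sum(binom_pmf(y, n, p) for y in range(x + 1))

p_at_least_two = 1 - binom_cdf(1, 5, 0.5)     # P(X >= 2) = 1 - P(X <= 1)
print(p_at_least_two)                         # 0.8125 = 26/32
```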
Binomial random variable
Example: Bits are sent over a communications channel in packets
of 12. If the probability of a bit being corrupted over this channel
is 0.1 and such errors are independent, what is the probability that
no more than 2 bits in a packet are corrupted? If 6 packets are
sent over the channel, what is the probability that at least one
packet will contain 3 or more corrupted bits?
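A possible sketch of the computation (not from the slides; it simply applies the binomial model stated in the example):

```python
# X = corrupted bits per packet ~ Bin(12, 0.1).
# Y = packets (out of 6) containing 3 or more corrupted bits ~ Bin(6, 1 - P(X <= 2)).
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_le_2 = sum(binom_pmf(x, 12, 0.1) for x in range(3))    # P(X <= 2)
print(p_le_2)                                            # about 0.889

p_at_least_one_bad_packet = 1 - p_le_2**6                # P(Y >= 1) = 1 - P(Y = 0)
print(p_at_least_one_bad_packet)                         # about 0.506
```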