Discrete Random Variables
Dr. Linh Nghiem
MATH1905
Overview
• Having defined the concepts of probability, we next look at
random variables and their properties.
• Reading: Sections 5.1–5.6, 5.9 of Ross.
Random variables
• Suppose we flip a fair coin three times:
• Let X denote the number of heads that occurs.
• X is called a random variable.
Random variables
• Formally, a random variable (RV) assigns a numeric value to
each outcome in a sample space.
• Random variables are denoted by uppercase letters, such as
X, Y, Z, X1 , X2 , . . ..
• The actual observed (realized) value of a random variable is
denoted by a lowercase letter, such as x, y, z, x1 , x2 , . . ..
E.g., for the three coin flips, the outcome HHT gives the realized value x = 2, while TTT gives x = 0.
Discrete vs continuous random variables
• A discrete random variable is one whose realized value can
take on a countable number of possible values.
– E.g., the number of heads in three coin flips, the number of months in a year whose rainfall is greater than 100mm, etc.
• A continuous random variable is one whose realized value can
take on an uncountable number of possible values, i.e. any value in a continuous interval.
– E.g., the height of a person, a stock's return, etc.
Discrete probability distribution
Discrete probability distribution
• For a discrete random variable X, we can compute P(X = x)
as the sum of the probabilities of all the outcomes
corresponding to X = x.
E.g., in the three-coin example, P(X = 2) = P({HHT, HTH, THH}) = 3/8.
• A discrete probability distribution for X is a table listing
– all the possible values X can take, and
– the corresponding probability for each value.
Discrete probability distribution
• P(X = x) = p_X(x) is a function of the realized value x, and
p_X(x) is called the probability mass function (pmf) of X.
• The pmf must satisfy both conditions:
– 0 ≤ p_X(x) ≤ 1 for all x.
– ∑_x p_X(x) = 1.
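A minimal R sketch, storing the three-coin pmf as two vectors and checking both conditions directly:

  x  <- c(0, 1, 2, 3)           # possible numbers of heads in three fair flips
  px <- c(1/8, 3/8, 3/8, 1/8)   # corresponding probabilities
  all(px >= 0 & px <= 1)        # condition 1: every probability lies in [0, 1]
  sum(px)                       # condition 2: probabilities sum to 1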
Coin tossing examples
Expectation (expected value)
• The expectation (or expected value) of a random variable
X is the average (mean) value obtained by X.
• For a discrete RV X with pmf p_X(x), its expectation is
µ_X = E(X) = ∑_x x p_X(x),
and more generally, for any function g(X),
E[g(X)] = ∑_x g(x) p_X(x).
• Example:
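A quick check of these formulas in R, using the three-coin pmf and taking g(x) = x² as an example:

  x  <- c(0, 1, 2, 3)
  px <- c(1/8, 3/8, 3/8, 1/8)   # pmf of the number of heads in three flips
  sum(x * px)                   # E(X) = 1.5
  sum(x^2 * px)                 # E[g(X)] with g(x) = x^2: equals 3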
Variance and standard deviation
• The variance of a random variable X measures the variability
of X around its expectation.
• For a discrete RV X with pmf p_X(x), its variance is
σ_X² = V(X) = E[(X − µ_X)²] = ∑_x (x − µ_X)² p_X(x)
• The standard deviation of X is the square root of the variance:
σ_X = √(σ_X²)
• Example:
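The same three-coin pmf gives the variance and standard deviation in a few lines of R:

  x  <- c(0, 1, 2, 3)
  px <- c(1/8, 3/8, 3/8, 1/8)
  mu <- sum(x * px)             # E(X) = 1.5
  v  <- sum((x - mu)^2 * px)    # V(X) = 0.75
  sqrt(v)                       # standard deviation, about 0.866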
Laws of expectation and variance
For any random variable X and a constant (non-random) c,
• E(c) = c, V(c) = 0
• E(X + c) = E(X) + c, V(X + c) = V(X)
• E(cX) = cE(X), V(cX) = c²V(X).
Example
X = number of pizzas delivered to university students each month, with pmf

x         0    1    2    3
p_X(x)   .1   .3   .4   .2

The pizzeria makes a profit of $3 per pizza. Determine the mean
and variance of the monthly profit.
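One way to work this in R, using the laws E(cX) = cE(X) and V(cX) = c²V(X) from the previous slide:

  x  <- c(0, 1, 2, 3)
  px <- c(0.1, 0.3, 0.4, 0.2)
  mu <- sum(x * px)             # E(X) = 1.7 pizzas per month
  v  <- sum((x - mu)^2 * px)    # V(X) = 0.81
  3 * mu                        # E(3X) = 3 E(X) = 5.1 dollars
  3^2 * v                       # V(3X) = 9 V(X) = 7.29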
Bivariate distribution
Bivariate distribution
• For two discrete random variables X and Y, the bivariate
distribution of X and Y lists
– all the pairs of values (x, y) that X and Y can take, and
– the joint probability P({X = x} ∩ {Y = y}), denoted
p_{X,Y}(x, y), for each pair.
• The joint pmf p_{X,Y}(x, y) has to satisfy:
– 0 ≤ p_{X,Y}(x, y) ≤ 1.
– ∑_x ∑_y p_{X,Y}(x, y) = 1.
Example
X and Y are the numbers of houses the first and the second real
estate agent sell in a month, with joint pmf

              x
          0     1     2
     0   .12   .42   .06
y    1   .21   .06   .03
     2   .07   .02   .01
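A short R sketch that stores this joint pmf as a matrix (rows indexed by y, columns by x) and verifies the two pmf conditions:

  pxy <- matrix(c(.12, .42, .06,
                  .21, .06, .03,
                  .07, .02, .01), nrow = 3, byrow = TRUE)
  all(pxy >= 0 & pxy <= 1)   # each joint probability lies in [0, 1]
  sum(pxy)                   # double sum over all (x, y) equals 1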
Marginal and conditional probability
• To compute the marginal distribution of one variable, we sum
the joint probabilities over the values of the other variable:
p_X(x) = P(X = x) = ∑_y p_{X,Y}(x, y)
• Example:
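For the real estate table, the marginals are just column and row sums in R:

  # same joint pmf as above (rows: y, columns: x)
  pxy <- matrix(c(.12, .42, .06,
                  .21, .06, .03,
                  .07, .02, .01), nrow = 3, byrow = TRUE)
  colSums(pxy)   # marginal pmf of X: 0.40 0.50 0.10
  rowSums(pxy)   # marginal pmf of Y: 0.60 0.30 0.10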
Marginal and conditional probability
• The conditional distribution of Y given X = x (holding X
fixed at X = x), denoted p_{Y|X}(y|x), is
p_{Y|X}(y|x) = P(Y = y | X = x) = P({X = x} ∩ {Y = y}) / P(X = x)
             = p_{X,Y}(x, y) / p_X(x)
• Example:
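E.g., in R, the conditional pmf of Y given X = 0 is the x = 0 column of the joint table rescaled by p_X(0):

  pxy <- matrix(c(.12, .42, .06,
                  .21, .06, .03,
                  .07, .02, .01), nrow = 3, byrow = TRUE)  # rows: y, cols: x
  px <- colSums(pxy)
  pxy[, 1] / px[1]   # conditional pmf of Y given X = 0: 0.300 0.525 0.175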
Covariance and correlation
• The covariance between X and Y, denoted σ_XY, is
σ_XY = Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
     = ∑_x ∑_y (x − µ_X)(y − µ_Y) p_{X,Y}(x, y)
• The correlation between X and Y, denoted ρ_XY, is the
standardized covariance
ρ_XY = Cor(X, Y) = σ_XY / (σ_X σ_Y)
Example
The joint pmf of the real estate example again (rows y, columns x):

              x
          0     1     2
     0   .12   .42   .06
y    1   .21   .06   .03
     2   .07   .02   .01
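A sketch of the covariance and correlation computation in R, following the definitions above:

  x <- 0:2
  y <- 0:2
  pxy <- matrix(c(.12, .42, .06,
                  .21, .06, .03,
                  .07, .02, .01), nrow = 3, byrow = TRUE)  # rows: y, cols: x
  px <- colSums(pxy); py <- rowSums(pxy)                   # marginals
  mux <- sum(x * px); muy <- sum(y * py)                   # 0.7 and 0.5
  # double sum of (x - mux)(y - muy) p(x, y) over all cells
  cov_xy <- sum(outer(y - muy, x - mux) * pxy)             # -0.15
  sdx <- sqrt(sum((x - mux)^2 * px))
  sdy <- sqrt(sum((y - muy)^2 * py))
  cov_xy / (sdx * sdy)                                     # correlation, about -0.35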
Laws of covariance and correlation
For any constants a, b, c we have
• Cov(X, c) = 0.
• Cov(X, X) = V(X).
• Cov(X, Y) = Cov(Y, X), ρXY = ρYX .
• Cov(aX, bY) = ab Cov(X, Y).
Laws of covariance and correlation
• −1 ≤ ρ_XY ≤ 1; ρ_XY = ±1 implies X and Y have a perfect
linear relationship.
– ρ_XY = 1 implies Y = aX + b for some constant a > 0.
– ρ_XY = −1 implies Y = aX + b for some constant a < 0.
• Cor(aX + b, cY + d) = Cor(X, Y) whenever a and c have the
same sign, meaning correlation stays the same under such
linear transformations.
• Real estate example: what is the correlation between the
profits of the two agents, if the profits (in thousands of dollars)
of the first and the second agent are given by 200X + 70 and
180Y + 100, respectively?
Independence and correlation
• X and Y are independent if and only if
p_{X,Y}(x, y) = p_X(x) p_Y(y)
for all (x, y) values.
• If X and Y are independent, then they are uncorrelated:
σ_XY = ρ_XY = 0.
• Note that the converse is not true: uncorrelated does not
imply independent. For example, if X is uniform on {−1, 0, 1}
and Y = X², then Cov(X, Y) = 0 even though Y is completely
determined by X; see the sketch below.
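A short R check of this classic counterexample:

  x  <- c(-1, 0, 1)                             # X uniform on {-1, 0, 1}
  px <- rep(1/3, 3)
  y  <- x^2                                     # Y = X^2: fully determined by X
  sum(x * y * px) - sum(x * px) * sum(y * px)   # Cov(X, Y) = 0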
Sum and linear combinations of
random variables
Sum of random variables
• If X and Y are discrete random variables, then Z = X + Y is
also a discrete random variable.
• The pmf of Z can be determined from the bivariate (joint)
distribution table of X and Y.
• Example:
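As one concrete illustration (reusing the real estate joint pmf), the pmf of Z = X + Y can be built in R by summing joint probabilities over cells with the same value of x + y:

  pxy <- matrix(c(.12, .42, .06,
                  .21, .06, .03,
                  .07, .02, .01), nrow = 3, byrow = TRUE)  # rows: y, cols: x
  z  <- outer(0:2, 0:2, "+")                    # value of Z = X + Y in each cell
  pz <- tapply(as.vector(pxy), as.vector(z), sum)
  pz                                            # pmf of Z: .12 .63 .19 .05 .01
  sum(as.numeric(names(pz)) * pz)               # E(Z) = 1.2 = E(X) + E(Y)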
Sum of random variables
While we can compute the expectation and variance of Z from its
pmf p_Z(z), the following rules are simpler:
E(Z) = E(X + Y) = E(X) + E(Y)
V(Z) = V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y).
If X and Y are independent, then Cov(X, Y) = 0, so
V(X + Y) = V(X) + V(Y).
Linear combination of random variables
• For two random variables X and Y and any two constants
a and b, Z = aX + bY is called a linear combination of X and Y.
• The expectation and variance of Z are given by
E(Z) = E(aX + bY) = E(aX) + E(bY) = aE(X) + bE(Y),
V(Z) = V(aX + bY) = V(aX) + V(bY) + 2 Cov(aX, bY)
     = a²V(X) + b²V(Y) + 2ab Cov(X, Y).
Linear combination of random variables
• In general, let X_1, . . . , X_n be n random variables and
a_1, . . . , a_n be constants. Then
Z = a_1 X_1 + a_2 X_2 + . . . + a_n X_n = ∑_{i=1}^n a_i X_i
is a linear combination of X_1, . . . , X_n.
• The expectation and variance of Z are given by
E(Z) = E(∑_{i=1}^n a_i X_i) = ∑_{i=1}^n a_i E(X_i)
V(Z) = V(∑_{i=1}^n a_i X_i) = ∑_{i=1}^n a_i² V(X_i) + 2 ∑_{i=1}^n ∑_{j>i} a_i a_j Cov(X_i, X_j)
Linear combination of random variables
Finally, the covariance between two sums of random variables can
be decomposed into a sum of pairwise covariances:
Cov(∑_{i=1}^n X_i, ∑_{j=1}^m Y_j) = ∑_{i=1}^n ∑_{j=1}^m Cov(X_i, Y_j)
Example
                     Stock A   Stock B
Expected return         7%       15%
Standard deviation     12%       25%

The correlation between the returns of the two stocks is
ρ_AB = 0.11. What are the expected return and standard deviation
of the following two portfolios?
1. Portfolio 1: 50% in A, 50% in B.
2. Portfolio 2: 40% in A, 60% in B.
What is the correlation between the returns of the two portfolios
above?
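A worked sketch in base R, applying the linear-combination rules above with returns kept in percentage units:

  rho <- 0.11; muA <- 7; muB <- 15; sdA <- 12; sdB <- 25
  covAB <- rho * sdA * sdB                      # Cov(A, B) = 33
  # Portfolio 1: R1 = 0.5 A + 0.5 B
  0.5 * muA + 0.5 * muB                         # expected return: 11 (%)
  v1 <- 0.5^2 * sdA^2 + 0.5^2 * sdB^2 + 2 * 0.5 * 0.5 * covAB
  sqrt(v1)                                      # sd, about 14.4 (%)
  # Portfolio 2: R2 = 0.4 A + 0.6 B
  0.4 * muA + 0.6 * muB                         # expected return: 11.8 (%)
  v2 <- 0.4^2 * sdA^2 + 0.6^2 * sdB^2 + 2 * 0.4 * 0.6 * covAB
  sqrt(v2)                                      # sd, about 16.2 (%)
  # Cov(R1, R2) expands by bilinearity of covariance
  c12 <- 0.5 * 0.4 * sdA^2 + (0.5 * 0.6 + 0.5 * 0.4) * covAB + 0.5 * 0.6 * sdB^2
  c12 / sqrt(v1 * v2)                           # correlation, about 0.99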
Binomial distribution
Bernoulli trial
A Bernoulli trial is a random experiment in which:
• Only two categorical outcomes are possible, e.g.
success/failure, yes/no, head/tail. For now, we will call
the two outcomes success and failure.
• If p denotes the probability of success, then the probability of
failure is 1 − p.
Let X denote the number of successes in a single Bernoulli trial; then we
say X follows a Bernoulli distribution with success probability p, and
write X ∼ Ber(p).
Bernoulli trial
We can find the distribution of X and compute its expectation and
variance directly: p_X(1) = p and p_X(0) = 1 − p, so
E(X) = 0 · (1 − p) + 1 · p = p,
V(X) = E(X²) − [E(X)]² = p − p² = p(1 − p).
Binomial distribution
Now, instead of a single Bernoulli trial, we have n
Bernoulli trials such that
• the trials are mutually independent of one another, and
• each trial has the same success probability p.
Let X be the number of successes out of the n trials; then
X ∼ Bin(n, p).
Example
• Toss a fair coin 10 times and let X be the number of heads.
• Randomly inspect 50 products of a brand, where each product
independently has a 5% probability of being defective. Let X
denote the number of defective products.
• A student decides to answer each of the 10 multiple-choice
questions in a quiz completely at random. Each question has
five options, only one of which is correct. Let X be the
number of correct answers.
Binomial distribution
• What values can X take?
• Probability mass function of X:
p_X(x) = C(n, x) p^x (1 − p)^(n−x),   x = 0, 1, 2, . . . , n
where
C(n, x) = n! / (x!(n − x)!)
is the binomial coefficient ("n choose x").
Compute binomial probabilities in R
X ∼ Bin(n = size, p = prob)
• dbinom(x, size, prob): gives the pmf at X = x
• pbinom(q, size, prob): gives the cumulative probability
P(X ≤ q).
• qbinom(p, size, prob): gives the smallest value of x such
that P(X ≤ x) ≥ p.
Example in R code.
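For instance, for the quiz example X ∼ Bin(10, 0.2):

  dbinom(3, size = 10, prob = 0.2)    # P(X = 3), about 0.201
  pbinom(3, size = 10, prob = 0.2)    # P(X <= 3), about 0.879
  qbinom(0.5, size = 10, prob = 0.2)  # smallest x with P(X <= x) >= 0.5: returns 2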
Shape of binomial distribution
X ∼ Bin(20, p)
[Figure: bar plots of the pmf p_X(x) against x for p = 0.2, p = 0.5,
and p = 0.7. The pmf is right-skewed for p = 0.2, symmetric for
p = 0.5, and left-skewed for p = 0.7.]
Binomial distribution
While one can compute the expectation and variance of X from
the pmf directly, it is easier to view the binomial distribution
as a sum of independent Bernoulli random variables.
• Let X_i be the number of successes in the ith trial,
i = 1, . . . , n. Then X_i ∼ Ber(p).
• The number of successes out of n trials is X = ∑_{i=1}^n X_i.
• We can apply the laws of expectation and variance (using the
independence of the trials for the variance) to obtain:
E(X) = E(∑_{i=1}^n X_i) = ∑_{i=1}^n E(X_i) = ∑_{i=1}^n p = np,
V(X) = V(∑_{i=1}^n X_i) = ∑_{i=1}^n V(X_i) = ∑_{i=1}^n p(1 − p) = np(1 − p).
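A quick simulation check in R (the values n = 10 and p = 0.3 are just illustrative choices):

  set.seed(1)                               # for reproducibility
  n <- 10; p <- 0.3
  xs <- rbinom(100000, size = n, prob = p)  # 100,000 draws of X ~ Bin(10, 0.3)
  c(mean(xs), n * p)                        # simulated vs theoretical mean: both near 3
  c(var(xs), n * p * (1 - p))               # simulated vs theoretical variance: both near 2.1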