Slide 2023.2 MI2036 Chap3
Nguyễn Thị Thu Thủy (FaMI-HUST)
Email: [email protected]
HANOI – 2024
CHAPTER OUTLINE
1 TWO RANDOM VARIABLES
2 COVARIANCE AND CORRELATION
3 BIVARIATE NORMAL DISTRIBUTION
4 CENTRAL LIMIT THEOREM
✍ After careful study of this chapter you should be able to do the following:
1 Use joint probability mass functions and joint probability density functions to
calculate probabilities
2 Calculate marginal and conditional probability distributions from joint probability
distributions
3 Interpret and calculate covariances and correlations between random variables
4 Understand properties of a bivariate normal distribution and be able to draw
contour plots for the probability density function
5 Calculate means and variances for linear combinations of random variables and
calculate probabilities for linear combinations of normally distributed random
variables
6 Understand the central limit theorem
INTRODUCTION
✍ This chapter analyzes experiments that produce two random variables, X and Y .
1 FX,Y (x, y), the joint cumulative distribution function of two random variables.
2 pX,Y (x, y), the joint probability mass function for two discrete random variables.
3 fX,Y (x, y), the joint probability density function of two continuous random variables.
4 Functions of two random variables.
5 Conditional probability and independence.
CONTENT
1 3.1 TWO RANDOM VARIABLES
3.1.1 Joint Probability Distributions
3.1.2 Marginal Probability Distributions
3.1.3 Conditional Probability Distributions
3.1.4 Independence
Exercises for Section 3.1
2 3.2 COVARIANCE AND CORRELATION
3.2.1 Covariance. Covariance Matrix
3.2.2 Correlation Coefficient
Exercises for Section 3.2
3 3.3 BIVARIATE NORMAL DISTRIBUTION
3.3.1 Joint Probability Distributions
3.3.2 Marginal Probability Distributions
3.3.3 Conditional Probability Distributions
3.3.4 Correlation
4 3.4 CENTRAL LIMIT THEOREM
3.4.1 Central Limit Theorem
3.4.2 Applications of the Central Limit Theorem
3.1 TWO RANDOM VARIABLES 3.1.1 Joint Probability Distributions
■ If X and Y are discrete random variables, the joint probability distribution of X and
Y is a description of the set of points (x, y) in the range of (X, Y ) along with the
probability of each point.
■ The joint probability distribution of two random variables is sometimes referred to as
the bivariate probability distribution or bivariate distribution of the random variables.
■ One way to describe the joint probability distribution of two discrete random
variables is through a joint probability mass function.
■ Also, P (X = x and Y = y) is usually written as P (X = x, Y = y).
■ For discrete random variables taking values x1, . . . , xm and y1, . . . , yn, the joint PMF is the collection of values pij = P(X = xi, Y = yj) for all i = 1, . . . , m and j = 1, . . . , n. Note that ∑_{i=1}^{m} ∑_{j=1}^{n} pij = 1.
Theorem 1
For discrete random variables X and Y and any set B in the (X, Y ) plane, the
probability of the event {(X, Y ) ∈ B} is
P(B) = ∑_{(x,y)∈B} pX,Y(x, y)    (1)
✍ There are various ways to represent a joint PMF. We use three of them in the
following example: a list, a matrix, and a graph.
Example 1
Three ballpoint pens are selected at random from a box that contains 5 blue pens, 4
red pens, and 3 green pens. If X is the number of blue pens selected and Y is the
number of red pens selected, find
(a) The joint probability function PX,Y (x, y).
(b) P [(X, Y ) ∈ A], where A is the region {(x, y) | x + y ≤ 1}.
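✍ A short computational sketch (our addition, not part of the original slides): the joint PMF here is multivariate hypergeometric, and the script below tabulates pX,Y(x, y) and evaluates P[(X, Y) ∈ A]. The helper name p_xy is ours.

```python
# Tabulate the joint PMF of Example 1 and evaluate P[(X, Y) in A],
# A = {(x, y) : x + y <= 1}, by direct summation as in (1).
from math import comb

def p_xy(x, y):
    """P(X = x, Y = y): choose x of 5 blue, y of 4 red, 3-x-y of 3 green."""
    if x < 0 or y < 0 or x + y > 3:
        return 0.0
    return comb(5, x) * comb(4, y) * comb(3, 3 - x - y) / comb(12, 3)

table = {(x, y): p_xy(x, y) for x in range(4) for y in range(4)}
assert abs(sum(table.values()) - 1) < 1e-12   # the PMF sums to 1

prob_A = sum(p for (x, y), p in table.items() if x + y <= 1)
print(prob_A)   # P(X + Y <= 1) = p(0,0) + p(1,0) + p(0,1)
```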
Example 2
Test two integrated circuits one after the other. On each test, the possible outcomes
are A (accept) and R (reject). Assume that all circuits are acceptable with
probability 0.9 and that the outcomes of successive tests are independent. Count the
number of acceptable circuits X and count the number of successful tests Y before
you observe the first reject. (If both tests are successful, let Y = 2.)
(a) Find the joint PMF of X and Y .
(b) Find the probability of the event B that X, the number of acceptable circuits,
equals Y , the number of tests before observing the first failure.
Solution.
(a) The sample space of the experiment is SX,Y = {AA, AR, RA, RR}. We compute
g(AA) = (2, 2), g(AR) = (1, 1), g(RA) = (1, 0), g(RR) = (0, 0).
■ For each pair of values (x, y), pX,Y(x, y) is the sum of the probabilities of the
outcomes for which X = x and Y = y. For example, pX,Y(1, 1) = P(AR) = 0.9 × 0.1 = 0.09.
In matrix form, the joint PMF is

pX,Y(x, y)   y = 0   y = 1   y = 2
x = 0        0.01    0       0
x = 1        0.09    0.09    0
x = 2        0       0       0.81

(b) P(B) = pX,Y(0, 0) + pX,Y(1, 1) + pX,Y(2, 2) = 0.01 + 0.09 + 0.81 = 0.91.
Theorem 2
For any pair of random variables, X and Y ,
1 0 ≤ FX,Y (x, y) ≤ 1.
2 FX,Y (−∞, y) = FX,Y (x, −∞) = 0.
3 If x < x1 and y < y1 , then FX,Y (x, y) ≤ FX,Y (x1 , y1 ).
4 FX,Y (+∞, +∞) = 1.
5 If x1 < x2 and y1 < y2, then P(x1 < X ≤ x2, y1 < Y ≤ y2) = FX,Y(x2, y2) − FX,Y(x1, y2) − FX,Y(x2, y1) + FX,Y(x1, y1) ≥ 0.

Definition 3 (Joint PDF)
Continuous random variables X and Y have joint probability density function fX,Y(x, y) if, for all (x, y),

FX,Y(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} fX,Y(u, v) dv du    (4)
✍ Given FX,Y (x, y), Definition 3 implies that fX,Y (x, y) is a derivative of the CDF.
Theorem 3
fX,Y(x, y) = ∂²FX,Y(x, y) / (∂x ∂y).
✍ Properties 3 and 4 of Theorem 2 for the CDF FX,Y(x, y) imply corresponding
properties for the PDF.
Theorem 4
A joint PDF fX,Y (x, y) has the following properties:
1 fX,Y (x, y) ≥ 0 for all (x, y).
2 ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} fX,Y(x, y) dx dy = 1.
Theorem 5
For any region A of two-dimensional space,
P[(X, Y) ∈ A] = ∬_A fX,Y(x, y) dx dy.
Example 3
Random variables X and Y have joint PDF
fX,Y(x, y) = { k, 0 ≤ x ≤ 6, 0 ≤ y ≤ 3; 0, otherwise }.

(a) Find the constant k. Since the PDF must integrate to 1 over the 6 × 3 rectangle, k = 1/18.
(b) Let A be the event (2 ≤ X < 4, 2 ≤ Y < 3). P(A) is the integral of the PDF over this rectangle:
P(A) = ∫_2^4 ∫_2^3 (1/18) dy dx = (2 × 1)/18 = 1/9.
✍ We perform an experiment and observe sample values of two random variables X and
Y . Based on our knowledge of the experiment, we have a probability model for X and Y
embodied in the joint PMF pX,Y (x, y) or a joint PDF fX,Y (x, y). After performing the
experiment, we calculate a sample value of the random variable W = g(X, Y ). The
mathematical problem is to derive a probability model for W .
■ When X and Y are discrete random variables, SW, the range of W, is a countable
set corresponding to all possible values of g(X, Y). Therefore, W is a discrete
random variable and has a PMF pW(w).
■ When X and Y are continuous random variables and g(x, y) is a continuous
function, W = g(X, Y) is a continuous random variable; its CDF FW(w) = P[g(X, Y) ≤ w]
can be found by integrating fX,Y(x, y) over the region {g(x, y) ≤ w}, and fW(w) = dFW(w)/dw.
Example 4
In Example 3, let W = max{X, Y }. Find the PDF fW (w) of W .
Solution.
■ We have
FW(w) = P[max(X, Y) < w] = P(X < w, Y < w) = ∬_A (1/18) dx dy,
where A is the part of the region {0 ≤ x ≤ 6, 0 ≤ y ≤ 3} with x < w and y < w.
■ If w ≤ 0, FW(w) = 0.
■ If 0 < w ≤ 3, FW(w) = ∫_0^w ∫_0^w (1/18) dy dx = w²/18.
■ If 3 < w ≤ 6, FW(w) = ∫_0^w ∫_0^3 (1/18) dy dx = w/6.
■ If w > 6, FW(w) = 1.

Figure: the integration regions for (a) 0 < w ≤ 3 and (b) 3 < w ≤ 6.

■ Hence,
FW(w) = { 0 if w ≤ 0; w²/18 if 0 < w ≤ 3; w/6 if 3 < w ≤ 6; 1 if w > 6 }.
■ Therefore,
fW(w) = { w/9 if 0 < w ≤ 3; 1/6 if 3 < w ≤ 6; 0 if w ≤ 0 or w > 6 }.
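✍ A minimal Monte Carlo check (our addition): with the constant density 1/18 on the rectangle, X and Y are independent uniforms on [0, 6] and [0, 3], so we can compare the empirical CDF of W = max(X, Y) with the piecewise formula just derived.

```python
# Monte Carlo sanity check of the CDF of W = max(X, Y) from Example 4.
import random

def F_W(w):
    """Piecewise CDF derived in Example 4."""
    if w <= 0:
        return 0.0
    if w <= 3:
        return w * w / 18
    if w <= 6:
        return w / 6
    return 1.0

random.seed(1)
n = 200_000
samples = [max(random.uniform(0, 6), random.uniform(0, 3)) for _ in range(n)]

for w in (1.0, 2.5, 4.0, 5.5):
    emp = sum(s <= w for s in samples) / n
    print(f"w={w}: empirical {emp:.4f}  vs  F_W(w) {F_W(w):.4f}")
```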
E(W) = E[g(X, Y)] = ∑_{x∈SX} ∑_{y∈SY} g(x, y) pX,Y(x, y)    (W: discrete)    (5)

E(W) = E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fX,Y(x, y) dx dy    (W: continuous)    (6)
Example 5
For X and Y in Example 1, E[g(X, Y)] is computed from (5) by summing g(x, y)pX,Y(x, y) over all points of the joint PMF.
Example 6
For X and Y in Example 3, using (6),
E(XY) = ∫_0^6 ∫_0^3 xy (1/18) dy dx = ∫_0^6 (x/4) dx = 36/8 = 4.5.
Theorem 8
For any two random variables X and Y, E(X + Y) = E(X) + E(Y).
Theorem 9
The variance of the sum of two random variables is
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).
3.1.2 Marginal Probability Distributions

Theorem 10
For discrete random variables X and Y with joint PMF pX,Y(x, y),
pX(x) = ∑_{y∈SY} pX,Y(x, y),    pY(y) = ∑_{x∈SX} pX,Y(x, y)    (7)
✍
(a) Marginal probability distribution of X is
X x1 x2 ... xm
P (X = xi ) P (X = x1 ) P (X = x2 ) . . . P (X = xm )
(b) Marginal probability distribution of Y is
Y y1 y2 ... yn
P (Y = yj ) P (Y = y1 ) P (Y = y2 ) . . . P (Y = yn )
Example 7
In Example 2, we found the joint PMF of X and Y given in the matrix above. Find the marginal PMF pX(x).
Solution.
■ To find pX(x), we note that both X and Y have range {0, 1, 2}. Theorem 10 gives
pX(0) = ∑_{y=0}^{2} pX,Y(0, y) = 0.01,    pX(1) = ∑_{y=0}^{2} pX,Y(1, y) = 0.18,
pX(2) = ∑_{y=0}^{2} pX,Y(2, y) = 0.81,    pX(x) = 0 for x ≠ 0, 1, 2.
Example 8
Referring to the matrix representation of pX,Y(x, y) in Example 1, we observe that
each value of pX(x) is the result of adding all the entries in one row of the matrix.
Each value of pY(y) is a column sum. We display pX(x) and pY(y) by rewriting the
matrix and placing the row sums and column sums in the margins.

pX,Y(x, y)   y = 0     y = 1     y = 2    y = 3    pX(x)
x = 0        1/220     3/55      9/110    1/55     35/220
x = 1        3/44      3/11      3/22     0        105/220
x = 2        3/22      2/11      0        0        70/220
x = 3        1/22      0         0        0        10/220
pY(y)        56/220    112/220   48/220   4/220    1

Note that the sum of all the entries in the bottom margin is 1 and so is the sum of all
the entries in the right margin. The complete marginal PMF, pY(y), appears in the
bottom row of the table, and pX(x) appears in the last column of the table. From the
matrix of pX,Y(x, y), the marginal probability distribution of X is

X            0       1       2      3
P(X = xi)    7/44    21/44   7/22   1/22

and the marginal probability distribution of Y is

Y            0       1       2       3
P(Y = yj)    14/55   28/55   12/55   1/55
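✍ A sketch (our addition) of the row-sum/column-sum computation in (7) with numpy; the matrix entries are the Example 1 probabilities written over the common denominator 220.

```python
# Recover the marginal PMFs of Example 8 as row and column sums
# of the joint-PMF matrix, per (7).
import numpy as np

# Rows are x = 0..3, columns are y = 0..3 (entries in units of 1/220).
P = np.array([[ 1, 12, 18, 4],
              [15, 60, 30, 0],
              [30, 40,  0, 0],
              [10,  0,  0, 0]]) / 220

p_X = P.sum(axis=1)   # row sums    -> [35, 105, 70, 10] / 220
p_Y = P.sum(axis=0)   # column sums -> [56, 112, 48,  4] / 220
print(p_X, p_Y, P.sum())   # the full matrix sums to 1
```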
Remark 1
The expected value E(X) and variance Var(X) can be obtained by first calculating
the marginal probability distribution of X and then determining E(X) and Var(X)
by the usual method.
Example 9
Find the expected value E(X) and variance Var(X) of random variable X in
Example 2.
Solution. Using the result of Example 7,
E(X) = (0)(0.01) + (1)(0.18) + (2)(0.81) = 1.8,
Var(X) = (02 )(0.01) + (12 )(0.18) + (22 )(0.81) − (1.8)2 = 0.18.
The marginal CDF has properties that are direct consequences of the definition. For
example, we note that the event (X < x) suggests that Y can have any value so long as
the condition on X is met. This corresponds to the joint event (X < x, Y < ∞).
Therefore,
FX(x) = P(X < x) = P(X < x, Y < ∞) = lim_{y→∞} FX,Y(x, y) = FX,Y(x, ∞)    (8)
Theorem 11
If the joint probability density function of random variables X and Y is fX,Y (x, y),
the marginal probability density functions of X and Y are
fX(x) = ∫_{−∞}^{+∞} fX,Y(x, y) dy,    fY(y) = ∫_{−∞}^{+∞} fX,Y(x, y) dx    (10)
Example 10
The joint PDF of X and Y is
fX,Y(x, y) = { 5y/4, −1 ≤ x ≤ 1, x² ≤ y ≤ 1; 0, otherwise }.
Find the marginal PDF fX(x).
Solution. For −1 ≤ x ≤ 1,
fX(x) = ∫_{x²}^{1} (5y/4) dy = 5(1 − x⁴)/8.
The complete expression for the marginal PDF of X is
fX(x) = { 5(1 − x⁴)/8, −1 ≤ x ≤ 1; 0, otherwise }.
Example 11
The joint PDF of X and Y is
fX,Y(x, y) = { kx, 0 < y < x < 1; 0, otherwise }.
(c) Normalization gives k = 3, so the marginal PDF of X is fX(x) = ∫_0^x 3x dy = 3x² for 0 < x < 1. We have
E(X) = ∫_0^1 x · 3x² dx = 3/4,
Var(X) = E(X²) − [E(X)]² = ∫_0^1 x² · 3x² dx − (3/4)² = 3/5 − 9/16 = 3/80.
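✍ A symbolic verification of part (c) (our addition), assuming the normalization constant k = 3 derived above.

```python
# Symbolic check of Example 11: f(x, y) = 3x on 0 < y < x < 1,
# so f_X(x) = 3x^2, E(X) = 3/4, Var(X) = 3/80.
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = 3 * x                                   # joint PDF on 0 < y < x < 1

f_X = sp.integrate(f, (y, 0, x))            # marginal of X: 3*x**2
EX  = sp.integrate(x * f_X, (x, 0, 1))      # 3/4
EX2 = sp.integrate(x**2 * f_X, (x, 0, 1))   # 3/5
print(f_X, EX, EX2 - EX**2)                 # 3*x**2, 3/4, 3/80
```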
3.1.3 Conditional Probability Distributions

✍
■ When two random variables are defined in a random experiment, knowledge of one
can change the probabilities that we associate with the values of the other.
■ From the notation for conditional probability in Chapter 1, we can write conditional
probabilities such as P (Y = 1|X = 3) or P (Y = 3|X < 5).
■ Consequently, the random variables X and Y are expected to be dependent.
Knowledge of the value obtained for X changes the probabilities associated with the
values of Y .
■ Recall that the definition of conditional probability for events A and B is
P(B|A) = P(A ∩ B)/P(A). This definition can be applied with event A defined to be
{X = x} and event B defined to be {Y = y}.
Theorem 12
For any event B, a region of the X, Y plane with P (B) > 0,
pX,Y|B(x, y) = { pX,Y(x, y)/P(B), (x, y) ∈ B; 0, otherwise }    (11)

fX,Y|B(x, y) = { fX,Y(x, y)/P(B), (x, y) ∈ B; 0, otherwise }    (12)
Example 12
X and Y are random variables with joint PDF
fX,Y(x, y) = { 1/15, 0 ≤ x ≤ 5, 0 ≤ y ≤ 3; 0, otherwise }.
Find P(B), where B = {X + Y ≥ 4}.
Solution.
■ We calculate P(B) by integrating fX,Y(x, y) over the region B:
P(B) = ∫_0^3 ∫_{4−y}^{5} (1/15) dx dy = (1/15) ∫_0^3 (1 + y) dy = 1/2.
E(W|B) = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} g(x, y) fX,Y|B(x, y) dx dy    (14)
Example 13
Continuing Example 12, find the conditional expected value of W = XY given the
event B = {X + Y ≥ 4}.
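✍ A Monte Carlo sketch (our addition) for Examples 12 and 13: assuming (X, Y) uniform on the 5 × 3 rectangle, it estimates P(B) and the conditional expectation in (14) by averaging over samples that fall in B.

```python
# Estimate P(B) and E(XY | B) for B = {X + Y >= 4},
# (X, Y) uniform on [0, 5] x [0, 3].  P(B) should be close to 1/2.
import random

random.seed(2)
n = 500_000
hits, total_xy = 0, 0.0
for _ in range(n):
    x, y = random.uniform(0, 5), random.uniform(0, 3)
    if x + y >= 4:
        hits += 1
        total_xy += x * y

print("P(B)    ~", hits / n)          # compare with the exact value 1/2
print("E(XY|B) ~", total_xy / hits)   # conditional sample mean over B
```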
Theorem 14
For discrete random variables X and Y with pY(y) > 0, the conditional PMF of X
given Y = y is pX|(Y=y)(x) = P[(X = x)|(Y = y)], where
P[(X = x)|(Y = y)] = P(X = x, Y = y) / P(Y = y).
✍ The following theorem contains the relationship between the joint PMF of X and Y
and the two conditional PMFs, pX|y (x) and pY |x (y).
Theorem 15
For random variables X and Y with joint PMF pX,Y (x, y), and x and y such that
pX (x) > 0 and pY (y) > 0,
pX,Y (x, y) = pX|(Y =y) (x)pY (y) = pY |(X=x) (y)pX (x) (16)
Example 14
In Example 1,
p(X = 0|Y = 1) = P(X = 0, Y = 1)/P(Y = 1) = 3/28,
p(X = 1|Y = 1) = P(X = 1, Y = 1)/P(Y = 1) = 15/28,
p(X = 2|Y = 1) = P(X = 2, Y = 1)/P(Y = 1) = 10/28 = 5/14,
p(X = 3|Y = 1) = P(X = 3, Y = 1)/P(Y = 1) = 0.
The conditional distribution of X given Y = 1 is

X             0      1       2      3
P(X|Y = 1)    3/28   15/28   5/14   0
Example 15
Returning to Example 14,
E(X|(Y = 1)) = 0 × 3/28 + 1 × 15/28 + 2 × 5/14 + 3 × 0 = 1.25.
✍ Now we consider the case in which X and Y are continuous random variables. We
observe (Y = y) and define the PDF of X given (Y = y). We cannot use B = (Y = y)
in Definition 4 because P (Y = y) = 0. Instead, we define a conditional probability
density function, denoted fX|y(x) := fX|(Y=y)(x).
Definition 8 (Conditional PDF)
For y such that fY (y) > 0, the conditional PDF of X given (Y = y) is
fX|y(x) := fX|(Y=y)(x) = fX,Y(x, y) / fY(y).    (19)
Remark 2
Because the conditional probability density function fY |(X=x) (y) is a probability
density function for all y, the following properties are satisfied:
(a) fY|x(y) ≥ 0.
(b) ∫_{−∞}^{+∞} fY|x(y) dy = 1.
(c) P(Y ∈ B|(X = x)) = ∫_B fY|x(y) dy for any set B in the range of Y.
Example 16
The joint probability density function of random variables X and Y is
fX,Y(x, y) = { 2, 0 < y < x < 1; 0, otherwise }.
For 0 < x < 1, find the conditional PDF fY |x (y). For 0 < y < 1, find the conditional
PDF fX|y (x).
Solution.
■ For 0 < x < 1, Theorem 11 implies
fX(x) = ∫_{−∞}^{+∞} fX,Y(x, y) dy = ∫_0^x 2 dy = 2x.
Hence, for 0 < x < 1, fY|x(y) = fX,Y(x, y)/fX(x) = 1/x for 0 < y < x: given X = x, Y is uniform on (0, x).
■ Similarly, fY(y) = ∫_y^1 2 dx = 2(1 − y) for 0 < y < 1, so fX|y(x) = 1/(1 − y) for y < x < 1.
Example 17
The joint probability density function of random variables X and Y is
fX,Y(x, y) = { 1/x, 0 < y < x < 1; 0, otherwise }.
(a) Find the PDFs fX (x), fY (y). (b) Find the conditional PDFs fX|y (x), fY |x (y).
Solution.
(a) The PDFs of X and Y are:
fX(x) = ∫_{−∞}^{+∞} f(x, y) dy = ∫_0^x (1/x) dy = 1 for 0 < x < 1, and fX(x) = 0 otherwise;
fY(y) = ∫_{−∞}^{+∞} f(x, y) dx = ∫_y^1 (1/x) dx = −ln y for 0 < y < 1, and fY(y) = 0 otherwise.
(b) For 0 < y < 1, fX|y(x) = fX,Y(x, y)/fY(y) = −1/(x ln y) for y < x < 1; for 0 < x < 1, fY|x(y) = fX,Y(x, y)/fX(x) = 1/x for 0 < y < x.
Theorem 17
E[g(X, Y)|Y = y] = ∫_{−∞}^{+∞} g(x, y) fX|y(x) dx.    (21)

E(X|Y = y) = ∫_{−∞}^{+∞} x fX|y(x) dx.    (22)
Example 18
For random variables X and Y in Example 16, we found that the conditional PDF of
X given Y is
fX|y(x) = fX,Y(x, y)/fY(y) = { 1/(1 − y), y < x < 1; 0, otherwise }.
Find E(X|Y = y).
Solution.
■ Given the conditional PDF fX|y(x), we perform the integration
E[X|Y = y] = ∫_{−∞}^{+∞} x fX|y(x) dx = ∫_y^1 x/(1 − y) dx = x²/(2(1 − y)) |_{x=y}^{x=1} = (1 + y)/2.
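✍ A quick simulation check of this result (our addition): we sample X with density 2x via the inverse-CDF map x = √u, draw Y uniform on (0, x), and average X over samples with Y near a fixed y.

```python
# Monte Carlo check that E[X | Y = y] = (1 + y)/2 for the PDF of Example 16
# (f = 2 on 0 < y < x < 1).
import random

random.seed(3)
pairs = []
for _ in range(400_000):
    x = random.random() ** 0.5     # f_X(x) = 2x via inverse CDF: x = sqrt(u)
    y = random.uniform(0, x)       # Y | X = x is uniform on (0, x)
    pairs.append((x, y))

for y0 in (0.2, 0.5, 0.8):
    sel = [x for x, y in pairs if abs(y - y0) < 0.01]
    print(f"y={y0}: E[X|Y~y] ~ {sum(sel)/len(sel):.3f}"
          f"  vs  (1+y)/2 = {(1 + y0)/2:.3f}")
```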
3.1.4 Independence

Definition (Independence)
Random variables X and Y are independent if and only if pX,Y(x, y) = pX(x)pY(y) for all x, y (discrete case), or fX,Y(x, y) = fX(x)fY(y) for all x, y (continuous case).

✍ Theorem 15 implies that if X and Y are independent discrete random variables, then pX|(Y=y)(x) = pX(x) and pY|(X=x)(y) = pY(y).
Example 19
Random variables X and Y have joint PDF
fX,Y(x, y) = { 4xy, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0, otherwise }.
Are X and Y independent?
Solution.
■ The marginal PDFs of X and Y are
fX(x) = { 2x, 0 ≤ x ≤ 1; 0, otherwise },    fY(y) = { 2y, 0 ≤ y ≤ 1; 0, otherwise }.
■ It is easily verified that fX,Y (x, y) = fX (x)fY (y) for all pairs (x, y) and so we
conclude that X and Y are independent.
Theorem 18
For independent random variables X and Y,
■ E[g(X)h(Y)] = E[g(X)]E[h(Y)],
■ E(XY) = E(X)E(Y),
where g and h are functions of X and Y, respectively.
Example 20
Let the random variables X and Y denote the lengths of two dimensions of a
machined part, respectively. Assume that X and Y are independent random
variables, and further assume that the distribution of X is normal with mean 10.5
millimeters and variance 0.0025 (millimeter)2 and that the distribution of Y is
normal with mean 3.2 millimeters and variance 0.0036 (millimeter)2 . Determine the
probability that 10.4 < X < 10.6 and 3.15 < Y < 3.25.
Solution. By independence,
P(10.4 < X < 10.6, 3.15 < Y < 3.25) = P(10.4 < X < 10.6) P(3.15 < Y < 3.25)
= [Φ((10.6 − 10.5)/0.05) − Φ((10.4 − 10.5)/0.05)] × [Φ((3.25 − 3.2)/0.06) − Φ((3.15 − 3.2)/0.06)]
= [Φ(2) − Φ(−2)] × [Φ(0.83) − Φ(−0.83)]
= (2 × 0.97725 − 1)(2 × 0.79673 − 1) ≈ 0.5665.
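✍ The same computation sketched with scipy.stats (our addition); the small discrepancy from 0.5665 comes from the slides rounding z to 0.83 before reading the table.

```python
# Check of Example 20: independence lets us multiply the two
# marginal normal probabilities.
from scipy.stats import norm

X = norm(loc=10.5, scale=0.05)   # sd = sqrt(0.0025)
Y = norm(loc=3.2,  scale=0.06)   # sd = sqrt(0.0036)

p = (X.cdf(10.6) - X.cdf(10.4)) * (Y.cdf(3.25) - Y.cdf(3.15))
print(p)   # ~0.568; the slides' table-rounded value is 0.5665
```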
Exercises for Section 3.1

Exercise 1
Random variables X and Y have the joint PMF
pX,Y(x, y) = { cxy, x = 1, 2, 4, y = 1, 3; 0, otherwise }.
Exercise 2
Random variables X and Y have the joint PMF
pX,Y(x, y) = { c|x + y|, x = −2, 0, 2, y = −1, 0, 1; 0, otherwise }.
Exercise 3
Given the random variables X and Y in Exercise 1, find
(a) The marginal PMFs pX (x) and pY (y).
(b) The expected values E(X) and E(Y ). Sol. 3 and 5/2.
(c) The standard deviations σX and σY .
Exercise 4
Given the random variables X and Y in Exercise 2, find
(a) The marginal PMFs pX (x) and pY (y),
(b) The expected values E(X) and E(Y ),
(c) The standard deviations σX and σY .
Exercise 5
Random variables X and Y have the joint probability distribution
pX,Y(x, y)   y = 1   y = 2   y = 3
x = 1        0.12    0.15    0.03
x = 2        0.28    0.35    0.07
Exercise 6
Given random variables X and Y in Exercise 2 and the function W = X + 2Y, find
(a) The probability mass function PW (w).
(b) The expected value E(W ). Sol. 0.
(c) P (W > 0). Sol. 3/7.
Exercise 7
Random variables X and Y have the joint PDF
fX,Y(x, y) = { c, x + y ≤ 1, x ≥ 0, y ≥ 0; 0, otherwise }.
Exercise 8
Random variables X and Y have joint PDF
fX,Y(x, y) = { cxy², 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0, otherwise }.
Exercise 9
X and Y are random variables with the joint PDF
fX,Y(x, y) = { 2, x + y ≤ 1, x ≥ 0, y ≥ 0; 0, otherwise }.
Exercise 10
Random variables X and Y have the joint PDF
fX,Y(x, y) = { cy, 0 ≤ y ≤ x ≤ 1; 0, otherwise }.
Exercise 11
Random variables X and Y have joint PDF
fX,Y(x, y) = { x + y, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0, otherwise }.
Let W = max(X, Y ).
(a) What is SW , the range of W ?
(b) Find FW (w) and fW (w).
Exercise 12
Random variables X and Y have joint PDF
fX,Y(x, y) = { 5x²/2, −1 ≤ x ≤ 1, 0 ≤ y ≤ x²; 0, otherwise }.
Let A = {Y ≤ 1/4}.
(a) What is the conditional PDF fX,Y |A (x, y)?
(b) What is fY |A (y)?
(c) What is E(Y |A)?
(d) What is fX|A (x)?
(e) What is E(X|A)?
Exercise 13
X and Y have joint PDF
fX,Y(x, y) = { (4x + 2y)/3, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0, otherwise }.
(a) For which values of y is fX|Y =y (x) defined? What is fX|Y =y (x)?
(b) For which values of x is fY |X=x (y) defined? What is fY |X=x (y)?
3.2 COVARIANCE AND CORRELATION
3.2.1 Covariance. Covariance Matrix

(a) Covariance
Definition 12 (Covariance)
The covariance of two random variables X and Y is
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y).
✍ If the points in the joint probability distribution of X and Y that receive positive
probability tend to fall along a line of positive (or negative) slope, Cov(X, Y ) is positive
(or negative).
Figure 3: Joint probability distributions and the sign of covariance between X and Y: (a) positive covariance; (b) negative covariance.
Figure 4: Joint probability distributions with zero covariance between X and Y.
Example 21
In Example 1, the random variables X and Y are the number of blue pens and red
pens selected, respectively. Is the covariance between X and Y positive or negative?
Solution. As the number of blue pens selected increases, the number of red pens selected tends to decrease.
Therefore, X and Y have a negative covariance. This can be verified from the joint
probability distribution in Table 1.
✍ The equality of the two expressions for covariance in Definition 12 follows by expanding (X − E(X))(Y − E(Y)) and applying linearity of expectation.
Example 22
For the integrated circuit tests in Example 2, the joint PMF of X and Y is given by the matrix in Example 2, with the marginal pX(x) found in Example 7.
Find Cov(X, Y ).
Solution.
■ By (5),
E(XY) = ∑_{x=0}^{2} ∑_{y=0}^{2} x y pX,Y(x, y) = 1 × 1 × 0.09 + 2 × 2 × 0.81 = 3.33.
■ From the marginals, E(X) = 1.8 (Example 9) and E(Y) = 1 × 0.09 + 2 × 0.81 = 1.71, so
Cov(X, Y) = E(XY) − E(X)E(Y) = 3.33 − 1.8 × 1.71 = 0.252.
Theorem 19
(a) Cov(X, Y ) = Cov(Y, X).
(b) Var(X) = Cov(X, X) and Var(Y ) = Cov(Y, Y ).
(c) If X and Y are independent random variables, Cov(X, Y) = 0. However, if the
covariance between two random variables is zero, we cannot immediately
conclude that the random variables are independent.
(d) Cov(aX, Y ) = aCov(X, Y ).
(e) Cov(X + Z, Y ) = Cov(X, Y ) + Cov(Z, Y ).
Example 23
The joint probability distribution of X and Y is

pX,Y(x, y)   y = −1   y = 0   y = 1
x = −1       4/15     1/15    4/15
x = 0        1/15     2/15    1/15
x = 1        0        2/15    0

Find Cov(X, Y).
Solution.
(a) We have
E(X) = (−1) × 9/15 + 0 × 4/15 + 1 × 2/15 = −7/15,
E(Y) = (−1) × 5/15 + 0 × 5/15 + 1 × 5/15 = 0,
E(XY) = (−1)(−1) × 4/15 + (−1)(1) × 4/15 + (1)(−1) × 0 + (1)(1) × 0 = 0.
Hence Cov(X, Y) = E(XY) − E(X) × E(Y) = 0.
The marginal distributions are

X   −1     0      1         Y   −1     0      1
P   9/15   4/15   2/15      P   5/15   5/15   5/15

Note that Cov(X, Y) = 0 even though X and Y are not independent: pX,Y(1, −1) = 0 ≠ pX(1)pY(−1).
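✍ A sketch (our addition) of the covariance computation directly from the joint table with numpy; the matrix is the three-row table above, with probabilities in units of 1/15.

```python
# Compute Cov(X, Y) for Example 23 from the joint table.
import numpy as np

xs = np.array([-1, 0, 1])
ys = np.array([-1, 0, 1])
P = np.array([[4, 1, 4],
              [1, 2, 1],
              [0, 2, 0]]) / 15      # rows: x = -1, 0, 1

EX  = (xs * P.sum(axis=1)).sum()    # -7/15
EY  = (ys * P.sum(axis=0)).sum()    # 0
EXY = (np.outer(xs, ys) * P).sum()  # 0
print(EX, EY, EXY - EX * EY)        # covariance = 0
```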
Definition 13 (Correlation)
The correlation of X and Y is rX,Y = E(XY ).
Theorem 20
Let a, b, and c be real numbers. Then
Var(aX + bY + c) = a²Var(X) + b²Var(Y) + 2abCov(X, Y).
✍
■ If a = 1, b = 1, and c = 0, Var(X + Y ) = Var(X) + Var(Y ) + 2Cov(X, Y ).
■ If a = 1, b = −1, and c = 0, Var(X − Y ) = Var(X) + Var(Y ) − 2Cov(X, Y ).
Corollary 1
If X and Y are independent, Cov(X, Y) = 0 and
Var(X ± Y) = Var(X) + Var(Y).
3.2.2 Correlation Coefficient
✍ There is another measure of the relationship between two random variables that is
often easier to interpret than the covariance.
Definition 17 (Correlation Coefficient)
The correlation coefficient of two random variables X and Y is
ρX,Y = Cov(X, Y) / √(Var(X)Var(Y)) = Cov(X, Y) / (σX σY).    (29)
Example 24
The joint PDF of two continuous random variables X and Y is
fX,Y(x, y) = { (1/16)xy, 0 ≤ x ≤ 2, 0 ≤ y ≤ 4; 0, otherwise }.
Find ρX,Y .
Solution.
■ We have
E(X) = ∫_0^2 ∫_0^4 (1/16) x² y dy dx = 4/3,    E(Y) = ∫_0^2 ∫_0^4 (1/16) x y² dy dx = 8/3,
E(XY) = ∫_0^2 ∫_0^4 (1/16) x² y² dy dx = 32/9.
■ Hence,
Cov(X, Y) = 32/9 − (4/3) × (8/3) = 0.
■ Therefore, ρX,Y = 0.
Theorem 21
For any pair of random variables X and Y,
−1 ≤ ρX,Y ≤ 1.
If ρXY equals +1 or −1, it can be shown that the points in the joint probability
distribution that receive positive probability fall exactly along a straight line. Two random
variables with nonzero correlation are said to be correlated. Similar to covariance, the
correlation is a measure of the linear relationship between random variables.
Example 25
Suppose that the random variable X has the following distribution:
PX(x) = { 0.2 if x = 1; 0.6 if x = 2; 0.2 if x = 3; 0 otherwise }.
Let Y = 2X + 5. Then
PY(y) = { 0.2 if y = 7; 0.6 if y = 9; 0.2 if y = 11; 0 otherwise }.
Figure 5: The joint distribution of X and Y = 2X + 5: all probability mass lies on a line of positive slope, so ρ = 1.

Refer to Figure 5. Because X and Y are linearly related, ρX,Y = 1. This can be verified by direct calculation using (29).
Theorem 22
If X and Y are random variables such that Y = aX + b,
ρX,Y = { −1, a < 0; 0, a = 0; 1, a > 0 }.
Theorem 23
If X and Y are independent,
ρX,Y = 0. (30)
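✍ A short exact computation (our addition) illustrating Theorem 22 with the distribution of Example 25, where Y = 2X + 5 and a = 2 > 0.

```python
# Exact check that rho = 1 for Y = 2X + 5 in Example 25, via (29).
xs = [1, 2, 3]
ps = [0.2, 0.6, 0.2]
ys = [2 * x + 5 for x in xs]

EX  = sum(p * x for p, x in zip(ps, xs))
EY  = sum(p * y for p, y in zip(ps, ys))
EXY = sum(p * x * y for p, x, y in zip(ps, xs, ys))
VX  = sum(p * x * x for p, x in zip(ps, xs)) - EX**2
VY  = sum(p * y * y for p, y in zip(ps, ys)) - EY**2

rho = (EXY - EX * EY) / (VX * VY) ** 0.5
print(rho)   # 1.0, since a = 2 > 0
```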
Exercises for Section 3.2

Exercise 14
For the random variables X and Y in Exercise 1, find
(a) The expected value of W = Y/X.
(b) The correlation, E(XY ).
(c) The covariance, Cov(X, Y ).
(d) The correlation coefficient, ρX,Y .
(e) The variance of X + Y , Var(X + Y ).
Exercise 15
Random variables X and Y have the joint probability distribution

pX,Y(x, y)   y = 1   y = 2   y = 3
x = 1        0.17    0.13    0.25
x = 2        0.10    0.30    0.05
Exercise 16
Random variables X and Y have joint PDF
fX,Y(x, y) = { (x + y)/3, 0 ≤ x ≤ 1, 0 ≤ y ≤ 2; 0, otherwise }.
3.3 BIVARIATE NORMAL DISTRIBUTION
3.3.1 Joint Probability Distributions

Random variables X and Y have a bivariate normal distribution if their joint PDF is

fX,Y(x, y) = 1/(2πσXσY√(1 − ρ²)) × exp{ −1/(2(1 − ρ²)) [ (x − µX)²/σX² − 2ρ(x − µX)(y − µY)/(σXσY) + (y − µY)²/σY² ] }    (31)

where −∞ < x, y < +∞, σX, σY > 0, −∞ < µX, µY < +∞, and −1 < ρ < 1.
✍ If µX = µY = 0, σX = σY = 1, and ρ = 0,
fX,Y(x, y) = (1/2π) e^{−(x² + y²)/2} for all x, y    (32)
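✍ A minimal sketch (our addition) that evaluates the density (31) on a grid; feeding the grid to any contouring tool gives the elliptical contour plots mentioned in the chapter objectives.

```python
# Evaluate the bivariate normal joint PDF (31) on a grid of points.
import numpy as np

def bvn_pdf(x, y, mx=0.0, my=0.0, sx=1.0, sy=1.0, rho=0.0):
    """Bivariate normal joint PDF, formula (31)."""
    zx, zy = (x - mx) / sx, (y - my) / sy
    q = (zx**2 - 2 * rho * zx * zy + zy**2) / (2 * (1 - rho**2))
    return np.exp(-q) / (2 * np.pi * sx * sy * np.sqrt(1 - rho**2))

x, y = np.meshgrid(np.linspace(-3, 3, 5), np.linspace(-3, 3, 5))
print(bvn_pdf(x, y, rho=0.6).round(4))   # grid of density values
print(bvn_pdf(0, 0))                     # 1/(2*pi) ~ 0.1592, matching (32)
```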
3.3.2 Marginal Probability Distributions

Theorem 24
If X and Y have a bivariate normal distribution with joint probability density
fX,Y(x, y), the marginal probability distributions of X and Y are normal with
means µX and µY and standard deviations σX and σY, respectively, and
fX(x) = (1/(σX√(2π))) e^{−(x − µX)²/(2σX²)}    and    fY(y) = (1/(σY√(2π))) e^{−(y − µY)²/(2σY²)}.
3.3.3 Conditional Probability Distributions

Theorem 25
If X and Y have a bivariate normal distribution with joint probability density
fX,Y(x, y), the conditional probability distribution of Y given X = x is normal with
mean
µY|x = µY + ρ (σY/σX)(x − µX)    (33)
and variance
σ²Y|x = σY²(1 − ρ²).    (34)
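✍ A small helper (our addition) applying (33) and (34); the numeric parameters are made up for illustration and are not taken from the slides.

```python
# Conditional distribution of Y given X = x for a bivariate normal.
from scipy.stats import norm

mx, my, sx, sy, rho = 10.0, 20.0, 2.0, 3.0, 0.5   # hypothetical parameters
x0 = 12.0

mu_cond = my + rho * (sy / sx) * (x0 - mx)        # formula (33)
sd_cond = sy * (1 - rho**2) ** 0.5                # square root of (34)
print(mu_cond, sd_cond)                           # 21.5, ~2.598
print(norm(mu_cond, sd_cond).cdf(24.0))           # P(Y < 24 | X = 12)
```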
Exercises for Section 3.3

Exercise 17
Suppose X and Y have a bivariate normal distribution with σX = 0.04,
σY = 0.08, µX = 3.00, µY = 7.70, and ρ = 0. Determine the following:
(a) P (2.95 < X < 3.05).
(b) P (7.60 < Y < 7.80).
(c) P (2.95 < X < 3.05, 7.60 < Y < 7.80).
Exercise 18
In an acid-base titration, a base or acid is gradually added to the other until they
have completely neutralized each other. Let X and Y denote the milliliters of acid
and base needed for equivalence, respectively. Assume X and Y have a bivariate
normal distribution with σX = 5 mL, σY = 2 mL, µX = 120 mL, µY = 100 mL, and
ρ = 0.6. Determine the following:
(a) Covariance between X and Y .
(b) Marginal probability distribution of X.
(c) P (X < 116).
(d) Conditional probability distribution of X given that Y = 102.
(e) P (X < 116|Y = 102).
3.4 CENTRAL LIMIT THEOREM
3.4.1 Central Limit Theorem
✍ The accuracy of the normal approximation depends on the sample size n. The following figures
show the distributions of average scores obtained when tossing one, two, three, and five
dice, respectively.
■ To use the central limit theorem, we observe that we can express the iid sum
Wn = X1 + X2 + · · · + Xn as
Wn = σ√n Zn + nµ,    where Zn = (Wn − nµ)/(σ√n).    (36)
For large n, the central limit theorem says that FZn(z) ≈ Φ(z). This approximation
is the basis for practical applications of the central limit theorem.
Figures 7 and 8: the distribution of the average score for n = 1 and n = 2 dice.
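✍ A simulation sketch (our addition) of the experiment behind these figures: the sample mean of n dice concentrates around 3.5 with variance (35/12)/n.

```python
# Simulate the average of n fair dice and compare the sample variance
# with the theoretical value (35/12)/n.
import random, statistics

random.seed(4)
for n in (1, 2, 3, 5):
    means = [sum(random.randint(1, 6) for _ in range(n)) / n
             for _ in range(50_000)]
    print(n, round(statistics.mean(means), 3),       # ~3.5
          round(statistics.variance(means), 3),      # sample variance
          round(35 / 12 / n, 3))                     # theoretical variance
```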
✍ For a binomial (n, p) random variable K, the De Moivre–Laplace approximation is
P(k1 ≤ K ≤ k2) ≈ Φ((k2 + 0.5 − np)/√(np(1 − p))) − Φ((k1 − 0.5 − np)/√(np(1 − p))).
To appreciate why the ±0.5 terms increase the accuracy of approximation, consider
the following simple but dramatic example in which k1 = k2. Let K be binomial with
n = 20 and p = 0.4, so that E(K) = 8 and Var(K) = 4.8. Then
P(8 ≤ K ≤ 8) ≈ Φ(0.5/√4.8) − Φ(−0.5/√4.8) = 2Φ(0.23) − 1 = 0.1819.
■ The exact value is P(K = 8) = C_20^8 (0.4)^8 (0.6)^12 ≈ 0.1797.
3.4.2 Applications of the Central Limit Theorem
Example 28
K is the number of heads in 100 flips of a fair coin. What is P (50 ≤ K ≤ 51)?
Solution.
■ Since K is a binomial (n = 100, p = 1/2) random variable,
P(50 ≤ K ≤ 51) = P(K = 50) + P(K = 51) = C_100^50 (0.5)^100 + C_100^51 (0.5)^100 ≈ 0.1576.
■ Since E(K) = 50 and σK = 5, the ordinary central limit theorem approximation
produces
P(50 ≤ K ≤ 51) ≈ Φ((51 − 50)/√25) − Φ((50 − 50)/√25) = Φ(0.2) − Φ(0)
= 0.57926 − 0.5 = 0.0793.
■ This approximation error of roughly 50% occurs because the ordinary central limit
theorem approximation ignores the fact that the discrete random variable K has
two probability masses in an interval of length 1.
■ As we see next, the De Moivre–Laplace approximation is far more accurate.
P(50 ≤ K ≤ 51) ≈ Φ((51 + 0.5 − 50)/√25) − Φ((50 − 0.5 − 50)/√25) = Φ(0.3) − Φ(−0.1)
= 0.61791 + 0.53983 − 1 = 0.1577.
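✍ The three quantities can be compared numerically (our addition) with scipy.stats.

```python
# Example 28: exact binomial vs ordinary CLT vs De Moivre-Laplace.
from scipy.stats import binom, norm

n, p = 100, 0.5
mu, sd = n * p, (n * p * (1 - p)) ** 0.5           # 50 and 5

exact = binom.pmf(50, n, p) + binom.pmf(51, n, p)
clt   = norm.cdf((51 - mu) / sd) - norm.cdf((50 - mu) / sd)
dml   = norm.cdf((51.5 - mu) / sd) - norm.cdf((49.5 - mu) / sd)
print(exact, clt, dml)   # ~0.1576, ~0.0793, ~0.1577
```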
✍
■ Although the central limit theorem approximation provides a useful means of
calculating events related to complicated probability models, it has to be used with
caution. When the events of interest are confined to outcomes at the edge of the
range of a random variable, the central limit theorem approximation can be quite
inaccurate.
■ In these applications, it is necessary to resort to more complicated methods than a
central limit theorem approximation to obtain useful results. In particular, it is often
desirable to provide guarantees in the form of an upper bound rather than the
approximation offered by the central limit theorem.
Exercise 19
Let X1 , X2 , . . . be an iid sequence of Poisson random variables, each with expected
value E(X) = 1. Let Wn = X1 + · · · + Xn . Use the improved central limit theorem
approximation to estimate P (Wn = n). For n = 4, 25, 64, compare the
approximation to the exact value of P (Wn = n).
Exercise 20
Integrated circuits from a certain factory pass a certain quality test with probability
0.8. The outcomes of all tests are mutually independent.
(a) What is the expected number of tests necessary to find 500 acceptable circuits?
Sol. 625
(b) Use the central limit theorem to estimate the probability of finding 500
acceptable circuits in a batch of 600 circuits.
(c) Use the central limit theorem to calculate the minimum batch size for finding
500 acceptable circuits with probability 0.9 or greater.
Appendix: Values of the standard normal CDF Φ(z). The row gives z to one decimal place, the column gives the second decimal digit; table entries are the digits of Φ(z) after "0.".

z 0 1 2 3 4 5 6 7 8 9
0.0 0.50000 50399 50798 51197 51595 51994 52392 52790 53188 53586
0.1 53983 54380 54776 55172 55567 55962 56356 56749 57142 57535
0.2 57926 58317 58706 59095 59483 59871 60257 60642 61026 61409
0.3 61791 62172 62556 62930 63307 63683 64058 64431 64803 65173
0.4 65542 65910 66276 66640 67003 67364 67724 68082 68439 68793
0.5 69146 69497 69847 70194 70540 70884 71226 71566 71904 72240
0.6 72575 72907 73237 73565 73891 74215 74537 74857 75175 75490
0.7 75804 76115 76424 76730 77035 77337 77637 77935 78230 78524
0.8 78814 79103 79389 79673 79955 80234 80511 80785 81057 81327
0.9 81594 81859 82121 82381 82639 82894 83147 83398 83646 83891
1.0 84134 84375 84614 84850 85083 85314 85543 85769 85993 86214
1.1 86433 86650 86864 87076 87286 87493 87698 87900 88100 88298
1.2 88493 88686 88877 89065 89251 89435 89617 89796 89973 90147
1.3 90320 90490 90658 90824 90988 91149 91309 91466 91621 91774
1.4 91924 92073 92220 92364 92507 92647 92786 92922 93056 93189
1.5 93319 93448 93574 93699 93822 93943 94062 94179 94295 94408
1.6 94520 94630 94738 94845 94950 95053 95154 95254 95352 95449
1.7 95543 95637 95728 95818 95907 95994 96080 96164 96246 96327
1.8 96407 96485 96562 96638 96712 96784 96856 96926 96995 97062
1.9 97128 97193 97257 97320 97381 97441 97500 97558 97615 97670
z 0 1 2 3 4 5 6 7 8 9
2.0 97725 97778 97831 97882 97932 97982 98030 98077 98124 98169
2.1 98214 98257 98300 98341 98382 98422 98461 98500 98537 98574
2.2 98610 98645 98679 98713 98745 98778 98809 98840 98870 98899
2.3 98928 98956 98983 99010 99036 99061 99086 99111 99134 99158
2.4 99180 99202 99224 99245 99266 99285 99305 99324 99343 99361
2.5 99379 99396 99413 99430 99446 99461 99477 99492 99506 99520
2.6 99534 99547 99560 99573 99585 99598 99609 99621 99632 99643
2.7 99653 99664 99674 99683 99693 99702 99711 99720 99728 99736
2.8 99744 99752 99760 99767 99774 99781 99788 99795 99801 99807
2.9 99813 99819 99825 99831 99836 99841 99846 99851 99856 99861
z      3.0       3.1       3.2       3.3       3.4       3.5       3.6       3.7       3.8       3.9
Φ(z)   0.99865   0.99903   0.99931   0.99952   0.99966   0.99977   0.99984   0.99989   0.99993   0.99995

Φ(4.0) = 0.999968,   Φ(4.5) = 0.999997,   Φ(5.0) = 0.9999997.