National University of Singapore
Department of Statistics and Data Science
ST2334: Probability and Statistics
Formulae and Facts
1. Probability Rules
For events A and B in the sample space S:
(i) P(A) = P(A ∩ B) + P(A ∩ B′)
(ii) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
(iii) P(B|A) = P(A ∩ B) / P(A)
(iv) P(A ∩ B) = P(A) P(B|A)
(v) P(B) = P(A) P(B|A) + P(A′) P(B|A′)
(vi) P(A|B) = P(A ∩ B) / P(B)
(vii) P(A|B) = P(A) P(B|A) / [P(A) P(B|A) + P(A′) P(B|A′)]
Note: We need P(A) > 0 for items (iii) onwards, and P(B) > 0 for items (vi) onwards.
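As a quick numerical illustration of rules (v) and (vii), here is a minimal Python sketch; the probabilities P(A) = 0.3, P(B|A) = 0.8 and P(B|A′) = 0.25 are hypothetical values chosen only for the example, not values from the module.

# Hypothetical inputs, chosen only for illustration.
p_A = 0.3            # P(A)
p_B_given_A = 0.8    # P(B | A)
p_B_given_Ac = 0.25  # P(B | A')

# Rule (v): law of total probability.
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_Ac

# Rule (vii): Bayes' rule.
p_A_given_B = p_A * p_B_given_A / p_B

print(f"P(B)   = {p_B:.4f}")          # P(B)   = 0.4150
print(f"P(A|B) = {p_A_given_B:.4f}")  # P(A|B) = 0.5783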
2. Random Variables
(i) Let X and Y be two random variables, and let a and b be any real numbers. Then
    E(aX + b) = aE(X) + b,
    E(X + Y) = E(X) + E(Y).
(ii) Let g(·) be an arbitrary function. Then
    E[g(X)] = ∑_{x ∈ R_X} g(x) f(x)       if X is discrete;
    E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx     if X is continuous.
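As a small worked check of (i) and (ii), the sketch below assumes a hypothetical discrete X taking values 0, 1, 2 with probabilities 0.2, 0.5, 0.3 and evaluates E[g(X)] directly from the definition; it also confirms E(2X + 1) = 2E(X) + 1.

# Hypothetical pmf of a discrete X (values chosen only for illustration).
xs = [0, 1, 2]
probs = [0.2, 0.5, 0.3]  # sums to 1

def expect(g):
    # E[g(X)] = sum of g(x) f(x) over the support of X (discrete case).
    return sum(g(x) * p for x, p in zip(xs, probs))

e_x = expect(lambda x: x)            # E(X) = 1.1
e_lin = expect(lambda x: 2 * x + 1)  # E(2X + 1) = 3.2 = 2 * E(X) + 1
e_sq = expect(lambda x: x ** 2)      # E(X^2) = 1.7

print(e_x, e_lin, e_sq)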
3. Joint Distributions
(i) Random variables X and Y are independent if and only if, for any x and y,
    f_{X,Y}(x, y) = f_X(x) f_Y(y).
(ii) Consider any two-variable function g(x, y). Then
    E[g(X, Y)] = ∑_x ∑_y g(x, y) f_{X,Y}(x, y)        if (X, Y) is discrete;
    E[g(X, Y)] = ∫∫_{R²} g(x, y) f_{X,Y}(x, y) dy dx   if (X, Y) is continuous.
(iii) The covariance of X and Y is defined to be
    cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y).
(iv) cov(aX + b, cY + d) = ac · cov(X, Y).
(v) V(aX + bY) = a²V(X) + b²V(Y) + 2ab · cov(X, Y).
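The sketch below uses a small hypothetical joint pmf (chosen only for illustration) to evaluate cov(X, Y) both ways in (iii), and to confirm the variance identity in (v) with a = 2, b = 3.

# Hypothetical joint pmf f_{X,Y}(x, y); the four probabilities sum to 1.
pmf = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def E(g):
    # E[g(X, Y)] = sum over (x, y) of g(x, y) f_{X,Y}(x, y).
    return sum(g(x, y) * p for (x, y), p in pmf.items())

ex, ey = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: (x - ex) * (y - ey))   # definition
cov_alt = E(lambda x, y: x * y) - ex * ey   # shortcut E(XY) - E(X)E(Y)

vx = E(lambda x, y: (x - ex) ** 2)
vy = E(lambda x, y: (y - ey) ** 2)
a, b = 2, 3
v_comb = E(lambda x, y: (a * x + b * y - (a * ex + b * ey)) ** 2)

print(cov, cov_alt)                                     # both -0.02
print(v_comb, a**2 * vx + b**2 * vy + 2 * a * b * cov)  # both 2.76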
4. Common Probability Distributions
(i) Binomial(n, p)
    f_X(x) = (n choose x) p^x (1 − p)^{n−x},  x = 0, 1, 2, ..., n.
    E(X) = np,  V(X) = np(1 − p).
(ii) Negative Binomial(k, p)
    f_X(x) = (x−1 choose k−1) p^k (1 − p)^{x−k},  x = k, k + 1, ....
    E(X) = k/p,  V(X) = k(1 − p)/p².
(iii) Geometric(p)
    f_X(x) = (1 − p)^{x−1} p,  x = 1, 2, ....
    E(X) = 1/p,  V(X) = (1 − p)/p².
(iv) Poisson(λ)
    f_X(x) = e^{−λ} λ^x / x!,  x = 0, 1, 2, ....
    E(X) = λ,  V(X) = λ.
(v) Uniform(a, b)
    f_X(x) = 1/(b − a) for a ≤ x ≤ b;  0 otherwise.
    E(X) = (a + b)/2,  V(X) = (b − a)²/12.
(vi) Exponential(λ)
    f_X(x) = λ e^{−λx} for x ≥ 0;  0 for x < 0.
    E(X) = 1/λ,  V(X) = 1/λ².
(vii) Normal(µ, σ²)
    f_X(x) = 1/(√(2π) σ) · e^{−(x−µ)²/(2σ²)},  −∞ < x < ∞.
    E(X) = µ,  V(X) = σ².
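Where SciPy is available, the means and variances above can be cross-checked numerically; the sketch below does this for a few of the distributions, with parameter values chosen arbitrarily for illustration. Note that scipy.stats parameterises the uniform and exponential distributions by loc/scale rather than by (a, b) and λ.

# Cross-check E(X) and V(X) against scipy.stats (illustrative parameters only).
from scipy import stats

n, p, lam, a, b = 10, 0.3, 2.0, 1.0, 5.0

print(stats.binom.stats(n, p, moments="mv"))    # Binomial: (np, np(1-p)) = (3.0, 2.1)
print(stats.geom.stats(p, moments="mv"))        # Geometric: (1/p, (1-p)/p^2)
print(stats.poisson.stats(lam, moments="mv"))   # Poisson: (lambda, lambda)
print(stats.uniform.stats(loc=a, scale=b - a, moments="mv"))  # Uniform(a, b): ((a+b)/2, (b-a)^2/12)
print(stats.expon.stats(scale=1 / lam, moments="mv"))         # Exponential: (1/lambda, 1/lambda^2)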
5. Confidence Intervals / Test Statistics: Population Mean
The table below gives
• the 100(1 − α)% confidence interval formulas for the population mean µ,
• the test statistics for the (null) hypothesis H0: µ = µ0.
Case  Population  σ        n      Confidence Interval         Test Statistic
I     Normal      known    any    x̄ ± z_{α/2} · σ/√n          Z = (X̄ − µ0)/(σ/√n) ∼ N(0, 1)
II    any         known    large  x̄ ± z_{α/2} · σ/√n          Z = (X̄ − µ0)/(σ/√n) ∼ N(0, 1)
III   Normal      unknown  small  x̄ ± t_{n−1, α/2} · s/√n     T = (X̄ − µ0)/(S/√n) ∼ t_{n−1}
IV    any         unknown  large  x̄ ± z_{α/2} · s/√n          Z = (X̄ − µ0)/(S/√n) ∼ N(0, 1)
Note that n is considered large when n ≥ 30.
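As a usage sketch for Case III (normal population, σ unknown, small n), the code below builds the t-based confidence interval and test statistic from hypothetical summary statistics; scipy.stats.t supplies the critical value and p-value.

from math import sqrt
from scipy import stats

# Hypothetical sample summary (illustration only): size, sample mean, sample sd.
n, xbar, s = 16, 10.4, 2.0
mu0, alpha = 9.5, 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)      # t_{n-1, alpha/2}
half_width = t_crit * s / sqrt(n)
ci = (xbar - half_width, xbar + half_width)        # 95% CI for mu

t_stat = (xbar - mu0) / (s / sqrt(n))              # T ~ t_{n-1} under H0: mu = mu0
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)    # two-sided p-value

print(ci, t_stat, p_value)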
6. Confidence Intervals / Test Statistics: Difference of Population Means
The table below, for two independent samples, gives
• the 100(1 − α)% confidence interval formulas for µ1 − µ2 ,
• the test statistics for the (null) hypothesis H0: µ1 = µ2.
Populations  σ1, σ2            n1, n2              Confidence Interval                               Test Statistic
Any          known, unequal    n1 ≥ 30, n2 ≥ 30    (x̄ − ȳ) ± z_{α/2} √(σ1²/n1 + σ2²/n2)              Z = (X̄ − Ȳ)/√(σ1²/n1 + σ2²/n2) ∼ N(0, 1)
Normal       known, unequal    any                 (x̄ − ȳ) ± z_{α/2} √(σ1²/n1 + σ2²/n2)              Z = (X̄ − Ȳ)/√(σ1²/n1 + σ2²/n2) ∼ N(0, 1)
Any          unknown, unequal  n1 ≥ 30, n2 ≥ 30    (x̄ − ȳ) ± z_{α/2} √(s1²/n1 + s2²/n2)              Z = (X̄ − Ȳ)/√(S1²/n1 + S2²/n2) ∼ N(0, 1)
Normal       unknown, equal    n1 < 30, n2 < 30    (x̄ − ȳ) ± t_{n1+n2−2, α/2} · s_p √(1/n1 + 1/n2)   T = (X̄ − Ȳ)/(S_p √(1/n1 + 1/n2)) ∼ t_{n1+n2−2}
Any          unknown, equal    n1 ≥ 30, n2 ≥ 30    (x̄ − ȳ) ± z_{α/2} · s_p √(1/n1 + 1/n2)            Z = (X̄ − Ȳ)/(S_p √(1/n1 + 1/n2)) ∼ N(0, 1)
Here s_p² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2) is the pooled sample variance.
For dependent samples, consider the sample Di = Xi − Yi, and use the results in Section 5.
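For the fourth row of the table (normal populations, equal but unknown variances, small samples), a minimal sketch using hypothetical summary statistics:

from math import sqrt
from scipy import stats

# Hypothetical summaries for two independent samples (illustration only).
n1, xbar1, s1 = 12, 5.3, 1.1
n2, xbar2, s2 = 10, 4.6, 1.3
alpha = 0.05

# Pooled sample standard deviation, from the pooled variance defined above.
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
se = sp * sqrt(1 / n1 + 1 / n2)

t_crit = stats.t.ppf(1 - alpha / 2, df=n1 + n2 - 2)
ci = ((xbar1 - xbar2) - t_crit * se, (xbar1 - xbar2) + t_crit * se)

t_stat = (xbar1 - xbar2) / se                          # H0: mu1 = mu2
p_value = 2 * stats.t.sf(abs(t_stat), df=n1 + n2 - 2)  # two-sided p-value

print(ci, t_stat, p_value)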
7. Miscellaneous
(i) Roots of the quadratic equation
    For the equation ax² + bx + c = 0, x = (−b ± √(b² − 4ac)) / (2a).
(ii) Sum of the first n terms of a geometric series
    For r ≠ 1, a + ar + ar² + · · · + ar^{n−1} = a(1 − r^n)/(1 − r).
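Both identities are easy to verify numerically; the coefficients below are arbitrary illustrative choices.

from math import sqrt

# Quadratic formula: x^2 - 5x + 6 = 0 has roots 3 and 2.
a, b, c = 1, -5, 6
disc = b**2 - 4 * a * c
print((-b + sqrt(disc)) / (2 * a), (-b - sqrt(disc)) / (2 * a))  # 3.0 2.0

# Geometric series: direct sum vs closed form, with a = 2, r = 0.5, n = 6.
a0, r, n = 2, 0.5, 6
direct = sum(a0 * r**k for k in range(n))
closed = a0 * (1 - r**n) / (1 - r)
print(direct, closed)  # both 3.9375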