Chapter 3: Discrete Distributions
Probability and Statistics for Science and Engineering with Examples in R
Hongshik Ahn
(1) Random Variable (r.v.)
Definition: any function that assigns a numerical value to each possible
outcome of an experiment
Example 3.1: X: # heads obtained in three tosses of a fair coin

Outcome   X
HHH       3
HHT       2
HTH       2
HTT       1
THH       2
THT       1
TTH       1
TTT       0

Value of X   Event              Probability
X = 0        {TTT}              P(X = 0) = 1/8
X = 1        {HTT, THT, TTH}    P(X = 1) = 3/8
X = 2        {HHT, HTH, THH}    P(X = 2) = 3/8
X = 3        {HHH}              P(X = 3) = 1/8
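The counting in Example 3.1 can be reproduced in R by listing the eight equally likely outcomes and tallying the number of heads in each. This is only a sketch; the object names (`tosses`, `heads`, `pmf_heads`) are chosen here for illustration.

```r
# Enumerate the 8 equally likely outcomes of three tosses (H or T on each toss).
tosses <- expand.grid(t1 = c("H", "T"), t2 = c("H", "T"), t3 = c("H", "T"),
                      stringsAsFactors = FALSE)

# X = number of heads in each outcome.
heads <- rowSums(tosses == "H")

# Each outcome has probability 1/8, so P(X = x) is the share of outcomes with x heads.
pmf_heads <- table(heads) / nrow(tosses)
print(pmf_heads)
```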
Types of Random Variables
• Discrete Random Variables have a countable
number of possible values.
• Continuous Random Variables can take on any
value in an interval and cannot be enumerated.
e.g.) Number of heads obtained in three tosses of a coin: discrete
Amount of precipitation produced by a storm: continuous
(2) Probability Distribution
1. Probability distribution: $f(x) = P(X = x)$
2. $f(x) \ge 0$ for all $x$
3. $\sum_{\text{all } x} f(x) = 1$
Example 3.2: Which of the following is a probability distribution?
• for
, so not a probability distribution.
• for
, but
so not a probability distribution.
Example 3.3: Foreign made cars: 30%. Four cars are selected at random.
X: number of foreign made cars. F: foreign made, D: domestic

X=0     X=1     X=2     X=3     X=4
DDDD    DDDF    DDFF    DFFF    FFFF
        DDFD    DFDF    FDFF
        DFDD    DFFD    FFDF
        FDDD    FDDF    FFFD
                FDFD
                FFDD
Probability Model: An assumed form of the probability distribution that
describes the chance behavior for a r.v. X
Parameters: the relevant population quantities by which probabilities are
expressed
Example 3.4: Bernoulli Trial
(a) Each trial yields one of two outcomes: Success (S) or Failure (F)
(b) $P(S) = p$ is the same for each trial ($p$ is the parameter).
(c) Each trial is independent.
Cumulative Distribution Function (CDF)
• CDF of a random variable: $F(x) = P(X \le x)$
• For a discrete random variable:
1. CDF has a “jump” at each possible value equal to the
probability of that value.
2. The graph of the cdf will be a step function.
3. The graph increases from a minimum of 0 to a
maximum of 1.
Example 3.3 (continued)
x    f(x)      F(x)
0    0.2401    0.2401
1    0.4116    0.6517
2    0.2646    0.9163
3    0.0756    0.9919
4    0.0081    1
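The CDF column for Example 3.3 is the running total of the probabilities, which can be checked in R. A minimal sketch: the probabilities are copied from the example, and the names `fx`, `Fx` and `cdf_fun` are illustrative.

```r
# Probability distribution of X from Example 3.3 (values copied from the example).
x  <- 0:4
fx <- c(0.2401, 0.4116, 0.2646, 0.0756, 0.0081)

# The CDF at each possible value is the running total of the probabilities.
Fx <- cumsum(fx)
print(data.frame(x = x, f = fx, F = Fx))

# Step-function form of the CDF: 0 below x = 0, with a jump of f(x) at each x.
cdf_fun <- stepfun(x, c(0, Fx))
cdf_fun(2.5)   # same as F(2), since there is no jump between 2 and 3
```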
Example 3.5
x    f(x)    F(x)
0    0.1     0.1
1    0.2     0.3
2    0.3     0.6
3    0.2     0.8
4    0.2     1
(3) Mean and Variance of Discrete R.V.s
Definition: Mean (Expected value) of a discrete r.v. $X$:
$E(X) = \mu = \sum_{\text{all } x} x f(x)$
Example 3.6: What is the expected number of heads in three tosses of a fair coin?
X: # heads in three tosses of a fair coin

x        f(x)    x f(x)
0        1/8     0
1        3/8     3/8
2        3/8     6/8
3        1/8     3/8
Total    1       12/8 = 1.5

$E(X) = 1.5$
Example 3.7: Mean of a Bernoulli r.v.:
$f(x) = \begin{cases} p & \text{if } x = 1 \\ 1 - p & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases}$
$\mu = E(X) = \sum_{x} x f(x) = 0 \cdot f(0) + 1 \cdot f(1) = p$
Expected value of a function $h(X)$:
$E(h(X)) = \sum_{\text{all } x} h(x) f(x)$
Example 3.9: In flipping 3 balanced coins find $E(X^3 - X)$.

x        x^3 - x    f(x)    (x^3 - x) f(x)
0        0          1/8     0
1        0          3/8     0
2        6          3/8     9/4
3        24         1/8     3
Total               1       21/4

$E(X^3 - X) = 0 + 0 + \frac{9}{4} + 3 = \frac{21}{4} = 5.25$
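The table in Example 3.9 can be checked numerically. The sketch below assumes $h(x) = x^3 - x$, which is what the tabulated values 0, 0, 6, 24 correspond to; the function name `h` and the variable names are illustrative.

```r
# Distribution of X = number of heads in three flips of a balanced coin.
x  <- 0:3
fx <- c(1, 3, 3, 1) / 8

# Function of X whose expectation is wanted (assumed form: h(x) = x^3 - x).
h <- function(x) x^3 - x

# E(h(X)) = sum over all x of h(x) * f(x).
e_h <- sum(h(x) * fx)
print(e_h)   # 5.25
```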
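Variance of a discrete r.v. $X$:
$Var(X) = \sigma^2 = E[(X - \mu)^2] = \sum_{\text{all } x} (x - \mu)^2 f(x)$
sd: $\sigma = \sqrt{Var(X)}$
Alternatively, $Var(X) = E(X^2) - \mu^2 = \sum_{\text{all } x} x^2 f(x) - \left[ \sum_{\text{all } x} x f(x) \right]^2$
Proof:
$Var(X) = E(X - \mu)^2 = \sum_{\text{all } x} (x - \mu)^2 f(x) = \sum_{\text{all } x} (x^2 - 2\mu x + \mu^2) f(x)$
$= \sum_{\text{all } x} x^2 f(x) - 2\mu \sum_{\text{all } x} x f(x) + \mu^2 \sum_{\text{all } x} f(x)$
$= \sum_{\text{all } x} x^2 f(x) - 2\mu^2 + \mu^2 = \sum_{\text{all } x} x^2 f(x) - \mu^2 = E(X^2) - \mu^2$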
Example 3.10: Mean and variance

x        f(x)    x f(x)    (x - μ)^2    (x - μ)^2 f(x)
1        0.3     0.3       4            1.2
2        0.4     0.8       1            0.4
5        0.2     1.0       4            0.8
9        0.1     0.9       36           3.6
Total    1       3                      6

$\mu = 3$, $Var(X) = 6$
Example 3.11: Alternative way to calculate the variance

x        x^2    f(x)    x f(x)    x^2 f(x)
1        1      0.3     0.3       0.3
2        4      0.4     0.8       1.6
5        25     0.2     1.0       5.0
9        81     0.1     0.9       8.1
Total           1       3         15

$Var(X) = E(X^2) - \mu^2 = 15 - 3^2 = 6$
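Both routes to the variance in Examples 3.10 and 3.11 can be compared directly in R. A small sketch using the distribution from those examples; the variable names are illustrative.

```r
# Distribution used in Examples 3.10 and 3.11.
x  <- c(1, 2, 5, 9)
fx <- c(0.3, 0.4, 0.2, 0.1)

mu <- sum(x * fx)                      # mean: 3

# Definition: Var(X) = sum of (x - mu)^2 f(x).
var_def <- sum((x - mu)^2 * fx)

# Shortcut: Var(X) = E(X^2) - mu^2.
var_alt <- sum(x^2 * fx) - mu^2

print(c(mean = mu, var_def = var_def, var_alt = var_alt, sd = sqrt(var_def)))
```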
(4) Binomial Distribution
The binomial distribution:
$n$: a fixed number of independent Bernoulli trials
$p$: the probability of success in each trial
$X$: # successes in $n$ trials
binomial random variable: $X \sim \text{Bin}(n, p)$
• Examples in which $X$ follows a binomial distribution:
Flip a coin $n$ times. Let $X$ = number of heads.
Provide a medical treatment to $n$ subjects. Record whether
each subject survives. Let $X$ = number of survivors.
Ask $n$ survey respondents: “Do you believe in capital
punishment?” Let $X$ = number who answer “Yes”.
Example 3.3 (continued)
30% of the automobiles in a certain city are foreign made.
Four cars are selected at random.
X: # cars sampled that are foreign made
F: foreign made, D: domestic

                         X=0        X=1             X=2               X=3             X=4
Outcomes                 DDDD       DDDF            DDFF              DFFF            FFFF
                                    DDFD            DFDF              FDFF
                                    DFDD            DFFD              FFDF
                                    FDDD            FDDF              FFFD
                                                    FDFD
                                                    FFDD
Prob. of each outcome    (0.7)^4    (0.3)(0.7)^3    (0.3)^2(0.7)^2    (0.3)^3(0.7)    (0.3)^4
# outcomes               1          4               6                 4               1
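• If $X$ is binomially distributed, then
$f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, where $x = 0, 1, 2, \ldots, n$
• $f(x)$ is a probability distribution, because
1) $f(x) \ge 0$ and 2) $\sum_{x=0}^{n} f(x) = [p + (1-p)]^n = 1$
• $n = 1$: a Bernoulli trial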
• Binomial tables: Table A.2
If then
Finding binomial probabilities using R:
For $X \sim \text{Bin}(n, p)$,
• Probability distribution, $P(X = x)$:
>dbinom(x, n, p)
• CDF, $P(X \le x)$:
>pbinom(x, n, p)
Table A.2 Binomial Distribution Table
Example 3.13: For $X \sim \text{Bin}(12, 0.3)$,
(1) $P(X \le 7)$
Using R, it can be obtained as
>pbinom(7, 12, 0.3)
(2) $P(X = 7)$
Using R, it can be obtained as
>dbinom(7, 12, 0.3)
(3) $P(X \ge 7) = 1 - P(X \le 6)$
Using R, it can be obtained as
>1-pbinom(6, 12, 0.3)
(4) $P(4 \le X \le 7) = P(X \le 7) - P(X \le 3)$
Using R, it can be obtained as
>pbinom(7, 12, 0.3) - pbinom(3, 12, 0.3)
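The four R calls in Example 3.13 all use 12 trials with success probability 0.3. The sketch below recomputes each one and checks it against a direct sum of `dbinom` terms, to show which values of X each call covers. The variable names are illustrative.

```r
n <- 12
p <- 0.3

p_le7  <- pbinom(7, n, p)                     # P(X <= 7)
p_eq7  <- dbinom(7, n, p)                     # P(X = 7)
p_ge7  <- 1 - pbinom(6, n, p)                 # P(X >= 7): complement of P(X <= 6)
p_4to7 <- pbinom(7, n, p) - pbinom(3, n, p)   # P(4 <= X <= 7)

# The same probabilities, written as sums of individual terms.
check <- c(
  le7  = sum(dbinom(0:7, n, p)),
  eq7  = choose(n, 7) * p^7 * (1 - p)^(n - 7),
  ge7  = sum(dbinom(7:n, n, p)),
  x4to7 = sum(dbinom(4:7, n, p))
)
print(rbind(direct = c(p_le7, p_eq7, p_ge7, p_4to7), summed = check))
```

Note that the subtraction in part (4) must use the ASCII minus sign; a typographic dash copied from a slide will not parse in R.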
Example 3.14: If the probability is 0.1 that a certain device fails a comprehensive
safety test, what are the probabilities that among 15 such devices,
(a) at most two will fail?
(b) at least three will fail?
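One way to answer Example 3.14 in R, assuming the 15 devices fail independently so that the number of failures is binomial with n = 15 and p = 0.1. The variable names are illustrative.

```r
n_devices <- 15
p_fail    <- 0.1

# (a) At most two failures: P(X <= 2).
p_at_most_2 <- pbinom(2, n_devices, p_fail)

# (b) At least three failures: P(X >= 3) = 1 - P(X <= 2).
p_at_least_3 <- 1 - p_at_most_2

print(round(c(at_most_2 = p_at_most_2, at_least_3 = p_at_least_3), 4))
```

The two answers are complements, so they add to 1; (a) comes out at about 0.816.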
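Example 3.15: Find the mean and variance of the probability distribution of
the number of heads obtained in three flips of a balanced coin.
$X \sim \text{Bin}(3, 1/2)$, so $E(X) = np = 3 \cdot \frac{1}{2} = 1.5$ and $Var(X) = np(1-p) = 3 \cdot \frac{1}{2} \cdot \frac{1}{2} = 0.75$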
Binomial distribution
$p < 0.5$: skewed to the right
$p > 0.5$: skewed to the left
$p = 0.5$: symmetric
(5) Hypergeometric Distribution
(Sampling w/o replacement)
1) Finite population with $N$ individuals (binomial: infinite pop.)
2) Each individual is a success (S) or a failure (F). There are $a$ successes in the population.
3) Sample size: $n$. Each subset of size $n$ is equally likely to be chosen.
$P(X = x) = \dfrac{\binom{a}{x} \binom{N-a}{n-x}}{\binom{N}{n}}$
$\max\{0, n - N + a\} \le x \le \min\{n, a\}$ ($\because x \le a$ and $n - x \le N - a$)
Using R,
• Probability distribution:
>dhyper(x, a, N-a, n)
• cdf:
>phyper(x, a, N-a, n)
Example 3.17: A shipment of 25 CDs contains 5 that are defective. If 10
of them are randomly chosen without replacement, what is the
probability that 2 of the 10 will be defective?
$P(X = 2) = \binom{5}{2} \binom{20}{8} / \binom{25}{10}$
Using R,
>dhyper(2, 5, 20, 10)
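The argument order of `dhyper` (successes in the population, failures in the population, sample size) is easy to mix up, so it can help to compare it against the counting formula written with `choose`. A sketch for Example 3.17; the variable names are illustrative.

```r
N <- 25   # population size
a <- 5    # defective CDs (successes) in the population
n <- 10   # sample size
x <- 2    # defective CDs wanted in the sample

# Counting formula: choose(a, x) * choose(N - a, n - x) / choose(N, n).
p_formula <- choose(a, x) * choose(N - a, n - x) / choose(N, n)

# Built-in: dhyper(x, successes, failures, sample size).
p_dhyper <- dhyper(x, a, N - a, n)

print(c(formula = p_formula, dhyper = p_dhyper))   # both about 0.385
```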
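Hypergeometric Distribution
Let $p = a/N$. Then
$E(X) = np$
$Var(X) = np(1-p) \left( \dfrac{N-n}{N-1} \right)$
$\dfrac{N-n}{N-1}$: finite population correction factor. If $N$ is much larger than $n$, then $\dfrac{N-n}{N-1} \approx 1$.
• Binomial approximation to the hypergeometric: reasonable when $n$ is small relative to $N$
and $p$ is not too close to either 0 or 1.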
Example 3.18: As in Example 3.17, but for a lot of 100 CDs of which 20 are
defective, find the probability that among a randomly selected
sample of 10 CDs, 2 are defective, by using
(a) The formula for the hypergeometric distribution:
>dhyper(2, 20, 80, 10)
(b) The binomial approximation to the hypergeometric:
>dbinom(2, 10, 0.2)
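To see how close the binomial approximation in Example 3.18 is, the two distributions can be tabulated side by side for every possible count, not only x = 2. This is a sketch under the example's numbers (N = 100, a = 20, n = 10); the names `exact`, `binom_approx` and `comparison` are illustrative.

```r
N <- 100   # lot size
a <- 20    # defective CDs in the lot
n <- 10    # sample size
x <- 0:n

exact        <- dhyper(x, a, N - a, n)   # sampling without replacement
binom_approx <- dbinom(x, n, a / N)      # treats draws as independent with p = a/N

comparison <- data.frame(x = x, exact = exact, binom_approx = binom_approx,
                         abs_error = abs(exact - binom_approx))
print(round(comparison, 4))

# Both have mean n*p; the hypergeometric variance is smaller by (N - n)/(N - 1).
p <- a / N
var_exact <- sum(x^2 * exact) - sum(x * exact)^2
print(c(hyper_var = var_exact, formula = n * p * (1 - p) * (N - n) / (N - 1)))
```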
(6) Poisson Distribution
•Properties:
1. The probability that an event occurs in the interval is
proportional to the length of the interval.
2. Two events cannot occur at exactly the same instant.
3. Events occur independently.
• The probability of $x$ events occurring in a time period for a
Poisson random variable with parameter $\lambda$ is
$P(X = x) = f(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$ ($e = 2.71828\ldots$)
$E(X) = Var(X) = \lambda$
Poisson Distribution
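Maclaurin Series:
$e^{\lambda} = 1 + \lambda + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \cdots = \sum_{x=0}^{\infty} \dfrac{\lambda^x}{x!}$
$\sum_{x=0}^{\infty} f(x) = \sum_{x=0}^{\infty} \dfrac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda} e^{\lambda} = 1$
Therefore, $f(x)$ is a probability distribution.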
• Use of Poisson distributions, examples:
• number of occurrences in a given time
• distribution of bomb hits in an area
• distribution of fish in a lake
• Poisson tables: Table A.3
Poisson Distribution
• The Poisson distribution is highly skewed for small values of λ.
• As λ increases, the distribution becomes more symmetric.
Table A.3 Poisson Distribution Table
Finding Poisson Probabilities Using R
For $X \sim \text{Poisson}(\lambda)$,
• Probability distribution, $P(X = x)$:
>dpois(x, lambda)
• CDF, $P(X \le x)$:
>ppois(x, lambda)
Example 3.19: A 911 operator handles 4 calls every 3 hours
on average.
(a) What is the probability of no calls in the next hour?
Let $X$ be the number of calls in an hour. Then $X \sim \text{Poisson}(4/3)$ and $P(X = 0) = e^{-4/3}$.
>dpois(0,4/3)
(b) Find the probability of at most two calls in the next hour, $P(X \le 2)$.
>ppois(2, 4/3)
Example 3.20:
On average, 12 cars pass a toll booth per minute during rush hours, i.e. 0.2 cars per second.
(a) Probability that one car passes the booth in 3 seconds ($\lambda = 0.2 \times 3 = 0.6$):
>dpois(1, 0.6)
(b) Probability that at least two cars pass in 5 seconds ($\lambda = 0.2 \times 5 = 1$):
>1-ppois(1, 1)
(c) Probability that at most one car passes in 10 seconds ($\lambda = 0.2 \times 10 = 2$):
>ppois(1, 2)
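The three parameters in Example 3.20 all come from rescaling the same rate to the length of the interval. A sketch of that step, using a small helper `lambda_for()` that is introduced here for illustration and is not part of R:

```r
# 12 cars per minute is 12/60 = 0.2 cars per second.
rate_per_second <- 12 / 60

# Hypothetical helper: Poisson parameter for an interval of the given length.
lambda_for <- function(seconds) rate_per_second * seconds

p_a <- dpois(1, lambda_for(3))        # exactly one car in 3 seconds, lambda = 0.6
p_b <- 1 - ppois(1, lambda_for(5))    # at least two cars in 5 seconds, lambda = 1
p_c <- ppois(1, lambda_for(10))       # at most one car in 10 seconds, lambda = 2

print(round(c(a = p_a, b = p_b, c = p_c), 4))
```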
Poisson approximation to the binomial
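Let $X \sim \text{Bin}(n, p)$.
Let $n \to \infty$ and $p \to 0$ such that $\lambda = np$ remains constant ($\lambda$ is a moderate number).
Then $\binom{n}{x} p^x (1-p)^{n-x} \to \dfrac{\lambda^x e^{-\lambda}}{x!}$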
such that
acceptable
excellent approximation
Example 3.21: A publisher of mystery novels tries to keep its books
free of typos. The probability of any given page containing at least one
typo is 0.003 and errors are independent from page to page. What is the
approximate probability that a 500-page book has
(a) exactly 1 page with typos?
If $X$ is the number of pages with typos, then $X \sim \text{Bin}(500, 0.003)$, approximated by a Poisson distribution with $\lambda = np = 500 \times 0.003 = 1.5$.
>dpois(1, 1.5)
(b) at most 2 pages with typos?
>ppois(2, 1.5)
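Since R can also evaluate the binomial probabilities in Example 3.21 exactly, the quality of the Poisson approximation can be checked directly. A sketch assuming the example's values (500 pages, probability 0.003 per page); the variable names are illustrative.

```r
n_pages <- 500
p_typo  <- 0.003
lambda  <- n_pages * p_typo            # 1.5

# (a) Exactly one page with typos.
exact_a  <- dbinom(1, n_pages, p_typo)
approx_a <- dpois(1, lambda)

# (b) At most two pages with typos.
exact_b  <- pbinom(2, n_pages, p_typo)
approx_b <- ppois(2, lambda)

print(round(rbind(a = c(binomial = exact_a, poisson = approx_a),
                  b = c(binomial = exact_b, poisson = approx_b)), 4))
```

With n this large and p this small, the two columns should agree to about three decimal places.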
(7) Geometric Distribution
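1. Each trial: Success (S) or Failure (F)
2. Independent trials
3. $P(S) = p$ for each trial
4. Continues until the first success is observed
• Probability of getting the first success on the $x$-th trial: $f(x) = p(1-p)^{x-1}$, $x = 1, 2, \ldots$
• Using R (R's geometric functions count the failures before the first success, hence x-1),
probability distribution, $P(X = x)$:
>dgeom(x-1, p)
CDF, $P(X \le x)$:
>pgeom(x-1, p)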
Example 3.23: A fair die is tossed until a certain number
appears. What is the probability that the first six appears at
the fifth toss?
Using R,
>dgeom(4, 1/6)
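The shift by one in `dgeom(4, 1/6)` reflects that R parameterizes the geometric distribution by the number of failures before the first success. The sketch below checks the call against the direct calculation for Example 3.23 (four non-sixes followed by a six); the variable names are illustrative.

```r
p_six  <- 1 / 6   # probability of a six on any toss
x_toss <- 5       # toss on which the first six appears

# Direct calculation: (x - 1) failures, then one success.
p_formula <- (1 - p_six)^(x_toss - 1) * p_six

# R counts failures before the first success, so pass x - 1 = 4.
p_dgeom <- dgeom(x_toss - 1, p_six)

print(c(formula = p_formula, dgeom = p_dgeom))   # 625/7776, about 0.0804
```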
(8) Chebyshev’s Inequality
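If a probability distribution has mean $\mu$ and sd $\sigma$, the
probability of deviation from $\mu$ by at least $k\sigma$ is at
most $1/k^2$:
$P(|X - \mu| \ge k\sigma) \le \dfrac{1}{k^2}$
Equivalently,
$P(|X - \mu| < k\sigma) \ge 1 - \dfrac{1}{k^2}$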
Example 3.24: The bearings made by a certain process have a mean radius of 18
mm with a standard deviation of 0.025 mm. With what minimum probability
can we assert that the radius of a bearing is between 17.9 mm and 18.1 mm?
Here $k\sigma = 0.1$ with $\sigma = 0.025$, so $k = 4$ and the probability is at least $1 - 1/4^2 = 15/16$.
Example 3.25: What can you say about the probability that a random
variable falls within two standard deviations of its mean?
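Both Chebyshev examples reduce to evaluating $1 - 1/k^2$ for the right $k$. A short sketch, using a helper `chebyshev_lower_bound()` that is defined here for illustration and is not a built-in R function:

```r
# Hypothetical helper: Chebyshev lower bound on P(|X - mu| < k * sigma).
chebyshev_lower_bound <- function(k) {
  stopifnot(k > 0)
  max(0, 1 - 1 / k^2)   # for k <= 1 the bound carries no information
}

# Example 3.24: the interval 17.9 to 18.1 is mu +/- 0.1, and sigma = 0.025.
k_bearing     <- (18.1 - 18) / 0.025          # k = 4
bound_bearing <- chebyshev_lower_bound(k_bearing)

# Example 3.25: within two standard deviations of the mean, k = 2.
bound_two_sd <- chebyshev_lower_bound(2)

print(c(bearing = bound_bearing, two_sd = bound_two_sd))   # 15/16 and 3/4
```

The bound holds for any distribution with a finite mean and variance, which is why it is often far from tight for a specific distribution.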
(9) Multinomial Distribution
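Each trial results in any one of $k$ possible outcomes, with probabilities $p_1, \ldots, p_k$.
$X_i$: #trials resulting in outcome $i$
$f(x_1, \ldots, x_k) = \dfrac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}$,
$x_1 + \cdots + x_k = n$, $p_1 + \cdots + p_k = 1$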
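Example 3.26: Toss a drum 10 times.
Let $X_1$ = #heads, $X_2$ = #sides, $X_3$ = #tails.
Let $n = 10$ and let $p_1, p_2, p_3$ be the probabilities of heads, sides, and tails. Then
$P(X_1 = 2, X_2 = 5, X_3 = 3) = \dfrac{10!}{2! \, 5! \, 3!} \, p_1^2 \, p_2^5 \, p_3^3$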
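In R, multinomial probabilities are available through `dmultinom`. The sketch below evaluates the probability of 2 heads, 5 sides and 3 tails in 10 tosses. The outcome probabilities for the drum are not recoverable from the text here, so the values in `p_assumed` are purely hypothetical placeholders and should be replaced with the ones intended in the example.

```r
counts <- c(heads = 2, sides = 5, tails = 3)   # n = 10 tosses in total

# ASSUMPTION: placeholder probabilities for heads, sides, tails (must sum to 1).
p_assumed <- c(0.25, 0.50, 0.25)

# Built-in multinomial probability; size defaults to sum(counts).
p_dmultinom <- dmultinom(counts, prob = p_assumed)

# The same value from n! / (x1! x2! x3!) * p1^x1 * p2^x2 * p3^x3.
p_formula <- factorial(sum(counts)) / prod(factorial(counts)) *
  prod(p_assumed^counts)

print(c(dmultinom = p_dmultinom, formula = p_formula))
```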