Statistics and Probability
SAMPLING, SAMPLING DISTRIBUTIONS & CLT
(PART 2)
1st Semester SY 2020-2021
Statistics and Probability: Central Limit Theorem
❖ Define the sampling distribution of the sample mean for a
normal population
❖ Illustrate the Central Limit Theorem
❖ Define the sampling distribution of the sample mean using
the Central Limit Theorem
❖ Solve problems involving sampling distributions of the
sample mean and Central Limit Theorem
If $\bar{X}$ is the mean of a random sample of size $n$ taken from a (large
or infinite) population with mean $\mu$ and variance $\sigma^2$, then the
sampling distribution of $\bar{X}$ is approximately normally distributed
with mean $E(\bar{X}) = \mu$ and variance $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$
when $n$ is sufficiently large.
The theorem simply states that if the sample size is
sufficiently large, we can use the normal distribution to
approximate the sampling distribution of $\bar{X}$.
Notation: $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
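The theorem can be checked numerically. The sketch below is only an illustration under assumed settings: the Exponential(1) population (mean 1, variance 1, strongly skewed), the sample size of 40, the number of repetitions, and the seed are all choices made here, not part of the lesson.

```python
import random

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an Exponential(1) population
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(reps)
    ]

# Assumed population: Exponential(1), so mu = 1 and sigma^2 = 1.
n = 40
means = simulate_sample_means(n, reps=20_000)

# Mean and variance of the simulated sampling distribution of X-bar.
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

print("mean of sample means:", round(mean_of_means, 3))     # theory: mu = 1
print("variance of sample means:", round(var_of_means, 4))  # theory: sigma^2 / n = 0.025
```

The two printed values should land close to $\mu$ and $\sigma^2/n$, even though the population itself is far from normal.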
REMARKS:
• The CLT does not require the sample to come from a normally
distributed population.
• If the distribution of the population is normal, then the sampling
distribution of $\bar{X}$ is exactly normal, no matter how small the
sample size.
• If the population is not normal, the normal approximation in the
theorem is generally good if $n \ge 30$, regardless of the shape
(symmetric or skewed) of the population. This is a rule of thumb, not a
strict cutoff.
• If $n < 30$, the approximation is good only if the population is not
too different from normal.
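The effect of sample size on the quality of the approximation can be sketched by measuring how skewed the sampling distribution of the sample mean still is. This is an illustration only: the Exponential(1) population, the sample sizes 5 and 30, the repetition count, and the seed are assumptions made here. For this population the skewness of the sample mean is $2/\sqrt{n}$ in theory, so it shrinks toward 0 (the skewness of a normal curve) as $n$ grows.

```python
import random

def skewness(values):
    """Skewness: third central moment divided by the standard deviation cubed."""
    m = sum(values) / len(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m3 = sum((v - m) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5

def sample_mean_skewness(n, reps=20_000, seed=1):
    """Skewness of simulated sample means from an Exponential(1) population."""
    rng = random.Random(seed)
    means = [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(reps)
    ]
    return skewness(means)

skew_small = sample_mean_skewness(n=5)    # theory: 2 / sqrt(5), about 0.89
skew_large = sample_mean_skewness(n=30)   # theory: 2 / sqrt(30), about 0.37
print(round(skew_small, 2), round(skew_large, 2))
```

A smaller skewness for $n = 30$ than for $n = 5$ is consistent with the remark that larger samples make the normal approximation better for a skewed population.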
Now, we have already established through the Central Limit
Theorem that $\bar{X}$ is approximately normally distributed with
mean $E(\bar{X}) = \mu$ and variance $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$
when $n$ is sufficiently large.
Since it is approximately normally distributed, it can also be
transformed into the standard normal distribution. With that, we
now use the limiting formula:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
Note: This differs slightly from the original transformation formula,
$Z = \dfrac{X - \mu}{\sigma}$. The denominator is now $\sigma/\sqrt{n}$, the
standard deviation of $\bar{X}$, instead of $\sigma$.
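The transformation can be written as a small helper. This is a minimal sketch: the function names and the numbers in the usage lines are hypothetical, chosen here only to show the arithmetic, and the CDF is computed from the error function rather than read from a z-table.

```python
import math

def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean: Z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal Z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical values, for illustration only.
z = z_for_sample_mean(x_bar=52.0, mu=50.0, sigma=10.0, n=25)
print(z)                                  # (52 - 50) / (10 / 5) = 1.0
print(round(standard_normal_cdf(z), 4))   # 0.8413
```

Dividing by $\sigma/\sqrt{n}$ rather than $\sigma$ is the only change from standardizing a single observation.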
A random sample of size 100 is taken from a large population with
mean $\mu = 1000$ and variance $\sigma^2 = 625$.
Approximate the probability of selecting a sample that satisfies:
a. $\bar{X} > 998$
b. $|\bar{X} - \mu| \le 1$
According to the CLT, $\bar{X}$ will be approximately normally distributed with
$E(\bar{X}) = \mu = 1000$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n} = \dfrac{625}{100} = 6.25$
Notation: $\bar{X} \sim N\left(\mu = 1000, \dfrac{\sigma^2}{n} = 6.25\right)$
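Given: $\mu = 1000$, $\sigma^2 = 625 \Rightarrow \sigma = 25$, $n = 100$
Approximate the probability of selecting a sample that satisfies:
a. $\bar{X} > 998$
Solution: Find $P(\bar{X} > 998)$.
$$
\begin{aligned}
P(\bar{X} > 998) &= 1 - P(\bar{X} < 998) \\
&= 1 - P\left(Z < \frac{998 - 1000}{25/\sqrt{100}}\right) \\
&= 1 - P(Z < -0.8) \\
&= 1 - 0.2119 \\
&= 0.7881 \text{ or } 78.81\%
\end{aligned}
$$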
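Given: $\mu = 1000$, $\sigma^2 = 625 \Rightarrow \sigma = 25$, $n = 100$
b. $|\bar{X} - \mu| \le 1$
Solution: Find $P(|\bar{X} - \mu| \le 1)$. Because the normal distribution is
continuous, $\le$ and $<$ give the same probability.
$$
\begin{aligned}
P(|\bar{X} - \mu| \le 1) &= P(-1 \le \bar{X} - \mu \le 1) \\
&= P(-1 < \bar{X} - \mu < 1) \\
&= P\left(-\frac{1}{\sigma/\sqrt{n}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{1}{\sigma/\sqrt{n}}\right) \\
&= P\left(-\frac{1}{25/\sqrt{100}} < Z < \frac{1}{25/\sqrt{100}}\right) \\
&= P(-0.4 < Z < 0.4) \\
&= P(Z < 0.4) - P(Z < -0.4) \\
&= 0.6554 - 0.3446 \\
&= 0.3108 \text{ or } 31.08\%
\end{aligned}
$$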
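Both parts of this example can be checked without a z-table. The sketch below is one way to do it, assuming Python's standard-library `statistics.NormalDist` is an acceptable stand-in for the table. It works directly with the distribution of $\bar{X}$, whose standard deviation is $\sigma/\sqrt{n} = 2.5$.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1000, 25, 100

# Sampling distribution of X-bar under the CLT: N(mu, sigma^2 / n).
x_bar_dist = NormalDist(mu, sigma / sqrt(n))

# Part a: P(X-bar > 998)
p_a = 1 - x_bar_dist.cdf(998)

# Part b: P(|X-bar - mu| <= 1) = P(999 <= X-bar <= 1001)
p_b = x_bar_dist.cdf(mu + 1) - x_bar_dist.cdf(mu - 1)

print(round(p_a, 4))   # 0.7881
print(round(p_b, 4))   # 0.3108
```

Both values agree with the table-based answers to four decimal places.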
Problem: An electrical firm manufactures electric light bulbs that
have a length of life which is normally distributed with mean and
standard deviation equal to 500 and 50 hours, respectively. Find the
probability that a random sample of 15 bulbs will have an average life
of less than 475 hours.
Solution:
Since $X$ (the length of life of an electric light bulb) is normally
distributed, the mean of any random sample from this population is also
normally distributed, regardless of the sample size.
Hence, $\bar{X}$ is normally distributed.
Notation: $\bar{X} \sim N\left(\mu = 500, \dfrac{\sigma^2}{n} = \dfrac{50^2}{15}\right)$
For this problem, we are asked about $P(\bar{X} < 475)$.
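$\bar{X} \sim N\left(\mu = 500, \dfrac{\sigma^2}{n} = \dfrac{50^2}{15}\right)$
Given: $\mu = 500$, $\sigma = 50$, $n = 15$
Solve for $P(\bar{X} < 475)$:
$$
\begin{aligned}
P(\bar{X} < 475) &= P\left(Z < \frac{475 - 500}{50/\sqrt{15}}\right) \\
&= P(Z < -1.94) \\
&= 0.0262 \text{ or } 2.62\%
\end{aligned}
$$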
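Because the population here is normal, the result should hold even for a sample as small as 15. The sketch below compares the normal-theory probability with a Monte Carlo estimate. The seed and the number of repetitions are arbitrary choices made for this illustration, and the simulated value will vary slightly with them.

```python
import random
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 50, 15

# Normal-theory answer: X-bar ~ N(mu, sigma^2 / n).
z = (475 - mu) / (sigma / sqrt(n))
p_theory = NormalDist().cdf(z)

# Monte Carlo check: simulate many samples of 15 bulb lifetimes and
# count how often the sample mean falls below 475 hours.
rng = random.Random(2020)   # arbitrary seed
reps = 20_000
hits = sum(
    sum(rng.gauss(mu, sigma) for _ in range(n)) / n < 475
    for _ in range(reps)
)
p_sim = hits / reps

print(round(z, 2))          # -1.94
print(round(p_theory, 3))   # 0.026
print(p_sim)                # close to p_theory; depends on the seed
```

The unrounded z-score gives a probability that differs from the table value 0.0262 only in the fourth decimal place, since the table requires rounding $z$ to $-1.94$.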
Problem: The time it takes students in a cooking school to learn to
prepare seafood gumbo is represented by a random variable with an
average of 3.2 hours and a standard deviation of 1.8 hours. Find the
probability that the average time it will take a class of 36 students to
learn to prepare seafood gumbo is more than 3.1 hours.
Solution:
It is not explicitly stated in the problem that the population from which
the sample was drawn is normally distributed. But according to the CLT,
since $n = 36 \ge 30$, we can use the normal distribution to approximate the
sampling distribution of $\bar{X}$. Hence, $\bar{X}$ is approximately
normally distributed.
Notation: $\bar{X} \sim N\left(\mu = 3.2, \dfrac{\sigma^2}{n} = \dfrac{1.8^2}{36}\right)$
For this problem, we are asked about $P(\bar{X} > 3.1)$.
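$\bar{X} \sim N\left(\mu = 3.2, \dfrac{\sigma^2}{n} = \dfrac{1.8^2}{36}\right)$
Given: $\mu = 3.2$, $\sigma = 1.8$, $n = 36$
Solve for $P(\bar{X} > 3.1)$:
$$
\begin{aligned}
P(\bar{X} > 3.1) &= 1 - P(\bar{X} < 3.1) \\
&= 1 - P\left(Z < \frac{3.1 - 3.2}{1.8/\sqrt{36}}\right) \\
&= 1 - P(Z < -0.33) \\
&= 1 - 0.3707 \\
&= 0.6293 \text{ or } 62.93\%
\end{aligned}
$$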
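The z-table forces $z = -0.333\ldots$ to be rounded to $-0.33$ before the lookup. The sketch below reproduces the table-based answer and also computes the probability from the unrounded z-score, so the size of the rounding effect is visible. The variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.2, 1.8, 36

# z-score of the sample mean 3.1 (about -0.3333 before rounding).
z = (3.1 - mu) / (sigma / sqrt(n))

std_normal = NormalDist()
p_table = 1 - std_normal.cdf(round(z, 2))   # z rounded to -0.33, as in a z-table
p_exact = 1 - std_normal.cdf(z)             # unrounded z

print(round(p_table, 4))   # 0.6293
print(round(p_exact, 4))   # slightly larger than the table-based value
```

The two answers differ by roughly one tenth of a percentage point, which is the usual order of error introduced by two-decimal z-tables.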
Problem: Suppose a random sample of size $n$ will be selected from a
large population with mean $\mu$ and standard deviation $\sigma = 6$. The
researchers want a 0.95 (95%) chance of selecting a sample whose sample
mean differs from $\mu$ by less than 1.5 in absolute value. What sample
size must they choose?
Solution:
The statement "there is a 0.95 (95%) chance of selecting a sample whose
sample mean differs from $\mu$ by less than 1.5 in absolute value" can be
written as
$P(|\bar{X} - \mu| < 1.5) = 0.95$
For this problem, we are asked what sample size they must choose.
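$\bar{X} \sim N\left(\mu = ?, \dfrac{\sigma^2}{n} = \dfrac{6^2}{?}\right)$
Given: $\mu = ?$, $\sigma = 6$, $n = ?$
$$
\begin{aligned}
P(|\bar{X} - \mu| < 1.5) &= 0.95 \\
P(-1.5 < \bar{X} - \mu < 1.5) &= 0.95 \\
P\left(-\frac{1.5}{\sigma/\sqrt{n}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{1.5}{\sigma/\sqrt{n}}\right) &= 0.95 \\
P\left(-\frac{1.5}{6/\sqrt{n}} < Z < \frac{1.5}{6/\sqrt{n}}\right) &= 0.95 \\
P\left(-\frac{1}{4/\sqrt{n}} < Z < \frac{1}{4/\sqrt{n}}\right) &= 0.95
\end{aligned}
$$
Simplifying,
$$P\left(-\frac{\sqrt{n}}{4} < Z < \frac{\sqrt{n}}{4}\right) = 0.95$$
$$P(z_1 < Z < z_2) = 0.95$$
We need to find the two z-scores ($z_1$ and $z_2$) that bound the middle
area equal to 95%.
From here, we can say that:
$z_1$ is the 2.5th percentile.
→ Area to the left equal to 0.025
$z_2$ is the 97.5th percentile.
→ Area to the left equal to 0.975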
Furthermore, we also know that since the normal curve is symmetric,
$z_1$ and $z_2$ have the SAME MAGNITUDE but OPPOSITE SIGNS.
From the z-table,
• An area to the left equal to 0.025 corresponds to a z-score of $z_1 = -1.96$
• An area to the left equal to 0.975 corresponds to a z-score of $z_2 = 1.96$
Recall that
$$P\left(-\frac{\sqrt{n}}{4} < Z < \frac{\sqrt{n}}{4}\right) = 0.95$$
$$P(z_1 < Z < z_2) = 0.95$$
Substituting,
$$P(-1.96 < Z < 1.96) = 0.95$$
Therefore, we now solve for the SAMPLE SIZE, $n$:
$$\frac{\sqrt{n}}{4} = 1.96 \;\rightarrow\; \sqrt{n} = 7.84 \;\rightarrow\; n = 61.4656 \approx 62$$
Answer: n = 62 (always round up for sample size)
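The same steps generalize to any $\sigma$, margin, and probability. The sketch below is a hedged illustration: the function name `required_sample_size` and its parameters are made up here, and the z-score comes from `statistics.NormalDist.inv_cdf` instead of a z-table, so it uses about 1.95996 rather than the rounded 1.96.

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n such that P(|X-bar - mu| < margin) >= confidence,
    using the normal approximation for X-bar."""
    # Upper z-score bounding the middle `confidence` area (0.975 quantile for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Solve margin / (sigma / sqrt(n)) = z for n, then round up.
    return ceil((z * sigma / margin) ** 2)

def coverage(sigma, margin, n):
    """P(|X-bar - mu| < margin) for a given n under the normal approximation."""
    z = margin / (sigma / sqrt(n))
    return NormalDist().cdf(z) - NormalDist().cdf(-z)

n = required_sample_size(sigma=6, margin=1.5, confidence=0.95)
print(n)   # 62
```

Rounding up matters: with the helper above, `coverage(6, 1.5, 61)` falls just below 0.95, while `coverage(6, 1.5, 62)` is just above it.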