2.11. Sampling and Central Limit Theorem

Sampling techniques and central limit

Probability & Its Applications

Sampling Distributions

Random Sample, Central Limit Theorem


Review
I. What was in the last lecture?
Normal Probability Distribution.

II. What's in this lecture?


Random Sample, Central Limit Theorem
Introduction
• Parameters are numerical descriptive
measures for populations.
– Two parameters for a normal distribution:
mean $\mu$ and standard deviation $\sigma$.
– One parameter for a binomial distribution:
the success probability of each trial p.
• Often the values of parameters that specify the
exact form of a distribution are unknown.
• You must rely on the sample to learn about
these parameters.
Sampling
Examples:
• A pollster is sure that the responses to his
“agree/disagree” question will follow a binomial
distribution, but p, the proportion of those who
“agree” in the population, is unknown.
• An agronomist believes that the yield per acre of a
variety of wheat is approximately normally
distributed, but the mean $\mu$ and the standard
deviation $\sigma$ of the yields are unknown.
✓ If you want the sample to provide reliable
information about the population, you must select
your sample in a certain way!
Simple Random Sampling
• The sampling plan or experimental
design determines the amount of
information you can extract, and often
allows you to measure the reliability of
your inference.
• Simple random sampling is a method
of sampling that allows each possible
sample of size n an equal probability of
being selected.
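
As a minimal sketch of simple random sampling, assuming a hypothetical population of ten numbered units (not taken from the slides), Python's standard `random.sample` draws n distinct units so that every possible sample of size n is equally likely:

```python
import random
from itertools import combinations

# Hypothetical population of 10 numbered units (an assumption for illustration).
population = list(range(1, 11))
n = 3

# Every possible sample of size n, ignoring order and without replacement.
all_samples = list(combinations(population, n))
print(len(all_samples), "possible samples, each with probability 1 /", len(all_samples))

# random.sample selects n distinct units; every subset of size n is equally likely.
rng = random.Random(0)  # arbitrary seed, used only to make the run repeatable
sample = rng.sample(population, n)
print("selected sample:", sample)
```

The seed is only there for reproducibility; in practice the draw would be left unseeded.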
Sampling Distributions
• Any numerical descriptive measure calculated from
the sample is called a statistic.
• Statistics vary from sample to sample and hence are
random variables. This variability is called sampling
variability.
• The probability distributions for statistics are called
sampling distributions.
• In repeated sampling, they tell us what values of the
statistics can occur and how often each value occurs.
Example
Population: 3, 5, 2, 1
Draw samples of size n = 3 without replacement

Possible samples    $\bar{x}$
3, 5, 2             10/3 = 3.33
3, 5, 1             9/3 = 3
3, 2, 1             6/3 = 2
5, 2, 1             8/3 = 2.67

Each value of $\bar{x}$ is equally likely, with probability $p(\bar{x}) = 1/4$.
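
The enumeration above can be reproduced with a short sketch. It assumes nothing beyond the population 3, 5, 2, 1 and n = 3 from the slide; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [3, 5, 2, 1]
n = 3

# All samples of size n drawn without replacement (order ignored).
samples = list(combinations(population, n))
means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution of the sample mean: each sample is equally likely.
dist = {m: Fraction(c, len(samples)) for m, c in Counter(means).items()}

for s, m in zip(samples, means):
    print(s, "mean =", round(float(m), 2), "p =", dist[m])
```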
Example
Consider a population that consists of the numbers 1, 2,
3, 4 and 5, generated in such a manner that the probability of
each of those values is 0.2 no matter what the previous
selections were. This population could be described as
the outcome of a spinner with five equal sectors, with
the distribution given below.

x p(x)
1 0.2
2 0.2
3 0.2
4 0.2
5 0.2
Example
If the sampling distribution for the means of samples
of size two is analyzed, it looks like

Sample   $\bar{x}$        Sample   $\bar{x}$
1, 1     1                3, 4     3.5
1, 2     1.5              3, 5     4
1, 3     2                4, 1     2.5
1, 4     2.5              4, 2     3
1, 5     3                4, 3     3.5
2, 1     1.5              4, 4     4
2, 2     2                4, 5     4.5
2, 3     2.5              5, 1     3
2, 4     3                5, 2     3.5
2, 5     3.5              5, 3     4
3, 1     2                5, 4     4.5
3, 2     2.5              5, 5     5
3, 3     3

$\bar{x}$    frequency   $p(\bar{x})$
1      1           0.04
1.5    2           0.08
2      3           0.12
2.5    4           0.16
3      5           0.20
3.5    4           0.16
4      3           0.12
4.5    2           0.08
5      1           0.04
Total  25
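
A sketch of how this table could be generated, assuming ordered samples of size two drawn independently (with replacement) from the values 1 to 5, as in the spinner description:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4, 5]
n = 2

# All 25 ordered samples; each has probability 0.2 * 0.2 = 0.04.
samples = list(product(values, repeat=n))

# Frequency of each sample mean across the 25 equally likely samples.
freq = Counter(Fraction(sum(s), n) for s in samples)

print("mean  frequency  p(mean)")
for mean, count in sorted(freq.items()):
    print(f"{float(mean):<5} {count:<10} {count / len(samples):.2f}")
```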
Example
The original distribution and the sampling distribution
of means of samples with n=2 are given below.

[Figure: two histograms over the values 1 to 5. Left: original distribution. Right: sampling distribution for n = 2.]
Example
Sampling distributions for n=3 and n=4 were
calculated and are illustrated below. The shape is
getting closer and closer to the normal distribution.

[Figure: four histograms over the values 1 to 5: original distribution, and sampling distributions for n = 2, n = 3, and n = 4.]
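
The distributions for larger n can be computed exactly rather than by listing every sample. The following is one possible sketch, assuming independent draws from the five-value population; the helper names `sum_distribution` and `mean_distribution` are hypothetical.

```python
from fractions import Fraction

def sum_distribution(pmf, n):
    """Exact distribution of the sum of n independent draws from pmf (value -> probability)."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for total, p in dist.items():
            for value, q in pmf.items():
                new[total + value] = new.get(total + value, 0) + p * q
        dist = new
    return dist

def mean_distribution(pmf, n):
    """Exact sampling distribution of the mean of n independent draws."""
    return {Fraction(total, n): p for total, p in sum_distribution(pmf, n).items()}

spinner = {v: Fraction(1, 5) for v in range(1, 6)}

for n in (2, 3, 4):
    dist = mean_distribution(spinner, n)
    peak = max(dist, key=dist.get)
    print(f"n = {n}: {len(dist)} possible means, most likely mean = {float(peak)}, "
          f"p = {float(dist[peak]):.3f}")
```

Plotting each `dist` as a bar chart would give histograms of the kind the slide describes.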
Sampling Distribution of $\bar{x}$
If a random sample of n measurements is selected from a
population with mean $\mu$ and standard deviation $\sigma$, the
sampling distribution of the sample mean $\bar{x}$ will have a mean
$\mu_{\bar{x}} = \mu$
and a standard deviation
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Central Limit Theorem: If random samples of n
observations are drawn from a nonnormal population with
finite mean $\mu$ and standard deviation $\sigma$, then, when n is large, the
sampling distribution of the sample mean $\bar{x}$ is approximately
normally distributed, with mean $\mu$ and standard deviation
$\sigma / \sqrt{n}$. The approximation becomes more accurate as n
becomes large.
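
As a check on the mean and standard deviation formulas, the sketch below reuses the five-value population from the earlier example with n = 2 (an illustrative choice) and compares the mean and standard deviation of all 25 equally likely sample means with $\mu$ and $\sigma / \sqrt{n}$:

```python
from itertools import product
from math import isclose, sqrt

values = [1, 2, 3, 4, 5]   # equally likely population values
n = 2

def mean_and_sd(xs):
    """Mean and standard deviation of equally likely outcomes (divides by N)."""
    m = sum(xs) / len(xs)
    return m, sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

mu, sigma = mean_and_sd(values)

# Means of all ordered samples of size n drawn independently.
xbars = [sum(s) / n for s in product(values, repeat=n)]
mu_xbar, sigma_xbar = mean_and_sd(xbars)

print("mu =", mu, " mean of sample means =", mu_xbar)
print("sigma / sqrt(n) =", sigma / sqrt(n), " sd of sample means =", sigma_xbar)
```

Here $\sigma^2 = 2$, so $\sigma / \sqrt{2} = 1$, which matches the spread of the tabulated sampling distribution.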
Why is this Important?
✓ The Central Limit Theorem also implies that the
sum of n measurements is approximately normal with
mean $n\mu$ and standard deviation $\sigma\sqrt{n}$.

✓ Many statistics that are used for statistical inference
are sums or averages of sample measurements.

✓ When n is large, these statistics will have
approximately normal distributions.

✓ This will allow us to describe their behavior and
evaluate the reliability of our inferences.
How Large is Large?
If the sampled population is normal, then the sampling
distribution of $\bar{x}$ will also be normal, no matter
what the sample size.

When the sampled population is approximately
symmetric, the sampling distribution of $\bar{x}$ becomes approximately
normal for relatively small values of n.

When the sampled population is skewed, a common rule of thumb is
that the sample size must be at least 30 before the sampling
distribution of $\bar{x}$ becomes approximately normal.
Illustrations of Sampling Distributions

[Figure: a symmetric, normal-like population and the sampling distributions of $\bar{x}$ for n = 4, n = 9, and n = 25.]


Illustrations of Sampling Distributions

[Figure: a skewed population and the sampling distributions of $\bar{x}$ for n = 4, n = 10, and n = 30.]
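
The skewed case can be explored by simulation. The sketch below assumes an exponential population with rate 1 as the skewed population (the slides do not say which population was plotted) and estimates the skewness of the sample mean for n = 4, 10, and 30; the seed and the number of repetitions are arbitrary.

```python
import random
from statistics import fmean

def skewness(xs):
    """Moment-based sample skewness: third central moment over sd cubed."""
    m = fmean(xs)
    m2 = fmean([(x - m) ** 2 for x in xs])
    m3 = fmean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5

rng = random.Random(42)  # arbitrary seed for repeatability

def sample_means(n, reps=20000):
    """Means of `reps` independent samples of size n from an exponential(1) population."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

skew_by_n = {n: skewness(sample_means(n)) for n in (4, 10, 30)}
for n, sk in skew_by_n.items():
    print(f"n = {n:>2}: skewness of sample mean is about {sk:.2f}")
```

For independent draws the skewness of the mean shrinks in proportion to $1/\sqrt{n}$, so the estimates should move toward 0 (the value for a normal distribution) as n grows.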
Finding Probabilities for the Sample Mean
✓ If the sampling distribution of $\bar{x}$ is normal or
approximately normal, standardize or rescale the
interval of interest in terms of
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
✓ Find the appropriate area using Table 3.

Example: A random sample of size n = 16 is drawn from a normal
distribution with $\mu = 10$ and $\sigma = 8$.
$P(\bar{x} > 12) = P\left(\frac{\bar{x} - \mu}{\sigma / \sqrt{n}} > \frac{12 - 10}{8 / \sqrt{16}}\right) = P(z > 1) = .5 - .3413 = .1587$
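
The same calculation can be sketched in code, with `statistics.NormalDist` from the Python standard library standing in for the normal table (Table 3):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10, 8, 16

se = sigma / sqrt(n)          # standard deviation of the sample mean
z = (12 - mu) / se            # standardized value of 12
p = 1 - NormalDist().cdf(z)   # upper-tail area P(z > 1)

print(f"se = {se}, z = {z}, P(x-bar > 12) = {p:.4f}")
```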
Example
A soda filling machine is supposed to fill cans of
soda with 12 fluid ounces. Suppose that the fills are actually
normally distributed with a mean of 12.1 oz and a standard
deviation of .2 oz. The probability that a single can contains less than 12 oz is
$P(x < 12) = P\left(\frac{x - \mu}{\sigma} < \frac{12 - 12.1}{.2}\right) = P(z < -.5) = .5 - .1915 = .3085$
What is the probability that the average fill for a 6-pack of soda is
less than 12 oz?
$P(\bar{x} < 12) = P\left(\frac{\bar{x} - \mu}{\sigma / \sqrt{n}} < \frac{12 - 12.1}{.2 / \sqrt{6}}\right) = P(z < -1.22) = .1112$
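
A sketch of both soda calculations, again using `statistics.NormalDist` in place of the table; the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 12.1, 0.2
std_normal = NormalDist()

# Single can: standardize with sigma.
z_single = (12 - mu) / sigma
p_single = std_normal.cdf(z_single)

# Average of a 6-pack: standardize with sigma / sqrt(n).
n = 6
z_pack = (12 - mu) / (sigma / sqrt(n))
p_pack = std_normal.cdf(z_pack)

print(f"single can: z = {z_single:.2f}, P = {p_single:.4f}")
print(f"6-pack mean: z = {z_pack:.2f}, P = {p_pack:.4f}")
```

Because the code does not round z to two decimals before looking up the area, the 6-pack probability comes out slightly below the table-based .1112.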
The Sampling Distribution
of the Sample Proportion
✓ The Central Limit Theorem can be used to
conclude that the binomial random variable x is
approximately normal when n is large, with mean np
and variance npq, where q = 1 - p.
✓ The sample proportion, $\hat{p} = x/n$, is simply a rescaling
of the binomial random variable x, dividing it by n.
✓ From the Central Limit Theorem, the sampling
distribution of $\hat{p}$ will also be approximately
normal, with a rescaled mean and standard deviation.
The Sampling Distribution
of the Sample Proportion
✓ A random sample of size n is selected from a
binomial population with parameter p.
✓ The sampling distribution of the sample proportion,
$\hat{p} = x/n$,
will have mean p and standard deviation $\sqrt{pq/n}$, where q = 1 - p.
✓ If n is large, and p is not too close to zero or one, the
sampling distribution of $\hat{p}$ will be approximately
normal.
The standard deviation of $\hat{p}$ is sometimes called
the STANDARD ERROR (SE) of $\hat{p}$.
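
A minimal sketch of the standard error formula; the function name `standard_error` and the values p = 0.5, n = 100 and n = 400 are illustrative assumptions, not taken from the slides:

```python
from math import sqrt

def standard_error(p, n):
    """Standard deviation of the sample proportion: sqrt(p * q / n) with q = 1 - p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return sqrt(p * (1 - p) / n)

se_100 = standard_error(0.5, 100)
se_400 = standard_error(0.5, 400)
print(se_100, se_400)
```

Quadrupling the sample size halves the standard error, since n enters through $\sqrt{n}$.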
Finding Probabilities for
the Sample Proportion

✓ If the sampling distribution of $\hat{p}$ is normal or
approximately normal, standardize or rescale the
interval of interest in terms of
$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$
✓ Find the appropriate area using Table 3.
Example
The soda bottler in the previous example claims
that only 5% of the soda cans are underfilled.
A quality control technician randomly samples 200
cans of soda. What is the probability that more than
10% of the cans are underfilled?
n = 200
S: underfilled can
p = P(S) = .05
q = .95
np = 10, nq = 190: OK to use the normal approximation.
$P(\hat{p} > .10) = P\left(z > \frac{.10 - .05}{\sqrt{.05(.95)/200}}\right) = P(z > 3.24) \approx .5 - .4990 = .001$
This would be very unusual, if indeed p = .05!
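
The calculation can be sketched as follows, with `statistics.NormalDist` used in place of the table. The check that np and nq both exceed 5 mirrors the condition stated in the Key Concepts.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.05
q = 1 - p

# Condition for the normal approximation.
approximation_ok = n * p > 5 and n * q > 5

se = sqrt(p * q / n)                 # standard error of p-hat
z = (0.10 - p) / se                  # standardized value of .10
tail = 1 - NormalDist().cdf(z)       # P(p-hat > .10)

print(f"OK to approximate: {approximation_ok}, z = {z:.2f}, tail area = {tail:.4f}")
```

The tail area from code is a little smaller than the .001 obtained from the four-decimal table, but the conclusion is the same: such a sample would be very unusual if p = .05.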
Example
Suppose 3% of the people contacted by phone are
receptive to a certain sales pitch and buy your product.
If your sales staff contacts 2000 people, what is the
probability that more than 100 of the people contacted
will purchase your product?

n = 2000, p = 0.03, np = 60, nq = 1940: OK to use the normal approximation.

$P(\hat{p} > 100/2000) = P\left(z > \frac{.05 - .03}{\sqrt{.03(.97)/2000}}\right) = P(z > 5.24) \approx 0$
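
As a cross-check, the normal approximation can be compared with the exact binomial tail. This is a sketch under the assumption that the number of buyers is binomial with n = 2000 and p = 0.03; the helper `binom_pmf` is hypothetical and works in log space to avoid overflow in the binomial coefficients.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed via logarithms for numerical stability."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

n, p = 2000, 0.03

# Normal approximation, as on the slide (no continuity correction).
z = (100 / n - p) / sqrt(p * (1 - p) / n)
normal_tail = 1 - NormalDist().cdf(z)

# Exact binomial probability of more than 100 buyers.
exact_tail = sum(binom_pmf(k, n, p) for k in range(101, n + 1))

print(f"z = {z:.2f}")
print(f"normal approximation: {normal_tail:.2e}")
print(f"exact binomial tail:  {exact_tail:.2e}")
```

Both values are tiny, which supports the slide's conclusion that the probability is approximately 0.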
Key Concepts
I. Sampling Plans and Experimental Designs
Simple random sampling: Each possible
sample is equally likely to occur.
II. Statistics and Sampling Distributions
1. Sampling distributions describe the possible
values of a statistic and how often they occur
in repeated sampling.
2. The Central Limit Theorem states that sums
and averages of measurements from a
nonnormal population with finite mean $\mu$ and
standard deviation $\sigma$ have approximately
normal distributions for large samples of size
n.
Key Concepts
III. Sampling Distribution of the Sample Mean
1. When samples of size n are drawn from a normal population
with mean $\mu$ and variance $\sigma^2$, the sample mean $\bar{x}$ has a
normal distribution with mean $\mu$ and variance $\sigma^2/n$.
2. When samples of size n are drawn from a nonnormal
population with mean $\mu$ and variance $\sigma^2$, the Central Limit
Theorem ensures that the sample mean $\bar{x}$ will have an
approximately normal distribution with mean $\mu$ and variance
$\sigma^2/n$ when n is large ($n \ge 30$).
3. Probabilities involving the sample mean $\bar{x}$ can be calculated
by standardizing the value of $\bar{x}$ using
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Key Concepts
IV. Sampling Distribution of the Sample Proportion
1. When samples of size n are drawn from a binomial
population with parameter p, the sample proportion
$\hat{p}$ will have an approximately normal distribution
with mean p and variance pq/n as long as np > 5
and nq > 5.
2. Probabilities involving the sample proportion can
be calculated by standardizing the value $\hat{p}$ using
$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$
