Lec 10

This document summarizes a lecture on confidence intervals. It discusses: 1) Why confidence intervals are needed and their basic properties when the population distribution is normal with known variance. 2) How to calculate confidence intervals for the mean, and that the level of confidence can be adjusted by changing the critical z-value. 3) How precision and sample size are related, and the formula for determining sample size needed based on desired confidence level and interval width. 4) How large sample confidence intervals work using the sample standard deviation instead of the population variance. 5) How to calculate a confidence interval for a population proportion.


STAT 511

Lecture 12: Confidence Intervals


Devore: Section 7.1-7.2

Prof. Michael Levine

October 22, 2018

Levine STAT 511


Motivation

I Why do we need a confidence interval? Because with each
new sample we have a new parameter estimate (e.g. a new
sample mean).
I Which one do we choose? We do not know the true mean µ
and do not know how close each one is to µ.
I Thus, we want to have some degree of precision reported
together with an estimate.
I Suppose our X̄ = 10. We want to say something like: "With
probability 95% the true mean is between 9 and 11."



Basic properties of Confidence Intervals

I Consider a normal population distribution with known σ.
I We want to estimate the unknown µ.
I The problem is purely illustrative; in practice, the mean is
usually known before the variance (standard deviation).
I We know that X̄ is normally distributed with mean µ and
standard deviation $\sigma/\sqrt{n}$.



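I Because the area under the normal curve between −1.96 and
1.96 is 0.95, we have
$$P(-1.96 \le Z \le 1.96) = P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$$
I Simple algebra tells us that
$$P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = 0.95$$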



The meaning of the confidence interval

I The event in parentheses above is a random interval with the
left endpoint $\bar{X} - 1.96\,\sigma/\sqrt{n}$ and right endpoint
$\bar{X} + 1.96\,\sigma/\sqrt{n}$. It is centered at the sample mean X̄.
I For a given sample X1 = x1 , . . . , Xn = xn , we compute the
observed sample mean x̄ and substitute it for X̄ in the definition
of our random interval. The resulting fixed interval
is called a 95% confidence interval (CI).
I The usual way to express it is to say that
$$\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$$
is a 95% CI for µ.



I Alternatively, we say that
$$\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}$$
with 95% confidence.
I A more concise expression is $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$.



Example

I The average zinc concentration recovered from a sample of
zinc measurements in 36 different locations is found to be 2.6
grams per milliliter. Find the 95% confidence interval for the
mean zinc concentration in the river, assuming it is normally
distributed. The population standard deviation is known to be
0.3.
I µ is estimated by x̄ = 2.6. The z-value we need is 1.96.
Hence, the 95% confidence interval is
$$2.6 - 1.96\,\frac{0.3}{\sqrt{36}} < \mu < 2.6 + 1.96\,\frac{0.3}{\sqrt{36}}$$
I The above reduces to 2.50 < µ < 2.70.
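
The arithmetic in the zinc example can be checked with a few lines of Python. This is only a sketch: the helper name `z_interval` is made up for illustration and is not part of the lecture. It assumes a normal population with known σ and uses the critical value 1.96 from the slide.

```python
from math import sqrt


def z_interval(xbar, sigma, n, z=1.96):
    """Two-sided CI for the mean of a normal population with known sigma.

    Hypothetical helper: returns (xbar - z*sigma/sqrt(n), xbar + z*sigma/sqrt(n)).
    """
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Zinc example: xbar = 2.6, sigma = 0.3, n = 36
lower, upper = z_interval(xbar=2.6, sigma=0.3, n=36)
print(f"{lower:.2f} < mu < {upper:.2f}")  # 2.50 < mu < 2.70
```

The unrounded endpoints are 2.502 and 2.698, which round to the interval quoted in the example.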



Other Levels of Confidence

I Any desired level of confidence can be achieved by changing
the critical z-value used in constructing the interval.
I A 100(1 − α)% confidence interval for the mean µ of a
normal population when the value of σ is known is
$$\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)$$
or, equivalently, $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$.
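
As a rough illustration of how the critical value changes with the confidence level, the sketch below looks up $z_{\alpha/2}$ with `statistics.NormalDist` from the Python standard library (available from Python 3.8). The function name `critical_z` is an assumption made for this example.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Return z_{alpha/2}: the point with upper-tail area alpha/2 under N(0, 1)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Common confidence levels and their critical values
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_z(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```

A higher confidence level gives a larger critical value, and therefore a wider interval for the same σ and n.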



Precision and Sample Size Choice

I It is easy to see that the width of the confidence interval is
$2 z_{\alpha/2} \cdot \sigma/\sqrt{n}$. Clearly, the higher the confidence level we
require, the wider the interval has to be.
I In other words, the confidence level (reliability) is inversely
related to precision. The usual strategy is to specify both the
confidence level and the interval width and then determine the
necessary sample size.
I Consider a response time that is normally distributed with
standard deviation 25 millisec. What sample size is necessary
to ensure that the 95% CI has a width of (at most) 10?




I Clearly, the sample size must satisfy $10 = 2 \cdot (1.96)(25/\sqrt{n})$.
I Therefore, $n = \frac{4 \cdot (1.96)^2 \cdot 625}{100} = 96.04$.
I In practice, we would require n = 97.
I Thus, to ensure an interval width w we need to have
$$n = \left(2 z_{\alpha/2} \cdot \frac{\sigma}{w}\right)^2$$
I The half-width $1.96\,\sigma/\sqrt{n}$ of a 95% interval is sometimes
called the bound on the error of estimation.
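
The sample-size formula can be sketched in Python as follows. The function name `required_sample_size` is hypothetical. The code rounds up because n must be an integer and a smaller n would give an interval wider than w.

```python
from math import ceil


def required_sample_size(sigma, width, z=1.96):
    """Smallest n for which the CI width 2*z*sigma/sqrt(n) is at most `width`.

    Hypothetical helper: returns both the exact solution and its ceiling.
    """
    n_exact = (2.0 * z * sigma / width) ** 2
    return n_exact, ceil(n_exact)


# Response-time example: sigma = 25, desired width = 10, 95% confidence
n_exact, n = required_sample_size(sigma=25, width=10)
print(f"n = {n_exact:.2f}, rounded up to {n}")  # n = 96.04, rounded up to 97
```

Because n grows with the square of σ/w, halving the desired width quadruples the required sample size.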



Large-Sample Confidence Intervals

I If X has mean µ and variance σ², then for large enough n,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
has approximately a standard normal distribution, according to
the CLT.
I But we do not know σ! What do we do?
I Solution: use $S = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$. It can be verified
that
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
has approximately a standard normal distribution for large n.



I This implies that
$$\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$
is a large-sample confidence interval with confidence level
approximately 100(1 − α)%.
I This result is valid regardless of the true distribution of X.
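
One way to get a feel for the word "approximately" is a small simulation. The sketch below is not from the lecture: it draws repeated samples from a skewed (exponential) population, builds the large-sample interval each time, and counts how often the interval contains the true mean. The function names, the choice of distribution, n = 48, and the number of replications are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def large_sample_ci(data, z=1.96):
    """Large-sample CI: xbar +/- z * s / sqrt(n), with s the sample standard deviation."""
    n = len(data)
    half_width = z * stdev(data) / sqrt(n)
    centre = mean(data)
    return centre - half_width, centre + half_width


def estimate_coverage(n=48, replications=2000, true_mean=1.0, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        # Exponential data with mean `true_mean`: clearly non-normal (right-skewed).
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        lower, upper = large_sample_ci(sample)
        if lower < true_mean < upper:
            hits += 1
    return hits / replications


coverage = estimate_coverage()
print(f"Estimated coverage: {coverage:.3f}")
```

The estimated coverage should be in the neighbourhood of the nominal 0.95, though typically not exactly equal to it for a skewed population at moderate n.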



Example

I The alternating current breakdown voltage of an insulator
indicates its dielectric strength. We have n = 48 observations
of breakdown voltage of a particular circuit under certain
conditions. The mean is x̄ = 54.7 and s = 5.23. Then, the
95% confidence interval is
$$54.7 \pm 1.96 \cdot \frac{5.23}{\sqrt{48}} = 54.7 \pm 1.5 = (53.2, 56.2)$$
I The final result is 53.2 < µ < 56.2.



Some Remarks

I The choice of n to be considered "large enough" differs
between textbooks. The most common rule is n > 40.
I In the case of a large-sample confidence interval, the choice of
sample size is more difficult. The reason is that you do not
know s before you actually sample your data.
I The best solution is to try to guess s and to err on the side of
caution by choosing a larger s.



A Confidence Interval for a Population Proportion
I Assume X is the number of "successes" in a sample of size
n. Denote by p the proportion of successes in the overall
population.
I If n is small compared to the population size, X is approximately
binomial with mean np and variance np(1 − p).
I The natural estimator of p is p̂ = X/n, the sample fraction of
successes.
I p̂ has an approximately normal distribution; its mean is p and
its standard deviation is $\sqrt{p(1-p)/n}$.
I Standardizing, we have
$$P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$
I As before, the confidence interval can be derived by
replacing < by = and solving the resulting quadratic equation for p.
I The resulting confidence interval is (a, b) where, with q̂ = 1 − p̂,
$$a = \frac{\hat{p} + z_{\alpha/2}^2/(2n) - z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n}$$
I b is
$$b = \frac{\hat{p} + z_{\alpha/2}^2/(2n) + z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n}$$
I For large n, the approximation $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ was traditionally
used.
I However, its use is no longer recommended due to problems
concerning its true coverage.

Example

I In n = 48 trials in a laboratory, 16 resulted in ignition of a
particular type of substrate by a lighted cigarette. Let p
denote the probability of "success" (long-run proportion). A
point estimate for p is p̂ = 16/48 = .333. A confidence interval
for p with a confidence level of about 95% is
$$\frac{.333 + (1.96)^2/96 \pm 1.96\sqrt{(.333)(.667)/48 + (1.96)^2/9216}}{1 + (1.96)^2/48} = (.217, .474)$$
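
The interval for a proportion can be sketched in Python as below. The function names `score_interval` and `traditional_interval` are made up for this illustration. The first implements the interval obtained by solving the quadratic equation for p, and the second implements the simpler large-n approximation p̂ ± z√(p̂q̂/n) for comparison. The code works with the unrounded p̂ = 16/48, so its endpoints may differ from a hand calculation with p̂ = .333 in the third decimal place.

```python
from math import sqrt


def score_interval(successes, n, z=1.96):
    """CI for p obtained by solving the quadratic in p (endpoints a and b)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    centre = p_hat + z**2 / (2 * n)
    spread = z * sqrt(p_hat * q_hat / n + z**2 / (4 * n**2))
    denom = 1.0 + z**2 / n
    return (centre - spread) / denom, (centre + spread) / denom


def traditional_interval(successes, n, z=1.96):
    """Simpler large-n approximation: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Ignition example: 16 successes in n = 48 trials
a, b = score_interval(16, 48)
print(f"score interval:       ({a:.3f}, {b:.3f})")
lo, hi = traditional_interval(16, 48)
print(f"traditional interval: ({lo:.3f}, {hi:.3f})")
```

Unlike the traditional interval, the score interval is not centered at p̂: its midpoint is pulled toward 1/2.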



One-sided confidence intervals

I As an example, a reliability engineer may want only a lower
confidence bound for the true average lifetime of a certain
component.
I To derive a 100(1 − α)% one-sided CI, we use
$$P\left(\frac{\bar{X} - \mu}{S/\sqrt{n}} < z_{\alpha}\right) \approx 1 - \alpha$$



I Thus, the large-sample upper confidence bound for µ is
$$\mu < \bar{x} + z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$
while the large-sample lower confidence bound for µ is
$$\mu > \bar{x} - z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$



Example I

I In order to claim that a gas additive increases mileage, an
advertiser must fund an independent study in which n vehicles
are tested to see how far they can drive, first without and
then with the additive. Let Xi denote the increase in mpg
observed for vehicle i and let µ = E(Xi).
I A large corporation makes an additive with µ = 1.01 mpg.
The respective study involves n = 900 vehicles in which
x̄ = 1.01 and s = 0.1 are observed.
I If the significance level is α = 0.05, then
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} < 1.645$ and the one-sided CI is
$$\mu_0 > 1.01 - 1.645 \cdot \frac{0.1}{\sqrt{900}} = 1.0045$$
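
A short sketch of the one-sided calculation is given below. The function name `lower_confidence_bound` is hypothetical. Note that a one-sided bound uses $z_{\alpha}$ rather than $z_{\alpha/2}$, obtained here from `statistics.NormalDist` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def lower_confidence_bound(xbar, s, n, confidence=0.95):
    """Large-sample lower confidence bound: xbar - z_alpha * s / sqrt(n)."""
    # One-sided bound: all of alpha sits in a single tail, so use z_alpha.
    z_alpha = NormalDist().inv_cdf(confidence)
    return xbar - z_alpha * s / sqrt(n)


# Large-corporation study: xbar = 1.01, s = 0.1, n = 900
bound = lower_confidence_bound(xbar=1.01, s=0.1, n=900)
print(f"mu > {bound:.4f}")  # mu > 1.0045
```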



Example II

I An amateur automotive mechanic invents an additive that
increases mileage by an average of µ = 1.21 mpg. The
mechanic funds a small study of n = 9 vehicles in which
x̄ = 1.21 and s = 0.4 are observed.
I The one-sided CI is
$$\mu_0 > 1.21 - 1.645 \cdot \frac{0.4}{\sqrt{9}} = 0.9907$$
