Confidence Intervals
Looking Forward to Inference
Alina Kuvelkar
Week 11
Statistical Inference
Parameter: we want to learn about this
Statistic: we only have this to work with
Identifying the Parameter and the Statistic
We talked earlier about how we have different notation for parameters and statistics:
Mean: parameter μ, statistic x̄
Proportion: parameter p, statistic p̂
Confidence Intervals
Sampling Distribution
We’ve seen that statistics can
vary from sample to sample.
Because of this, we usually give a
range of plausible values for the
population parameter rather than
just a single best estimate
Confidence Intervals
Confidence Interval:
An interval computed from sample data by a method that will capture
the parameter for a specified proportion of all samples.
Sample Statistic ± Margin of Error
where the sample statistic is our best estimate for the parameter
Example
Suppose the results of an election poll
show the proportion supporting a
particular candidate is p̂ = 0.54.
If the margin of error is 0.02, what is
the interval for plausible values of p?
0.54 ± 0.02
Example
Suppose the results of an election
poll show the proportion supporting
a particular candidate is p̂ = 0.54.
Can we be reasonably sure that this
candidate will win the majority of
votes and win the election?
0.52 to 0.56
Yes!
Example
Suppose the results of an election poll
show the proportion supporting a
particular candidate is p̂ = 0.54.
If the margin of error is 0.10, what is
the interval for plausible values of p?
0.54 ± 0.10
0.44 to 0.64
Example
Suppose the results of an election poll
show the proportion supporting a
particular candidate is p̂ = 0.54.
Can we be reasonably sure that this
candidate will win the majority of
votes and win the election?
0.44 to 0.64
No!
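The two election examples can be checked with a minimal Python sketch. The helper name `interval` is made up for illustration. Treating "interval entirely above 0.5" as the test for a plausible majority is one reading of the slides, not a formal rule.

```python
def interval(statistic, margin_of_error):
    """Return (lower, upper) for: statistic ± margin of error."""
    return statistic - margin_of_error, statistic + margin_of_error


p_hat = 0.54  # sample proportion from the poll
for moe in (0.02, 0.10):
    lower, upper = interval(p_hat, moe)
    # Every plausible value of p is a majority only if the whole interval is above 0.5.
    all_above_half = lower > 0.5
    print(f"margin {moe:.2f}: {lower:.2f} to {upper:.2f}, entirely above 0.5: {all_above_half}")
```

With a margin of 0.02 the whole interval sits above 0.5. With a margin of 0.10 the interval includes values below 0.5, so a majority is not assured.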
Confidence Intervals
The larger your confidence level is, the wider your interval will be!
Confidence Level:
The success rate (proportion of all samples whose intervals contain the
parameter)
95% Confidence Intervals
Recall that for a symmetric, bell-shaped distribution, roughly 95% of the
values fall within two standard deviations of the center
Sample Statistic ± 2 · Standard Error
Margin of error: 2 · Standard Error (the multiplier 2 is specific to a 95% confidence level)
Example
Body mass index (BMI) is a calculation that
measures body weight relative to height.
Let’s say we took a random sample of US
adults and the sample mean BMI was
x̄ = 27.655 with SE = 0.009.
Give a 95% confidence interval for the
average BMI for all adults living in the
US, and interpret this interval.
27.655 ± 2 · 0.009
27.637 to 27.673
95% confidence interval for μ
We are 95% sure that the mean BMI for all
adults living in the US in 2010 is between
27.637 and 27.673.
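As a quick check of the arithmetic, here is a small Python sketch of the "statistic ± 2 · SE" rule applied to the BMI numbers. The function name `ci_95` is hypothetical.

```python
def ci_95(statistic, standard_error):
    """Approximate 95% interval: statistic ± 2 * standard error."""
    margin = 2 * standard_error  # the multiplier 2 is specific to 95% confidence
    return statistic - margin, statistic + margin


lower, upper = ci_95(27.655, 0.009)
print(f"{lower:.3f} to {upper:.3f}")
```

The margin of error is 2 · 0.009 = 0.018, so the printed bounds should match the interval shown above.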
95% Confidence Intervals
Confidence Level:
The success rate (proportion of
all samples whose intervals
contain the parameter)
Over the long run, for many such
intervals, about 95% will successfully
contain the parameter, while about
5% will miss it.
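The long-run idea can be illustrated with a small simulation sketch in Python. Everything about the setup is an assumption made for illustration: a normal population with a known mean and standard deviation, a sample size of 40, and a standard error for the sample mean taken as sigma / sqrt(n).

```python
import random
import statistics


def coverage(true_mean=50.0, sigma=10.0, n=40, reps=2000, seed=1):
    """Fraction of simulated 'statistic ± 2*SE' intervals that contain true_mean."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5  # assumed SE of the sample mean (population sd treated as known)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        # The parameter stays fixed; the interval moves from sample to sample.
        if x_bar - 2 * se <= true_mean <= x_bar + 2 * se:
            hits += 1
    return hits / reps


print(coverage())
```

The printed fraction should land near 0.95. It is the intervals that vary across repetitions, while the parameter stays the same.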
Common Misinterpretations
Misinterpretation 1
A 95% confidence interval contains 95% of the data in the population.
The correct statement is that we are 95% confident that the
population mean is in the interval.
Common Misinterpretations
Misinterpretation 2
I am 95% sure that the mean of a sample will fall within a 95%
confidence interval for the mean.
The correct statement is that we are 95% sure that the mean of the
population will fall within a 95% confidence interval for the mean.
Common Misinterpretations
Misinterpretation 3
The probability that the population parameter is in this particular
95% confidence interval is 0.95.
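Remember that what varies from sample to sample is the statistic (and therefore the interval),
not the population parameter. Once a particular interval has been computed, it either contains
the parameter or it does not; the 95% describes the long-run success rate of the method.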
95% Confidence Intervals
This was the formula we introduced for a 95% confidence interval:
Sample Statistic ± 2 · Standard Error
Margin of error: 2 · Standard Error (the multiplier 2 is specific to a 95% confidence level)
What if we want a different confidence level?
95% Confidence Intervals
Sample Statistic ± 2 · Standard Error
[Figure: standard normal distribution with area 0.95 between −2 and 2, based on the Empirical Rule]
90% Confidence Intervals
Sample Statistic ± ? · Standard Error
[Figure: standard normal distribution with area 0.90 between −z and z]
Confidence Intervals Using the Normal Distribution
If the sampling distribution follows the shape of a normal distribution with
standard error SE, we find the confidence interval for the parameter using
Sample Statistic ± z* · Standard Error
Margin of error: z* · Standard Error, where z* is chosen such that the area between
−z* and z* in the standard normal distribution is the desired level of confidence.
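The definition of z* can be sketched in Python using the standard library's `statistics.NormalDist`. The helper name `z_star` is made up for illustration. The one step not spelled out on the slide is the conversion from a confidence level to a left-tail area: the level plus half of what remains.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def z_star(confidence_level):
    """z* such that the area between -z* and z* equals confidence_level."""
    # Area to the left of z* = middle area + one tail.
    left_area = confidence_level + (1 - confidence_level) / 2
    return std_normal.inv_cdf(left_area)


z = z_star(0.90)
area_between = std_normal.cdf(z) - std_normal.cdf(-z)
print(round(z, 3), round(area_between, 2))
```

For a 90% level the left-tail area is 0.95, and the area between −z* and z* comes back as 0.90.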
Confidence Intervals Using the Normal Distribution
Sample Statistic ± z* · Standard Error
[Figure: standard normal distribution with area 0.90 between −z and z and area 0.05 in each tail]
Using R to find the z* values
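We need to enter the probability to the left of z* in the R function (for example, qnorm).
For a 90% confidence interval, the area to the left of z* is 0.90 + 0.05 = 0.95.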
90% Confidence Interval
Sample Statistic ± 1.645 ∙ Standard Error
[Figure: standard normal distribution with area 0.90 between −1.645 and 1.645 and area 0.05 in each tail; equivalently, the area to the left of 1.645 is 0.95]
Confidence Intervals Using the Normal Distribution
Below are the normal percentiles for common confidence intervals:
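As a rough sketch, these percentiles can be computed directly in Python with `statistics.NormalDist`. The set of levels listed here is an assumption about which ones count as "common", and `z_star` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Standard normal percentile that leaves confidence_level in the middle."""
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

Note that the exact 95% percentile is about 1.96. The multiplier 2 used earlier is a convenient rounding of it.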
Example
A survey conducted in July 2015 asked a
random sample of n = 2001 American adults
whether they had ever used online dating.
The observed sample statistic is p̂ = 0.15, and
the standard error is 0.008.
Use the standard normal distribution to find an
80% confidence interval for the proportion of
American adults who have used online dating.
Sample Statistic ± z* · Standard Error
p̂ ± z* · Standard Error
0.15 ± z* · 0.008
Confidence Intervals Using the Normal Distribution
0.15 ± 1.282 · 0.008
[Figure: standard normal distribution with area 0.80 between −1.282 and 1.282 and area 0.10 in each tail]
0.1397 to 0.1603
We are 80% sure that the proportion of all adults
living in the US who had used online dating as of 2015 is
between 0.1397 and 0.1603.
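The 80% interval can be reproduced with a short Python sketch. The function name `normal_ci` is made up for illustration. It assumes the sampling distribution is approximately normal, as the example does.

```python
from statistics import NormalDist


def normal_ci(statistic, standard_error, confidence_level):
    """Interval: statistic ± z* * SE, with z* from the standard normal distribution."""
    z = NormalDist().inv_cdf((1 + confidence_level) / 2)
    margin = z * standard_error
    return statistic - margin, statistic + margin


lower, upper = normal_ci(0.15, 0.008, 0.80)
print(f"{lower:.4f} to {upper:.4f}")
```

Here z* is about 1.282, so the margin of error is roughly 1.282 · 0.008 ≈ 0.0103.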
Example
The observed sample statistic is p̂ = 0.15, and
the standard error is 0.008.
Use the standard normal distribution to find a
99% confidence interval for the proportion of
American adults who have used online dating.
Sample Statistic ± z* · Standard Error
p̂ ± z* · Standard Error
0.15 ± z* · 0.008
Confidence Intervals Using the Normal Distribution
0.15 ± 2.576 · 0.008
[Figure: standard normal distribution with area 0.99 between −2.576 and 2.576 and area 0.005 in each tail]
0.1294 to 0.1706
We are 99% sure that the proportion of all adults
living in the US who had used online dating as of 2015 is
between 0.1294 and 0.1706.
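To see how the confidence level affects the width, the following Python sketch recomputes the online dating interval at several levels. The helper `normal_ci` is hypothetical, and the 90% and 95% rows are added only for comparison.

```python
from statistics import NormalDist


def normal_ci(statistic, standard_error, confidence_level):
    """Interval: statistic ± z* * SE, with z* from the standard normal distribution."""
    z = NormalDist().inv_cdf((1 + confidence_level) / 2)
    margin = z * standard_error
    return statistic - margin, statistic + margin


p_hat, se = 0.15, 0.008
widths = []
for level in (0.80, 0.90, 0.95, 0.99):
    lower, upper = normal_ci(p_hat, se, level)
    widths.append(upper - lower)
    print(f"{level:.0%}: {lower:.4f} to {upper:.4f} (width {upper - lower:.4f})")
```

The width grows with the confidence level. With the same statistic and standard error, being more confident requires a wider interval.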
Confidence Intervals Using the Normal Distribution
If the sampling distribution follows the shape of a normal distribution with
standard error SE, we find the confidence interval for the parameter using
Sample Statistic ± z* · Standard Error
Margin of error: z* · Standard Error
Note: We need to make sure the Central Limit Theorem applies.