Calculating the Sample Size n
If researchers desire a specific margin of error, then they can use the error bound
formula to calculate the required sample size.
The error bound formula for a population mean when the population standard
deviation $\sigma$ is known is
$$EBM = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$
The formula for sample size is
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{EBM^2},$$
found by solving the error bound formula for $n$.
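The algebra behind that step can be sketched as follows, assuming the usual notation in which $z_{\alpha/2}$ is the critical z-score, $\sigma$ the known population standard deviation, and $EBM$ the desired error bound:

```latex
\begin{align*}
EBM &= z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{z_{\alpha/2}\,\sigma}{EBM} \\
n &= \frac{z_{\alpha/2}^2\,\sigma^2}{EBM^2}
\end{align*}
```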
A researcher planning a study who wants a specified confidence level and error bound
can use this formula to calculate the size of the sample needed for the study.
Example
The population standard deviation for the height of high school basketball players is
three inches. If we want to be 95% confident that the sample mean height is within
one inch of the true population mean height, how many randomly selected students
must be surveyed?
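Here $\sigma = 3$ and $EBM = 1$. The confidence level is 95%, so $\alpha = 1 - 0.95 = 0.05$,
$\alpha/2 = 0.025$, and $z_{0.025} = 1.96$.
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{EBM^2} = \frac{(1.96)^2 (3)^2}{1^2} = 34.5744$$
Always ROUND UP to the next integer to make sure that the sample size is large
enough, so $n = 35$.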
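The basketball example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `required_sample_size` and its parameters are illustrative and do not come from the notes. It computes $z_{\alpha/2}$ from the standard normal distribution instead of reading 1.96 from a table, so the unrounded value differs slightly from the hand calculation, but the rounded-up sample size is the same.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma: float, ebm: float, confidence: float) -> int:
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= ebm (sigma known)."""
    alpha = 1.0 - confidence
    # z_{alpha/2} is the z-score with right-tail area alpha/2.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z * sigma / ebm) ** 2
    # Always round up so the error bound is not exceeded.
    return math.ceil(n_exact)


# Values from the basketball example: sigma = 3 in, EBM = 1 in, 95% confidence.
n = required_sample_size(sigma=3.0, ebm=1.0, confidence=0.95)
print(n)
```

Because $n$ is proportional to $1/EBM^2$, halving the error bound roughly quadruples the required sample size.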
8.2 A Single Population Mean Using the Student t Distribution
Student’s t-distribution
If you draw a simple random sample of size $n$ from a population that has an
approximately normal distribution with mean $\mu$ and unknown population standard
deviation $\sigma$ and calculate the t-score
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}},$$
then the t-scores follow a Student's t-distribution with $n - 1$ degrees of freedom.
The t-score has the same interpretation as the z-score. It measures how far $\bar{x}$
is from its mean $\mu$. For each sample size $n$, there is a different Student's
t-distribution.
The degrees of freedom, $n - 1$, come from the calculation of the sample standard
deviation $s$. We use $n$ deviations ($x - \bar{x}$ values) to calculate $s$. Because the
sum of the deviations is zero, we can find the last deviation once we know the other
$n - 1$ deviations. The other $n - 1$ deviations can change or vary freely. We call the
number $n - 1$ the degrees of freedom ($df$).
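A small numerical illustration of why only $n - 1$ deviations are free: the sketch below uses a hypothetical five-value sample (not taken from the notes) and shows that the last deviation is determined by the others.

```python
from statistics import mean

# Hypothetical sample, chosen only for illustration.
sample = [12, 10, 8, 14, 6]
x_bar = mean(sample)
deviations = [x - x_bar for x in sample]

# The deviations from the sample mean always sum to zero ...
total = sum(deviations)

# ... so the last deviation is fixed by the other n - 1 deviations.
last_from_others = -sum(deviations[:-1])

df = len(sample) - 1
print(deviations, total, last_from_others, df)
```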
Properties of Student’s t-distribution
• The graph for the Student's t-distribution is similar to the standard normal curve.
• The mean for the Student's t-distribution is zero and the distribution is
symmetric about zero.
• The Student's t-distribution has more probability in its tails than the standard
normal distribution because the spread of the t-distribution is greater than the
spread of the standard normal. So the graph of the Student's t-distribution will
be thicker in the tails and shorter in the center than the graph of the standard
normal distribution.
• The exact shape of the Student's t-distribution depends on the degrees of
freedom. As the degrees of freedom increase, the graph of the Student's
t-distribution becomes more like the graph of the standard normal distribution.
• The underlying population of individual observations is assumed to be normally
distributed with unknown population mean $\mu$ and unknown population standard
deviation $\sigma$. The size of the underlying population is generally not relevant
unless it is very small. If the population is bell shaped (normal), then the
assumption is met and doesn't need discussion. Random sampling is assumed, but that
is a completely separate assumption from normality.
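The heavier tails and the convergence toward the standard normal can be checked numerically. The following is a rough sketch using only the standard library: it integrates the t density with Simpson's rule rather than calling a statistics package, and the helper names `t_pdf` and `t_right_tail` are illustrative. It compares the right-tail probability beyond 2 for 4 and 30 degrees of freedom with the standard normal tail.

```python
import math
from statistics import NormalDist


def t_pdf(t: float, df: int) -> float:
    """Density of the Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))


def t_right_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] and symmetry."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry about zero, P(T > t) = 0.5 - P(0 < T < t).
    return 0.5 - total * h / 3


z_tail = 1.0 - NormalDist().cdf(2.0)
for df in (4, 30):
    print(df, round(t_right_tail(2.0, df), 4))
print("normal", round(z_tail, 4))
```

The tail area shrinks toward the normal value as the degrees of freedom grow, which is the behavior the list above describes.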
Student’s t Table
A Student's t table gives t-scores given the degrees of freedom and the right-tailed
probability.
The notation for the Student's t-distribution (using T as the random variable) is:
$T \sim t_{df}$ where $df = n - 1$.
For example, if we have a sample of size $n = 20$ items, then we calculate the degrees
of freedom as $df = n - 1 = 20 - 1 = 19$ and we write the distribution as $T \sim t_{19}$.
Confidence Interval
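To construct a confidence interval for a single unknown population mean μ, where
the population standard deviation is unknown, use
$$\bar{x} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{x} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
where $s$ is the sample standard deviation, $n$ is the sample size, and $t_{\alpha/2}$ is
the t-score with $n - 1$ degrees of freedom and right-tail area $\alpha/2$.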
Example
Let’s assume that you’ve looked at the table in your newspaper that shows the high
temperatures recorded for cities in Canada on the fifth day of August. You select a
random sample of 5 cities in Canada and record the following temperatures:
29, 16, 22, 24, and 31 ℃. Assuming that the population of all temperatures is
normally distributed, construct a 99% confidence interval for the mean temperature
recorded for all Canadian communities on that date. The population SD is unknown.
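Sample mean:
$$\bar{x} = \frac{\sum x}{n} = \frac{122}{5} = 24.4$$
Sample SD:

x      x - x̄     (x - x̄)²
29      4.6      21.16
16     -8.4      70.56
22     -2.4       5.76
24     -0.4       0.16
31      6.6      43.56
Sum: 122         141.2

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{141.2}{4}} = \sqrt{35.3} \approx 5.9413$$
With $n = 5$ we have $df = 4$. For 99% confidence, $\alpha/2 = 0.005$ and the t table
gives $t_{0.005} = 4.604$.
$$\frac{s}{\sqrt{n}} = \frac{5.9413}{\sqrt{5}} \approx 2.6571$$
$$24.4 - 4.604\,(2.6571) < \mu < 24.4 + 4.604\,(2.6571)$$
$$12.1667 < \mu < 36.6332$$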
Interpretation
We estimate with 99% confidence that the true population mean temperature for all
Canadian cities on Aug 5th is between 12.1667 and 36.6332 ℃.
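The temperature example can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `t_confidence_interval` is illustrative, and the critical value 4.604 is passed in as read from a t table (df = 4, right-tail area 0.005) rather than computed. Because the script does not round intermediate values, its endpoints may differ from the hand calculation in the last decimal place.

```python
import math
from statistics import mean, stdev


def t_confidence_interval(data, t_crit):
    """Return (lower, upper) = x_bar -/+ t_crit * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    ebm = t_crit * s / math.sqrt(n)
    return x_bar - ebm, x_bar + ebm


temps = [29, 16, 22, 24, 31]
# t_crit = 4.604 is the table value for df = 4 and right-tail area 0.005.
lower, upper = t_confidence_interval(temps, t_crit=4.604)
print(round(lower, 4), round(upper, 4))
```

The same function can be applied to other small samples from an approximately normal population by supplying the data and the matching table value for `t_crit`.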
Homework
You do a study of hypnotherapy to determine how effective it is in increasing the
number of hours of sleep subjects get each night. You measure hours of sleep for 12
subjects with the following results. Construct a 95% confidence interval for the mean
number of hours slept for the population (assumed normal) from which you took
the data.
8.2; 9.1; 7.7; 8.6; 6.9; 11.2; 10.1; 9.9; 8.9; 9.2; 7.5; 10.5
(No need to hand in)