SAMPLING TECHNIQUES IN
PUBLIC HEALTH RESEARCH
Dr. Muhammad Asif
Important statistical terms
Population:
A set that includes all
measurements of interest
to the researcher
(The collection of all
responses, measurements, or
counts that are of interest)
Sample:
A subset of the population
Target Population:
The population to be studied, to which the
investigator wants to generalize the results
Sampling Unit:
The smallest unit from which a sample can be
selected
Sampling Frame:
List of all the sampling units from which the
sample is drawn
SAMPLING
• A sample is “a smaller (but hopefully
representative) collection of units from a
population used to determine truths about that
population” (Field, 2005)
• Why sample?
– Resources (time, money) and workload
– Gives results with known accuracy that can be
calculated mathematically
• The sampling frame is the list from which the
potential respondents are drawn
– Registrar’s office
– Class rosters
– Must assess sampling frame errors
SAMPLING (continued)
• What is your population of interest?
• To whom do you want to generalize your
results?
– All doctors
– School children
– Indians
– Women aged 15-45 years
– Other
• Can you sample the entire population?
Reasons for Sampling
• Sampling can save money.
• Sampling can save time.
• For given resources, sampling can
broaden the scope of the data set.
• Because the research process is
sometimes destructive, the sample can
save product.
• If accessing the entire population is
impossible, sampling is the only option.
Reasons for Taking a Census
• Eliminate the possibility that a random
sample is not representative of the
population.
• The person authorizing the study is
uncomfortable with sample information.
Random vs Nonrandom Sampling
• Random sampling
• Every unit of the population has the same
probability of being included in the sample.
• A chance mechanism is used in the selection
process.
• Eliminates bias in the selection process
• Also known as probability sampling
• Nonrandom Sampling
• Not every unit of the population has the same
probability of being included in the sample.
• Open to selection bias
• Not an appropriate way to collect data for most
statistical methods
• Also known as nonprobability sampling
Random Sampling Techniques
• Simple Random Sample
• Stratified Random Sample
• Systematic Random Sample
• Cluster (or Area) Sampling
Simple Random Sample
• Number each frame unit from 1 to N.
• Use a random number table or a
random number generator to select n
distinct numbers between 1 and N,
inclusively.
• Easier to perform for small populations
• Cumbersome for large populations
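The numbering-and-drawing steps above can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the `seed` parameter, and the values N = 500 and n = 20 are assumptions made for the example.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Select n distinct unit numbers between 1 and N, inclusive."""
    if not 0 <= n <= N:
        raise ValueError("n must be between 0 and N")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # random.sample draws without replacement, so the numbers are distinct
    return sorted(rng.sample(range(1, N + 1), n))

# Hypothetical frame of 500 numbered units, sample of 20
chosen = simple_random_sample(N=500, n=20, seed=42)
print(chosen)
```

A random number generator removes most of the manual effort, which is why the method stays practical as long as a numbered frame exists.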
Simple Random Sampling:
Random Number Table
9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8
5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6
8 0 8 8 0 6 3 1 7 1 4 2 8 7 7 6 6 8 3 5 6 0 5 1 5 7 0 2 9 6 5 0 0 2 6 4 5 5 8 7
8 6 4 2 0 4 0 8 5 3 5 3 7 9 8 8 9 4 5 4 6 8 1 3 0 9 1 2 5 3 8 8 1 0 4 7 4 3 1 9
6 0 0 9 7 8 6 4 3 6 0 1 8 6 9 4 7 7 5 8 8 9 5 3 5 9 9 4 0 0 4 8 2 6 8 3 0 6 0 6
5 2 5 8 7 7 1 9 6 5 8 5 4 5 3 4 6 8 3 4 0 0 9 9 1 9 9 7 2 9 7 6 9 4 8 1 5 9 4 1
8 9 1 5 5 9 0 5 5 3 9 0 6 8 9 4 8 6 3 7 0 7 9 5 5 4 7 0 6 2 7 1 1 8 2 6 4 4 9 3
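One common way to read such a table, sketched here under assumptions, is to take the digits in consecutive groups as wide as N, discard groups that fall outside 1 to N, and skip repeats. The helper name `select_from_digits` and the values N = 60 and n = 5 are hypothetical; the digits are the first row of the table above.

```python
def select_from_digits(digits, N, n):
    """Read fixed-width numbers from a digit string; keep the first n
    distinct values that fall between 1 and N."""
    width = len(str(N))  # N = 60 means reading two digits at a time
    selected = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # discard out-of-range values and numbers already selected
        if 1 <= number <= N and number not in selected:
            selected.append(number)
        if len(selected) == n:
            break
    return selected

# First row of the random number table, spaces removed
row = "9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8".replace(" ", "")
print(select_from_digits(row, N=60, n=5))  # [43, 45, 52, 34, 47]
```

Reading 99, 43, 78, 79, 61, 45, ... in order, the values above 60 are discarded, so the first five usable unit numbers are 43, 45, 52, 34, and 47.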
[Figure: Simple random sampling]
Systematic Sampling
• Convenient and relatively
easy to administer
• Population elements are an
ordered sequence (at least,
conceptually).
• The first sample element is
selected randomly from the
first k population elements.
• Thereafter, sample elements
are selected at a constant
interval, k, from the ordered
sequence frame.
k = N / n, where:
n = sample size
N = population size
k = size of selection interval
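A minimal Python sketch of systematic selection, with a random start among the first k elements followed by every k-th element, might look like this. The function name `systematic_sample` and the values N = 1000 and n = 50 are assumptions for illustration, and rounding k down when N / n is not a whole number is one convention among several.

```python
import random

def systematic_sample(N, n, seed=None):
    """Select n unit numbers from 1..N at a constant interval k."""
    if not 1 <= n <= N:
        raise ValueError("n must be between 1 and N")
    k = N // n  # selection interval, rounded down (assumed convention)
    rng = random.Random(seed)
    start = rng.randint(1, k)  # random start among the first k elements
    # every k-th element after the start; the last one never exceeds N
    return [start + i * k for i in range(n)]

# Hypothetical frame: N = 1000 ordered units, n = 50, so k = 20
sample = systematic_sample(N=1000, n=50, seed=1)
print(sample[:5])
```

Only the starting point is random. If the ordered frame has a periodic pattern that lines up with k, the resulting sample can be unrepresentative.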
[Figure: Systematic sampling]
Stratified Random Sample
• Population is divided into nonoverlapping
subpopulations called strata
• A random sample is selected from each
stratum
• Potential for reducing sampling error
• Proportionate -- the percentage of the
sample taken from each stratum is
proportionate to the percentage that each
stratum is within the population
• Disproportionate -- proportions of the strata
within the sample differ from the
proportions of the strata within the
population
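Proportionate allocation can be sketched in Python as follows. The function name, the stratum labels, and the stratum sizes are hypothetical values chosen for illustration. Each stratum h contributes roughly n × N_h / N units, and rounding means the total can differ slightly from n when the shares are not whole numbers.

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """strata maps a stratum name to the list of its units.
    Returns a simple random sample from every stratum, sized in
    proportion to the stratum's share of the population."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / N)  # proportionate allocation
        sample[name] = rng.sample(units, n_h)  # random draw within stratum
    return sample

# Hypothetical population of 1000 people in three strata
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "rural": [f"R{i}" for i in range(300)],
    "remote": [f"M{i}" for i in range(100)],
}
result = proportionate_stratified_sample(strata, n=50, seed=7)
print({name: len(units) for name, units in result.items()})
```

With these sizes the 50 sampled units split 30, 15, and 5, matching the 60%, 30%, and 10% population shares. A disproportionate design would replace the computed `n_h` with sizes chosen by the investigator.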
Cluster Sampling
• Population is divided into nonoverlapping
clusters or areas
• Ideally, each cluster is a miniature, or
microcosm, of the population.
• A subset of the clusters is selected
randomly for the sample.
• If the number of elements in the subset of
clusters is larger than the desired value of
n, these clusters may be subdivided to
form a new set of clusters and subjected to
a random selection process.
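A one-stage version of this procedure can be sketched in Python. The function name `cluster_sample`, the village labels, and the cluster sizes are assumptions for illustration. The sketch selects whole clusters at random and takes every element in them; it does not implement the optional subdivision step.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """clusters maps a cluster name to the list of its elements.
    Randomly selects m clusters and returns their names together
    with every element they contain."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # random subset of clusters
    elements = [e for name in chosen for e in clusters[name]]
    return chosen, elements

# Hypothetical population: five villages of different sizes
clusters = {
    "Village A": list(range(0, 40)),
    "Village B": list(range(40, 100)),
    "Village C": list(range(100, 130)),
    "Village D": list(range(130, 200)),
    "Village E": list(range(200, 250)),
}
chosen, elements = cluster_sample(clusters, m=2, seed=3)
print(chosen, len(elements))
```

Only a list of clusters is needed up front, not a list of every individual, which is what makes the method workable when no full sampling frame exists.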
Cluster Sampling
Advantages
• More convenient for geographically dispersed
populations
• Reduced travel costs to contact sample elements
• Simplified administration of the survey
• Usable when the lack of a sampling frame rules
out other random sampling methods
Disadvantages
• Statistically less efficient when the cluster
elements are similar
• Costs and problems of statistical analysis are
greater than for simple random sampling
[Figure: Cluster sampling, with the population divided into Sections 1 to 5]
Difference Between Strata and
Clusters
• Although strata and clusters are both
nonoverlapping subsets of the population, they
differ in several ways.
• All strata are represented in the sample; but
only a subset of clusters are in the sample.
• With stratified sampling, the best survey results
occur when elements within strata are
internally homogeneous. However, with cluster
sampling, the best results occur when elements
within clusters are internally heterogeneous.
Nonrandom Sampling
• Convenience Sampling: sample elements
are selected for the convenience of the
researcher
• Judgment Sampling: sample elements are
selected by the judgment of the researcher
• Quota Sampling: sample elements are
selected until the quota controls are satisfied
• Snowball Sampling: survey subjects are
selected based on referral from other survey
respondents