Lecture 10

Sampling
Population:
A population is the collection of all items in which an investigator is interested.
Two types of Population
Target Population:
The complete collection of observations we want to study. Defining the target
population is an important and often difficult part of the study.
Sampled Population:
The collection of all possible observation units that might have been chosen in a
sample; that is, the population from which the sample was actually taken. It is also
called the survey population. The sampled population is usually more restricted than
the target population. The chief reasons for the difference between the target
population and the sampled population are non-response and non-coverage.

Sampling Unit:
A sampling unit or simply unit is a well-defined, distinct, and identifiable element
or group of elements on which observation can be made.
Sample:
A sample is a part of the population selected for study. It is a collection of sampling
units that is hoped to be representative of the total population that one desires to study.

Sample Size:
Sample size is the number of units contained in a sample.
Population Size:
Population size is the number of units, which constitutes the population.

Random Sample:
Any sample selected by a chance mechanism with known chances of selection is
called a random sample.

Sampling Frame:
A sampling frame is the list of units or group of units of the population to be sampled,
organized and arranged in such a way that every unit occurs once and only once in
the list and no unit is excluded from the list.

Sample Survey:
The technique of collecting information from a portion of the population is called a
sample survey.

Representative Sample:
A sample that represents the characteristics of the population as closely as possible
is called a representative sample.

Sampling
Sampling is a statistical procedure concerned with the selection of individual
observations; it helps us to make statistical inferences about the population.

Random sampling:
In random sampling, every individual unit in the population has an equal probability
of being selected into the sample.

Types of Sampling
1. Probability sampling and
2. Non-probability sampling
Probability Sampling

In probability sampling, each element in the population has a known and non-zero
probability of being included in the sample. As a result, selection bias can be
avoided and statistical theory can be used to derive the properties of the
estimators.

Types of Probability Sampling Method

1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling

4. Cluster Sampling

Non-Probability Sampling

Non-probability sampling is a sampling technique in which some elements of the
population have no chance of being selected into the sample, or in which the
selection probabilities are unknown. It is also called non-representative sampling.

Types of Non-Probability Sample Method

1. Quota Sampling
2. Convenience Sampling
3. Judgment Sampling
4. Purposive Sampling
5. Snowball Sampling

Simple Random Sampling


Situation When Simple Random Sampling is Appropriate
Simple random sampling is appropriate when the population size is not too large,
a sampling frame is available, and the population units are homogeneous with respect to
the characteristics of interest.
Definition of Simple Random Sampling
Simple random sampling is a method of selecting n elements from a population of
N elements in such a way that each combination of n elements has the same
chance of being selected as every other combination.
Suppose there are N units in the population and we need a random sample of n (n < N)
units. The number of different combinations of n units that can be selected
from N units is ${}^{N}C_{n} = \binom{N}{n}$. If each possible sample is selected with
equal probability $1/{}^{N}C_{n}$, then the sampling is called simple random sampling.
Example: Consider a population of N = 3 units from which we want to select a
sample of size n = 2. The number of possible samples is ${}^{3}C_{2} = 3$. Assume that the
population elements are 2, 4, and 6. The possible samples are (2,4), (2,6), and (4,6).
If each of these samples is selected with probability 1/3, the procedure is simple
random sampling.
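
The example above can be checked with a short Python sketch. It only enumerates the possible samples for the population 2, 4, 6 and attaches the equal selection probability; the variable names are illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6]   # the N = 3 population elements from the example
n = 2                    # sample size

# Every unordered sample of n distinct units.
samples = list(combinations(population, n))

# Under simple random sampling each possible sample is equally likely.
prob = Fraction(1, len(samples))

for s in samples:
    print(s, prob)
```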

Types of Simple Random Sampling
1. Simple Random Sampling with Replacement and
2. Simple Random Sampling Without Replacement.
Simple Random Sampling with Replacement
If a unit is selected, noted, and then returned to the population before the next
draw is made, and this procedure is repeated n times, the result is a simple
random sample of n units. This procedure is called simple random sampling
with replacement.

Simple Random Sampling without Replacement


In sampling without replacement, a selected unit is not returned to the population,
so each unit can appear in the sample at most once. For example, if one draws a
simple random sample such that no unit occurs more than once in the sample, the
sample is drawn without replacement.
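
The difference between the two schemes can be illustrated with Python's standard `random` module. This is a minimal sketch: the population of units labelled 1 to 10, the sample size, and the seed are assumptions chosen only for illustration.

```python
import random

population = list(range(1, 11))  # hypothetical units labelled 1..10
n = 4
rng = random.Random(42)          # fixed seed only so the run is repeatable

# With replacement: n independent draws, so a unit may appear more than once.
srswr = rng.choices(population, k=n)

# Without replacement: n distinct units.
srswor = rng.sample(population, n)

print("SRSWR :", srswr)
print("SRSWOR:", srswor)
```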

Methods of Selecting Simple Random Sample


To ensure randomness in the selection and to make the sample a representative
part of the population, the method of selection must be independent of human
judgement as far as possible. We can achieve this by choosing the sample with one
of the following three methods:
(1) Lottery Method
(2) Random Number Table Method and
(3) Random Number Generator Software.
Lottery Method
This is a very popular, simple, and old method of taking a random sample. Under this
method, all items of the population are numbered or named on separate slips of
paper of identical size, color, and shape. The slips are then folded and mixed up in
an urn. A blindfold selection is then made of the number of slips required to
constitute the desired sample size. The selection of items thus depends entirely
on chance.

Random Number Table Method
Several random number tables are available, including those of (i) Kendall and
Babington Smith, (ii) Fisher and Yates, (iii) Tippett, (iv) Snedecor and Cochran, and
(v) the RAND Corporation.
The following steps should be followed to draw a simple random sample of n units
from a population of N units using a random number table.
1. Number the N units in the population from 1 to N.
2. Decide on the random number table to be used.
3. Starting from any point in the table, read a random number with as many
digits as N has.

Rejection Method
4. If this random number is between 1 and N, it identifies the first selected
unit.
5. Move on to the next random number, reading vertically, horizontally, or in
any other systematic direction; if it is between 1 and N, it identifies the
second unit.
6. If at any stage of the selection the random number read is 0 or exceeds N,
discard it and read the next random number.

Remainder Method
The rejection method may involve a large number of rejections, since every random
number that is greater than N or equal to 0 is discarded. To reduce the rejection
rate, we can use the remainder method.
4. If this random number is between 1 and N, it identifies the first selected unit.
5. Move on to the next random number, reading vertically, horizontally, or in any
other systematic direction, to choose the second unit.
6. If at any stage of the selection the random number read exceeds N,
divide it by N and take the remainder as the selected unit number; a
remainder of 0 is taken as unit N. To keep the selection probabilities
equal, random numbers above the largest multiple of N that the chosen
number of digits allows are usually still rejected.

7. The process stops once we arrive at our desired sample size.


8. In simple random sampling with replacement, every random number is
accepted even if it is repeated. In simple random sampling without
replacement, repeated numbers are discarded and further numbers are drawn.
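
The two table-reading procedures can be sketched in Python. This is an illustration under stated assumptions, not a prescribed implementation: a pseudo-random generator stands in for the printed table, the function names `random_numbers` and `select_units` are hypothetical, and the remainder method rejects numbers above the largest usable multiple of N so that every unit stays equally likely.

```python
import random

def random_numbers(digits, rng):
    """Stand-in for reading a random number table: integers 0 .. 10**digits - 1."""
    while True:
        yield rng.randrange(10 ** digits)

def select_units(N, n, rng, method="rejection", replace=False):
    """Select n unit numbers from 1..N by the rejection or remainder method."""
    if not replace and n > N:
        raise ValueError("cannot draw more than N distinct units")
    digits = len(str(N))                  # read numbers with as many digits as N
    limit = (10 ** digits - 1) // N * N   # largest multiple of N that can be read
    chosen = []
    for r in random_numbers(digits, rng):
        if method == "rejection":
            if r == 0 or r > N:
                continue                  # discard 0 and anything above N
            unit = r
        else:                             # remainder method
            if r == 0 or r > limit:
                continue                  # keeps every unit equally likely
            unit = r % N or N             # a remainder of 0 stands for unit N
        if not replace and unit in chosen:
            continue                      # without replacement: skip repeats
        chosen.append(unit)
        if len(chosen) == n:
            return chosen

rng = random.Random(7)  # fixed seed only so the run is repeatable
print(select_units(40, 5, rng, method="rejection"))
print(select_units(40, 5, rng, method="remainder"))
```

With N = 40, two-digit numbers are read; the rejection method discards 00 and 41 to 99, while the remainder method discards only 00 and 81 to 99, which is why it wastes fewer draws.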

Random Number Generator Software

Statistical software such as Excel, R, and SAS has built-in functions for drawing
a simple random sample with or without replacement.
Advantages of Simple Random Sampling
1. Lack of Bias: It is a fair method of sampling and, if applied appropriately, it
helps to reduce bias.
2. Simplicity: This is a very basic method of collecting data. Little technical
knowledge is required to draw a simple random sample.
Disadvantages of Simple Random Sampling
1. Costly and Time Consuming:
(a) It is a costly method of sampling as it requires a complete list of all
elements in the population (sampling frame).
(b) Sampled individuals may be so widely dispersed that visiting each selected
individual may be extremely expensive and time consuming.
2. Bias: Certain sub-groups in the population may be totally overlooked or may
be over-represented in the sample purely by chance. In either case, the sample
may give a distorted picture of the population.
Sample Size Determination
Sample Size for Estimating Population Proportion
For a large population, the size of a representative sample for estimating a proportion is

$$n_0 = \frac{z_{\alpha/2}^{2}\, p q}{d^{2}}$$

where p is the estimated proportion of an attribute that is present in the population,
q = 1 - p,
$z_{\alpha/2}$ is the abscissa of the normal curve that cuts off an area $\alpha/2$ in the right tail,
and d is the desired level of precision.


If the population is small, the sample size can be reduced slightly to

$$n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}$$

where n is the sample size and N is the population size.
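
As a worked illustration of the two formulas for a proportion, the following Python sketch computes the sample size and rounds it up to a whole unit. The function name and the input values (p = 0.5, d = 0.05, 95% confidence, N = 1000) are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, d, alpha=0.05, N=None):
    """Sample size for estimating a proportion p to within precision d."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    n0 = z ** 2 * p * (1 - p) / d ** 2        # large-population formula
    if N is not None:
        n0 = n0 / (1 + (n0 - 1) / N)          # reduction for a small population
    return math.ceil(n0)                      # round up to a whole unit

print(sample_size_proportion(0.5, 0.05))           # large population
print(sample_size_proportion(0.5, 0.05, N=1000))   # population of 1000 units
```

Here the large-population size comes to 385, and the adjustment for N = 1000 lowers it to 278.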


Sample Size for Estimating Population Mean
For Prespecified Population Variance
For large N, the first approximation of n is

$$n_0 = \frac{z_{\alpha/2}^{2}\, \sigma^{2}}{d^{2}}$$

where $\sigma^2$ is the population variance (which is prespecified).

If N is not large compared to $n_0$, the sample size is adjusted to

$$n = \frac{n_0}{1 + \dfrac{n_0}{N}}$$

For Prespecified Cost


Suppose an amount of money C is allotted for collecting n observations. Let $C_0$ be the
overhead cost and $C_1$ be the cost of collecting an observation on one sample unit. Then
the total cost C can be expressed as

$$C = C_0 + n C_1$$

or

$$n = \frac{C - C_0}{C_1}$$
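
The two rules for a mean (prespecified variance and prespecified cost) can be sketched in Python as follows. The function names and the numerical inputs are hypothetical. The variance-based size is rounded up so the precision target is met, and the budget-based size is rounded down so the budget is not exceeded.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, d, alpha=0.05, N=None):
    """Sample size for estimating a mean to within d, given the population SD sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    n0 = (z * sigma / d) ** 2                 # first approximation for large N
    if N is not None:
        n0 = n0 / (1 + n0 / N)                # adjustment when N is not large
    return math.ceil(n0)

def sample_size_for_budget(C, C0, C1):
    """Largest n that satisfies C0 + n*C1 <= C."""
    return math.floor((C - C0) / C1)

print(sample_size_mean(sigma=10, d=2))               # large population
print(sample_size_mean(sigma=10, d=2, N=500))        # population of 500 units
print(sample_size_for_budget(C=10000, C0=2000, C1=50))
```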

Stratified Sampling

When is Stratified Sampling Used?

➢ If the population is heterogeneous with respect to the characteristic under
study, then adopting stratified sampling gives a more representative
sample.

Definition of Stratified Sampling
Stratified random sampling is a sampling plan in which the population is divided
into several non-overlapping strata and a random sample is selected from each
stratum. The strata are formed so that units within a stratum are homogeneous
while the strata differ from one another.
• Strata are generally formed on the basis of some known characteristic of the
population that is believed to be related to the variable of interest.

How to Perform Stratified Sampling


The process for performing stratified sampling is as follows:

➢ Divide the whole heterogeneous population into several smaller distinct, non-
overlapping subpopulations, or strata, such that the sampling units are
homogeneous with respect to the characteristic under study within the strata
and heterogeneous with respect to the characteristic under study
between/among the strata.
➢ Treat each stratum as separate population and draw a sample by simple
random sampling from each stratum.
➢ The sample from each stratum may be proportionate to the stratum size or
equal in size across strata. A proportionate stratified sample is achieved if the
sampling fraction is the same for every stratum. Under this design, the sample
size in a stratum is proportional to the size of the population in the stratum.
➢ For each individual stratum, stratum mean, proportion, variance, and other
characteristics are computed.
➢ These estimates are then properly weighted to form a combined estimate for
the entire population.
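
The steps above can be sketched in Python for the case of estimating a population mean with a proportionate sample. This is a minimal sketch under stated assumptions: the strata and their values are made up, `stratified_mean` is a hypothetical helper, and each stratum mean is weighted by the stratum's share of the population, $N_i / N$.

```python
import random
from statistics import mean

def stratified_mean(strata, n, rng):
    """Proportionate stratified sample; returns the weighted mean and stratum sample sizes."""
    N = sum(len(values) for values in strata.values())
    estimate, sizes = 0.0, {}
    for name, values in strata.items():
        N_i = len(values)
        # Same sampling fraction n/N in every stratum (at least 1 unit, at most N_i).
        n_i = min(N_i, max(1, round(n * N_i / N)))
        sample = rng.sample(values, n_i)          # SRS without replacement in the stratum
        estimate += (N_i / N) * mean(sample)      # weight stratum mean by N_i / N
        sizes[name] = n_i
    return estimate, sizes

# Hypothetical population: two strata with different levels of the study variable.
strata = {
    "urban": [52, 55, 49, 61, 58, 50, 57, 54, 60, 53, 56, 51],
    "rural": [31, 28, 35, 30, 33, 29, 32, 34],
}
estimate, sizes = stratified_mean(strata, n=5, rng=random.Random(0))
print(sizes, round(estimate, 2))
```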

Principles of Stratification

➢ The strata should be non-overlapping and exhaustive so that together they
comprise the whole population. The strata should be made as homogeneous as
possible within strata and heterogeneous between strata.
➢ Strata are to be formed on the basis of some known characteristics of the
population, which are believed to have some relationship with the subject of
inquiry and variables of interest.
➢ When stratification with respect to the characteristics under study becomes
difficult for practical reasons, administrative convenience may be considered
as the basis for forming the strata.

➢ With a view to improving the sampling design, strata should be formed on the
basis of natural characteristics as far as possible.
➢ Past data, intuition, expert judgment, or preliminary findings from pilot
surveys may also be used to set up the strata. This, however, requires that we
have prior knowledge of the nature of the population from which we are
sampling.

The principal reasons for using stratified random sampling rather than simple
random sampling include:

1. Stratification may produce a smaller error of estimation than would be
produced by a simple random sample of the same size. This result is
particularly true if measurements within strata are very homogeneous.
2. The cost per observation in the survey may be reduced by stratification of the
population elements into convenient groupings.
3. Estimates of population parameters may be desired for subgroups of the
population. These subgroups should then be identified.

Steps Involved in Stratified Sampling


1. Choice of stratification variable,
2. Formation of strata,
3. Number of strata,
4. Sampling within the strata, and
5. Allocation of sample to strata.

Advantages:
➢ The stratification can improve the efficiency of estimation under appropriate
conditions.
➢ The stratification may be administratively convenient and facilitate the
drawing of a sample.
➢ We may want to estimate characteristics of the separate strata as well as of
the overall population.

Disadvantages:
➢ Requires ancillary information
➢ Can be more time consuming to plan and implement

Notations
We use the following symbols and notations:
N : Population size
k : Number of strata
$N_i$ : Number of sampling units in the ith stratum
$N = \sum_{i=1}^{k} N_i$ = Total population size
$n_i$ : Number of sampling units to be drawn from the ith stratum
$n = \sum_{i=1}^{k} n_i$ = Total sample size

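[Figure: The population of N units is divided into Stratum 1, Stratum 2, ..., Stratum k,
containing $N_1, N_2, \ldots, N_k$ units, with $N = \sum_{i=1}^{k} N_i$. Sample 1, Sample 2, ...,
Sample k of $n_1, n_2, \ldots, n_k$ units are drawn from the respective strata, with
$n = \sum_{i=1}^{k} n_i$.]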

Allocation Problem and Choice of Sample Sizes in Different Strata
There are two aspects of choosing the sample sizes:
i. Minimize the cost of survey for a specified precision
ii. Maximize the precision for given cost.
➢ The sample size cannot be determined by minimizing both cost and variability
simultaneously. Cost increases with the sample size, whereas variability
decreases as the sample size increases.

Equal Allocation

Choose the sample size $n_i$ to be the same for all the strata. Let n be the sample
size and k be the number of strata; then

$$n_i = \frac{n}{k} \quad \text{for all } i = 1, 2, \ldots, k.$$

Proportional Allocation

For fixed k, select $n_i$ such that it is proportional to the stratum size $N_i$, i.e.,

$$n_i \propto N_i, \qquad n_i = \left(\frac{n}{N}\right) N_i$$

Neyman or Optimum Allocation

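This allocation considers the size of the strata as well as the variability within them:

$$n_i \propto N_i S_i \quad \text{and} \quad n_i = \frac{n N_i S_i}{\sum_{i=1}^{k} N_i S_i}$$

where $S_i$ is the standard deviation of the study variable in the ith stratum.
This allocation arises when $Var(\bar{y}_{st})$ is minimized subject to the constraint
that $\sum_{i=1}^{k} n_i$ is prespecified.
➢ The optimum allocation has a limitation: the values of $S_i$ must be known in
order to determine $n_i$.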
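
The three allocation rules can be compared with a short Python sketch. The stratum sizes and standard deviations below are made-up values, and the function names are illustrative. The results are left unrounded; in practice each $n_i$ would be rounded to a whole number of units.

```python
def equal_allocation(n, k):
    """n_i = n / k for every stratum."""
    return [n / k] * k

def proportional_allocation(n, sizes):
    """n_i = (n / N) * N_i."""
    N = sum(sizes)
    return [n * N_i / N for N_i in sizes]

def neyman_allocation(n, sizes, sds):
    """n_i = n * N_i * S_i / sum(N_i * S_i)."""
    weights = [N_i * S_i for N_i, S_i in zip(sizes, sds)]
    total = sum(weights)
    return [n * w / total for w in weights]

sizes = [500, 300, 200]   # hypothetical stratum sizes N_i
sds = [10, 20, 30]        # hypothetical stratum standard deviations S_i
n = 100

print(equal_allocation(n, len(sizes)))
print(proportional_allocation(n, sizes))
print([round(x, 2) for x in neyman_allocation(n, sizes, sds)])
```

In this illustration the smallest stratum receives the fewest units under proportional allocation but more than the largest stratum under Neyman allocation, because its variability is highest.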

Cluster Sampling
Situations in which cluster sampling is used
In many practical situations and many types of populations, a list of elements is not
available and so the use of an element as a sampling unit is not feasible. The method
of cluster sampling or area sampling can be used in such situations.
Definition: Any sampling procedure assumes that the population can be divided
into a finite number of distinct and identifiable units, called sampling units. The
smallest units into which the population can be divided are called elements of the
population. Groups of such elements are called clusters.
In cluster sampling
• Divide the whole population into clusters according to some well-defined
rule.
• Treat the clusters as sampling units.
• Choose a sample of clusters according to some procedure.
• Carry out a complete enumeration of the selected clusters, i.e., collect
information on all the elements in the selected clusters.
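
The procedure above can be sketched in Python. The village names and values are hypothetical, `cluster_sample` is an illustrative helper, and the clusters are selected by simple random sampling without replacement, which is only one possible selection procedure.

```python
import random

def cluster_sample(clusters, n, rng):
    """Select n clusters by SRSWOR, then enumerate every element in them."""
    chosen = rng.sample(sorted(clusters), n)              # clusters are the sampling units
    observations = [y for c in chosen for y in clusters[c]]  # complete enumeration
    return chosen, observations

# Hypothetical population: villages (clusters) and household values (elements).
clusters = {
    "village_A": [4, 6, 5],
    "village_B": [7, 3],
    "village_C": [5, 5, 6, 4],
    "village_D": [8, 2, 6],
}
chosen, observations = cluster_sample(clusters, n=2, rng=random.Random(0))
print(chosen, observations)
```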
Area Sampling
When the entire area containing the population is subdivided into smaller area
segments and each element in the population is associated with one and only one
such area segment, the procedure is called area sampling.
Conditions Under Which Cluster Sampling is Used
Cluster sampling is preferred when
i. No reliable listing of elements is available and it is expensive to prepare one.
ii. Even if a list of elements is available, the location or identification of the
units may be difficult.
A necessary condition for the validity of this procedure is that every unit
of the population under study must correspond to one and only one
cluster, so that the sampling units in the frame cover all the units of the
population under study without any omission or duplication. When this
condition is not satisfied, bias is introduced.

Construction of Clusters:
The clusters are constructed such that the sampling units are heterogeneous within the
clusters and homogeneous among the clusters. This is the opposite of the construction of
strata in stratified sampling. There are two options for constructing the clusters:
equal size and unequal size.

Case of Equal Clusters


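Layout of NM Population Elements in Clusters

                                    Clusters
Elements        1            2            3            ...   i            ...   N
1               $y_{11}$     $y_{21}$     $y_{31}$     ...   $y_{i1}$     ...   $y_{N1}$
2               $y_{12}$     $y_{22}$     $y_{32}$     ...   $y_{i2}$     ...   $y_{N2}$
3               $y_{13}$     $y_{23}$     $y_{33}$     ...   $y_{i3}$     ...   $y_{N3}$
...
j               $y_{1j}$     $y_{2j}$     $y_{3j}$     ...   $y_{ij}$     ...   $y_{Nj}$
...
M               $y_{1M}$     $y_{2M}$     $y_{3M}$     ...   $y_{iM}$     ...   $y_{NM}$

Cluster total   $y_1$        $y_2$        $y_3$        ...   $y_i$        ...   $y_N$
Cluster mean    $\bar{y}_1$  $\bar{y}_2$  $\bar{y}_3$  ...   $\bar{y}_i$  ...   $\bar{y}_N$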

• Suppose the population is divided into N clusters and each cluster is of size
M.
• Select a sample of n clusters from the N clusters by simple random sampling
(SRS), generally without replacement (WOR).
Case of Unequal Clusters
In practice, clusters of equal size are available only when planned. For example,
in a screw manufacturing company, the packets of screws can be prepared such that
every packet contains the same number of screws. In real applications, it is hard to get
clusters of equal size. For example, villages with equal areas are difficult to find,
districts with the same number of persons are difficult to find, and the number of
members in a household may not be the same in each household in a given area.

[Figure: Different stages of cluster sampling]

Systematic Sampling
• The systematic sampling technique is operationally more convenient than the
simple random sampling.
• It also ensures at the same time that each unit has equal probability of
inclusion in the sample.
• In this method of sampling, the first unit is selected with the help of random
numbers and the remaining units are selected automatically according to a
predetermined pattern.

Types of Systematic Sampling
1. Linear systematic sampling
2. Circular systematic sampling
Linear Systematic Sampling
• Systematic Sampling (SYS), like SRS, involves selecting n sampling units
from a population of N units
• Instead of randomly choosing the n units in the sample, a skip pattern is run
through a list (frame) of the N units to select the sample
• The skip or sampling interval, k=N/n
Selection Procedure
To draw a sample of size n,
1. Form a sequential list of population units
2. Decide on a sample size n and compute the skip (sampling interval), k=N/n
3. Choose a random number, r (random start) between 1 and k (inclusive)
4. Add k to the random start to select the second unit, and continue to
add k to the previously selected unit number until the remainder of the
sample is selected. The selected units are r, r + k, r + 2k, ..., r + (n - 1)k.
So, the first unit is selected at random and the other units are selected systematically.
This systematic sample is called the kth systematic sample and k is termed the sampling
interval. This is also known as linear systematic sampling.
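
The selection procedure can be sketched in Python. This sketch assumes N = nk exactly (the case where N is not a multiple of n needs one of the adjustments discussed separately), and the function name and inputs are illustrative.

```python
import random

def linear_systematic_sample(N, n, rng):
    """Return the unit numbers r, r+k, ..., r+(n-1)k with a random start r in 1..k."""
    if N % n != 0:
        raise ValueError("this sketch assumes N = n * k exactly")
    k = N // n                 # sampling interval
    r = rng.randint(1, k)      # random start, 1 <= r <= k (inclusive)
    return [r + j * k for j in range(n)]

print(linear_systematic_sample(N=20, n=5, rng=random.Random(3)))
```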
What to do when N≠nk
One of the following possible procedures may be adopted when N≠nk :
(i) Drop one unit at random if sample has (n + 1) units.
(ii) Eliminate some units so that N = nk.
(iii) Adopt circular systematic sampling scheme.
(iv) Round off the fractional interval k.
Circular Systematic Sampling
Selection procedure
1. Determine the interval k by rounding N/n down to an integer (if
N = 15 and n = 4, then k is taken as 3 and not 4).
2. Take a random start between 1 and N.
3. Skip through the circle by k units each time to select the next unit until n
units are selected.
4. Thus, there could be N possible distinct samples instead of k.
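
A minimal Python sketch of the circular scheme follows. It treats the list as a circle by wrapping unit numbers past N back to 1; the function name is illustrative, and k is taken as N/n rounded down, as in the procedure above.

```python
import random

def circular_systematic_sample(N, n, rng):
    """Select n units from 1..N, stepping k = N // n around the list as a circle."""
    k = N // n                  # interval, rounded down (N = 15, n = 4 gives k = 3)
    r = rng.randint(1, N)       # random start anywhere in 1..N
    # Wrap around: after unit N the count continues from unit 1.
    return [(r - 1 + j * k) % N + 1 for j in range(n)]

print(circular_systematic_sample(N=15, n=4, rng=random.Random(5)))
```

For instance, with N = 15, n = 4, and a random start of 14, the selected units are 14, 2, 5, and 8.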

Advantages of systematic sampling:


1. It is easier to draw a sample and often easier to execute it without mistakes.
2. The cost is low and the selection of units is simple. Much less training is
needed for surveyors to collect units through systematic sampling.
3. The systematic sample is spread more evenly over the population, so no
large part of the population fails to be represented and the sample gives a
better cross-section. However, systematic sampling performs poorly if the
list contains too many blanks.
4. It is likely to be more efficient than SRSWOR, particularly when the list is
ordered by a characteristic related to the variable of interest.

Disadvantages of systematic sampling:


1. Requires complete list of the population.
2. A bad arrangement of the units may produce a very inefficient sample.
