Module 3 Notes
Module 3: Measurement
Meaning
Measurement is the process of observing and recording the observations that are collected as part
of a research effort. Measurement is the foundation of all scientific investigation. It may be
defined as the assignment of numbers to characteristics of objects or events according to rules.
It is important to note that it is the characteristics of objects or events that are measured, not
the objects or events themselves. When measurement is defined as "the assignment of numbers
to objects", it means, for example, measuring personality characteristics by assigning a number
(a score on a test) to an object (a person). For example, we can measure buyers' preferences,
income and attitudes, or a product's weight. A businessman engaged in marketing products
is interested in measuring the market potential for a new product and buyers' attitudes, perceptions
or preferences towards a new brand. This measurement process provides
meaningful information for decision making.
Concept of Measurement
Measurement is the process by which an organization observes and records the observations
that have been gathered as a result of research activities. In other words, measurement
can be defined as "the process of mapping the aspects of one domain onto another domain
according to rules". For example, researchers may wish to measure the percentage of people who
purchase products from their company. For measurement, a scale must be developed whose
range refers to a particular set, in the sense of set theory; after this, the observations are
mapped onto the defined scale.
What is Measured?
1) Direct Observables:
The things which can be directly observed are called direct observables. For example, when
meeting an individual, the brand of his/her wristwatch can be directly observed.
2) Indirect Observables:
The things which cannot be directly observed are called 'indirect observables'. More complex
and refined observation effort is required to observe such things. For example, minutes of
earlier board meetings of corporations can be used to observe past business decisions.
3) Constructs:
The things which cannot be observed directly or indirectly are called 'constructs'. These are the
theoretical concepts, which are developed by observing different aspects of an operation. For
example, IQ is known as a construct. It cannot be directly or indirectly observed. It is
determined only by mathematically analysing the answers to different test questions asked in an
IQ test.
Process of Measurement
After successfully categorizing the events, the next step in the measurement process is selecting
an appropriate calculation method for the different behavioral categories. The different
calculation methods are as follows:
a) Frequency Method: In this method, frequency counts are used for calculations. The
number of occurrences of a particular event in a definite time period is called its
frequency count. Behaviors or events which occur a number of times within a definite
time period, occur for short durations, or have a sharp beginning and ending, are
calculated with the help of the frequency method (a brief code sketch follows).
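As a minimal sketch of a frequency count (the observed events below are hypothetical, not taken from these notes), occurrences of each behavior within an observation period can be tallied directly:

```python
from collections import Counter

# Hypothetical log of behaviors observed during a one-hour observation period
observed_events = ["hand_raised", "question_asked", "hand_raised",
                   "left_seat", "hand_raised", "question_asked"]

# Frequency count: number of occurrences of each event in the period
frequency = Counter(observed_events)
print(frequency)                   # Counter({'hand_raised': 3, 'question_asked': 2, 'left_seat': 1})
print(frequency["hand_raised"])    # 3
```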
The final step in the measurement process is using multiple observations so as to measure inter-
rater reliability. The observations used in the measurement process are as follows:
b) Participant Observation: In this type of observation, the researcher joins the group
of participants as an individual participant and thereby observes their behavior.
Validity
Validity is the most important criterion and measures the degree to which the instrument
measures what it is required to measure. It can also be considered the utility of the
instrument. It measures the extent to which the differences in the test measurements reflect
actual differences. For example, if a researcher is trying to examine the motivation level of
employees, then he needs to look into a variety of other factors as well. If he considers
only absenteeism, then it is not a valid measure, as absenteeism may also happen due to other
causes like illness, personal reasons, family reasons, etc.
1) Face Validity:
Face validity is the easiest among the various types of validity. It looks at a single item on
a test, or at all the items, and assesses how well the item expresses the meaning of the test.
For example, the test item "I think I should buy a car" has face validity, as the item clearly
measures the intention to purchase a car. The downside of face validity is that
respondents can often hide or exaggerate their responses, so that the responses
become manipulated. In fact, many psychometricians prefer tests which do not have face validity
but do have general validity. Test items which measure what they are supposed to measure but have no
face validity are more difficult for respondents to manipulate. Although items which do
not have face validity have some good features, in the long run it is better if the test items
have some face validity.
2) Content Validity:
Content validity refers to how adequately the selected variables measure what is required.
In other words, the scale that is being used should contain all the required variables. For example,
if a researcher wants to test the facilities of a hotel and includes variables like locality,
number of old customers, number of new customers, turnover, etc., then it is clear that this
scale will not have content validity, as these variables are not adequate to answer the research
objective. Instead, the researcher should include variables like the ambience, staff, food,
cleanliness, maintenance, medical facilities, etc. The selection and choice of research variables
to be included in the scale for the purpose of the research activity is a very difficult
task.
3) Criterion Validity:
Criterion validity measures how well the data collected by the scale employed in the
research work corresponds with the criterion variables. Criterion variables can be
demographic and psychographic data, attitude and behavior variables, or measures
derived from other scales. Criterion validity can be of two types, depending on the time period of
assessment:
i) Concurrent Validity: In concurrent validity, the researcher gathers data on the scale
and the criterion variable at the same point in time.
ii) Predictive Validity: In predictive validity, the researcher gathers data on the scale
and the criterion variable at different points in time.
4) Construct Validity:
In construct validity, the researcher tries to examine the characteristic or construct that the scale
is measuring. When measuring construct validity, the researcher tries to answer questions such as
how the scale works and what kind of conclusions can be made regarding the research being
carried out. Construct validity is the most difficult and sophisticated kind of validity. It includes
the following:
i) Convergent Validity : Convergent validity measures how well the scale converges
or correlates with other measures of the same construct.
ii) Discriminant Validity: This measures the opposite of convergent validity. It
measures how distinct the scale is from measures of different constructs; that is, it
seeks to show the lack of correlation between measures of different constructs.
iii) Nomological Validity: Nomological validity measures the degree to which the
scale correlates with measures of different but related constructs in a theoretically
predictable way. In this method, a theoretical model is developed which leads to
further tests, deductions and inferences. This results in the construction of a
nomological net, in which several constructs can be related to each other.
Issues of precision and the practical use of research work are the main concerns for several
researchers. They are curious about the contribution of their research work to the concerned
field. Therefore, they evaluate their work in the light of two main factors, i.e., reliability and
validity. These reliability and validity issues need to be addressed very carefully; otherwise they
may lead to defective statistical decisions and errors (Type I and Type II).
Levels of Measurement
• Nominal: the data can only be categorized.
• Ordinal: the data can be categorized and ranked.
• Interval: the data can be categorized, ranked, and evenly spaced.
• Ratio: the data can be categorized, ranked, evenly spaced, and has a natural zero.
Depending on the level of measurement of the variable, what you can do to analyze your data
may be limited. There is a hierarchy in the complexity and precision of the level of
measurement, from low (nominal) to high (ratio).
Nominal Scale
A nominal scale is the 1st level of measurement scale, in which the numbers serve as “tags” or
“labels” to classify or identify the objects. A nominal scale usually deals with non-numeric
variables, or with numbers that do not carry any quantitative value.
• A nominal scale variable is classified into two or more categories. In this measurement
mechanism, the answer should fall into one of these classes.
• The numbers don’t define the object’s characteristics. The only permissible operation on
numbers in the nominal scale is “counting.”
Example:
M- Male
F- Female
Here, the variables are used as tags, and the answer to this question should be either M or F.
Ordinal Scale
The ordinal scale is the 2nd level of measurement that reports the ordering and ranking of data
without establishing the degree of variation between them. Ordinal represents the “order.”
Ordinal data is known as qualitative data or categorical data. It can be grouped, named and also
ranked.
• Along with the information provided by the nominal scale, ordinal scales give the
rankings of those variables.
• The surveyors can quickly analyse the degree of agreement concerning the identified
order of variables.
Example:
• Ratings in restaurants
• Frequency responses: Very often, Often, Not often, Not at all
• Agreement responses: Totally agree, Agree, Neutral, Disagree, Totally disagree
Interval Scale
The interval scale is the 3rd level of measurement scale. It is defined as a quantitative
measurement scale in which the difference between two values is meaningful. In other
words, the variables are measured in an exact manner, although the zero point of the scale is
arbitrary rather than absolute.
• The interval scale is quantitative, as it can quantify the difference between the values.
• To understand the difference between the variables, you can subtract one value from
another.
• The interval scale is widely used in statistics, as it allows numerical values to be assigned
to arbitrary assessments such as feelings, calendar dates, etc.
Example:
• Likert Scale
Ratio Scale
The ratio scale is the 4th level of measurement scale, which is quantitative. It is a type of
variable measurement scale. It allows researchers to compare the differences or intervals. The
ratio scale has a unique feature: it possesses a true origin or zero point.
• It affords unique opportunities for statistical analysis. The variables can be meaningfully
added, subtracted, multiplied and divided. Mean, median, and mode can be calculated using
the ratio scale.
• Ratio scale has unique and useful properties. One such feature is that it allows unit
conversions like kilogram – calories, gram – calories, etc.
Example:
• 55 – 75 kgs
• 76 – 85 kgs
• 86 – 95 kgs
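A brief, hedged sketch of how the four levels constrain analysis (the data values below are invented for illustration): counts and the mode suit nominal data, the median suits ordinal data, differences are meaningful on an interval scale, and ratios only make sense on a ratio scale.

```python
import statistics

gender = ["M", "F", "F", "M", "F"]          # nominal: only counting / mode are meaningful
ratings = [1, 2, 2, 3, 5]                   # ordinal: ranking / median (1 = "not at all", 5 = "very often")
temperature_c = [18.0, 21.5, 24.0]          # interval: differences meaningful, zero point arbitrary
weight_kg = [55, 76, 86]                    # ratio: true zero, so ratios are meaningful

print(statistics.mode(gender))              # 'F'   - mode is valid for nominal data
print(statistics.median(ratings))           # 2     - median is valid for ordinal data
print(temperature_c[1] - temperature_c[0])  # 3.5   - a meaningful interval-scale difference
print(weight_kg[1] / weight_kg[0])          # ~1.38 - a meaningful ratio-scale comparison
```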
Research population and sample serve as the cornerstones of any scientific inquiry. They hold
the power to unlock the mysteries hidden within data. Understanding the dynamics between
the research population and sample is crucial for researchers. It ensures the validity, reliability,
and generalizability of their findings. Here we examine the role of the research population and
the sample, and the differences between them that shape our understanding of complex
phenomena. Ultimately, this empowers researchers to draw informed conclusions and make
meaningful advances in their respective fields.
What Is Population?
The research population, also known as the target population, refers to the entire group or set
of individuals, objects, or events that possess specific characteristics and are of interest to the
researcher. It represents the larger population from which a sample is drawn. The research
population is defined based on the research objectives and the specific parameters or attributes
under investigation. For example, in a study on the effects of a new drug, the research
population would encompass all individuals who could potentially benefit from or be affected
by the medication.
In some situations, it is appropriate to collect data from the entire population rather than a
sample. When the research population is small or easily accessible, it may be feasible to collect data
from the entire population. This is often the case in studies conducted within specific
organizations, small communities, or well-defined groups where the population size is
manageable.
If the research focuses on a specific characteristic or trait that is rare and critical to the study,
collecting data from the entire population may be necessary. This could be the case in studies
related to rare diseases, endangered species, or specific genetic markers.
Certain legal or regulatory frameworks may require data collection from the entire population.
For instance, government agencies might need comprehensive data on income levels,
demographic characteristics, or healthcare utilization for policy-making or resource allocation
purposes.
In situations where a high level of precision or accuracy is necessary, researchers may opt for
population-level data collection. By doing so, they mitigate the potential for sampling error and
obtain more reliable estimates of population parameters.
What Is a Sample?
A sample is a subset of the research population that is carefully selected to represent its
characteristics. Researchers study this smaller, manageable group to draw inferences that they
can generalize to the larger population. The selection of the sample must be conducted in a
manner that ensures it accurately reflects the diversity and pertinent attributes of the research
population. By studying a sample, researchers can gather data more efficiently and cost-
effectively compared to studying the entire population. The findings from the sample are then
extrapolated to make conclusions about the larger research population.
Sampling refers to the process of selecting a sample from a larger group or population of
interest in order to gather data and make inferences. The goal of sampling is to obtain a sample
that is representative of the population, meaning that the sample accurately reflects the key
attributes, variations, and proportions present in the population. By studying the sample,
researchers can draw conclusions or make predictions about the larger population with a certain
level of confidence.
Collecting data from a sample, rather than the entire population, offers several advantages and
is often necessary due to practical constraints. Here are some reasons to collect data from a
sample:
1. Cost and Resource Constraints
Collecting data from an entire population can be expensive and time-consuming. Sampling
allows researchers to gather information from a smaller subset of the population, reducing costs
and resource requirements. It is often more practical and feasible to collect data from a sample,
especially when the population size is large or geographically dispersed.
2. Time Constraints
Conducting research with a sample allows for quicker data collection and analysis compared
to studying the entire population. It saves time by focusing efforts on a smaller group, enabling
researchers to obtain results more efficiently. This is particularly beneficial in time-sensitive
research projects or situations that necessitate prompt decision-making.
3. Manageability
Working with a sample makes data collection more manageable. Researchers can concentrate
their efforts on a smaller group, allowing for more detailed and thorough data collection
methods. Furthermore, it is more convenient and reliable to store and conduct statistical
analyses on smaller datasets. This also facilitates in-depth insights and a more comprehensive
understanding of the research topic.
4. Statistical Inference
Collecting data from a well-selected and representative sample enables valid statistical
inference. By using appropriate statistical techniques, researchers can generalize the findings
from the sample to the larger population. This allows for meaningful inferences, predictions,
and estimation of population parameters, thus providing insights beyond the specific
individuals or elements in the sample.
5. Ethical Considerations
In certain cases, collecting data from an entire population may pose ethical challenges, such as
invasion of privacy or burdening participants. Sampling helps protect the privacy and well-
being of individuals by reducing the burden of data collection. It allows researchers to obtain
valuable information while ensuring ethical standards are maintained.
The following steps are involved in selecting a sample:
1. Define the Target Population
Clearly define the target population for your research study. The population should encompass
the group of individuals, elements, or units that you want to draw conclusions about.
2. Create a Sampling Frame
Create a sampling frame, which is a list or representation of the individuals or elements in the
target population. The sampling frame should be comprehensive and accurately reflect the
population you want to study.
3. Choose a Sampling Method
Select an appropriate sampling method based on your research objectives, available resources,
and the characteristics of the population. You can perform sampling by either utilizing
probability-based or non-probability-based techniques. Common sampling methods include
random sampling, stratified sampling, cluster sampling, and convenience sampling.
4. Determine the Sample Size
Determine the desired sample size based on statistical considerations, such as the level of
precision required, desired confidence level, and expected variability within the population.
Larger sample sizes generally reduce sampling error but may be constrained by practical
limitations.
5. Collect Data
Once the sample is selected using the appropriate technique, collect the necessary data
according to the research design and data collection methods. Ensure that you use a standardized
and consistent data collection process that is also appropriate for your research objectives.
6. Analyze the Data
Perform the necessary statistical analyses on the collected data to derive meaningful
insights. Use appropriate statistical techniques to make inferences, estimate population
parameters, test hypotheses, or identify patterns and relationships within the data.
While the population provides a comprehensive overview of the entire group under study, the
sample, on the other hand, allows researchers to draw inferences and make generalizations
about the population. Researchers should employ careful sampling techniques to ensure that
the sample is representative and accurately reflects the characteristics and variability of the
population.
Sampling Frame
The sampling frame (also known as the “sample frame” or “survey frame”) is the actual
collection of units from which the sample is drawn. A basic random sample gives all
units in the frame an equal probability of being drawn and appearing in the sample. In the ideal
scenario, the sampling frame should match the population of interest.
It is a complete list or collection from which your sample participants will be drawn in a
predetermined manner. The list will be organized in some way. That is, each member of a
population will have an individual identity and a contact mechanism. This allows you to
categorize and code known information about segmentation features.
Collecting your sample requires that you have a source or list of all the individuals of the
target population from which to take a sample, as well as a process for selecting the sample.
Any resource that has the information needed to reach every individual in the targeted group
qualifies as a source.
One of the first steps in creating a research study is to define all the units (also known as
cases) you want to investigate. People, organizations, and existing records might all be
considered units. The population of research interest is made up of various units. It is critical
to be as detailed as possible when describing the population.
Be assertive when selecting lists! Make sure your sample frame is large enough for your
requirements. A decent sample frame for research on living conditions, for example, might
include:
• A file containing factual information that may be used to reach specific people.
Other considerations:
• Each member has a unique identification. This might be a short number code (e.g., from
1 to 3000).
• The list should be well organized. Sort them alphabetically for better access
• Information should be up to date. This might need to be examined regularly (e.g., for
address or contact number changes).
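A minimal sketch of such a sampling frame (the names, IDs, and e-mail addresses are hypothetical): every unit carries a unique identifier and a contact mechanism, the list is kept organized, and entries are updated as contact details change.

```python
# Hypothetical sampling frame: each member has a unique ID and a contact mechanism
sampling_frame = [
    {"id": 1, "name": "A. Kumar", "email": "a.kumar@example.com"},
    {"id": 2, "name": "B. Singh", "email": "b.singh@example.com"},
    {"id": 3, "name": "C. Rao",   "email": "c.rao@example.com"},
]

# Keep the list organized (here, alphabetically by name) for easier access
sampling_frame.sort(key=lambda unit: unit["name"])

# Keep the information up to date, e.g., correcting a contact address
sampling_frame[0]["email"] = "a.kumar@updated.example.com"
```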
A sample frame needs to be current and applicable. The sampling frame in statistics comes
in two different varieties. They consist of list frames and area frames. Typically, an area
frame is used to start the survey, followed by a list frame.
1 - Area frames
Frames for area sampling encompass a vast geographic region. These regions often
encompass enormous areas, are well-defined and mapped, and contain population statistics.
Area frames often include a whole country, with national census data as the initial reference
point.
2 - List frames
A list frame is a frame that includes a list of the intended audience. There may be more
elements found after looking through the census list. This data may serve as the foundation of
a list frame.
Sampling Error
A sampling error is a statistical error that occurs when an analyst does not select a sample that
represents the entire population of data. As a result, the results found in the sample do not
represent the results that would be obtained from the entire population.
• A sampling error occurs when the sample used in the study is not representative of the
whole population.
• Even randomized samples will have some degree of sampling error because a sample
is only an approximation of the population from which it is drawn.
• The prevalence of sampling errors can be reduced by increasing the sample size.
• In general, sampling errors can be placed into four categories: population-specific error,
selection error, sample frame error, or non-response error.
• Population-Specific Error
A population-specific error occurs when the researcher does not understand whom to survey.
• Selection Error
Selection error occurs when the survey is self-selected, or when only those participants who
are interested in the survey respond to the questions. Researchers can attempt to overcome
selection error by finding ways to encourage participation.
• Sample Frame Error
A sample frame error occurs when a sample is selected from the wrong population data.
• Non-response Error
A non-response error occurs when a useful response is not obtained from the surveys because
researchers were unable to contact potential respondents (or potential respondents refused to
respond).
Researchers might attempt to reduce sampling errors by replicating their study. This could be
accomplished by taking the same measurements repeatedly, using more than one subject or
multiple groups, or by undertaking multiple studies.
There are two methods by which this sampling error can be reduced:
1. Increasing the sample size
2. Stratification
Increasing the Sample Size
From a population, we can select a sample of any size. The size depends on the experiment
and the situation. If the size of the sample increases, the chance of occurrence of sampling
error will be lower. There will be no error if the sample size and the population size coincide.
Hence, sampling error is in inverse proportion to the sample size.
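This inverse relationship can be seen in a small simulation (a sketch only; the population values below are randomly generated, not real data): as the sample size grows, the sample means spread less around the population mean, i.e., the sampling error shrinks.

```python
import random
import statistics

random.seed(1)
population = [random.gauss(50, 10) for _ in range(10_000)]   # hypothetical population

for n in (10, 100, 1000):
    # Spread of the sample mean over repeated samples of size n approximates the sampling error
    sample_means = [statistics.mean(random.sample(population, n)) for _ in range(200)]
    print(n, round(statistics.stdev(sample_means), 2))
# Typical output: the spread falls roughly like 1/sqrt(n), e.g. about 3.2, 1.0, 0.3
```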
Stratification
If all the population units are homogeneous or the population has the same characteristic
feature, it’s very easy to get a sample. The sample can be taken as a representative of the entire
population. But if the population is not homogeneous (i.e., a population with different
characteristic features), it is impossible to get a perfect sample. In such conditions, to get a
better representation, the sample design is altered. The population is classified into different
groups, called strata, that contain similar units. From each of these strata, a sub-sample is
selected in a random manner. Thus, when all the groups are represented in the sample, the sampling
error is reduced. The sub-sample size from each stratum is kept in proportion to the stratum size.
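A short sketch of this proportional allocation (the strata and the total sample of 200 are assumed for illustration): each stratum's sub-sample size is its share of the population multiplied by the overall sample size.

```python
# Hypothetical strata sizes and a desired overall sample of 200 units
strata_sizes = {"urban": 6000, "semi_urban": 3000, "rural": 1000}
population_total = sum(strata_sizes.values())
sample_n = 200

# Sub-sample size from each stratum is proportional to the stratum size
allocation = {stratum: round(sample_n * size / population_total)
              for stratum, size in strata_sizes.items()}
print(allocation)   # {'urban': 120, 'semi_urban': 60, 'rural': 20}
```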
Sample Size
Sample size refers to the number of observations or data points collected in a study, and it plays
a critical role in the accuracy and reliability of research findings. An appropriately calculated
sample size ensures that the results are representative of the entire population, minimizing bias
while maximizing precision.
Choosing a sample size that is too small may lead to inconclusive results, whereas a too-large
sample size can waste resources and complicate data management.
“Sample size” is a market research term used for defining the number of individuals included in
conducting research. Researchers choose their sample based on demographics, such as
age, gender, or physical location. It can be vague or specific.
For example, you may want to know what people within the 18-25 age range think of your
product. Or, you may only require your sample to live in the United States, giving you a wide
population range. The total number of individuals in a particular sample is the sample size.
“Sample size” is a term used in market research to define the number of subjects included in a
sample for a study. This sample is selected from the general population and is considered
representative of the population for that specific study.
For example, if we want to predict how the population in a specific age group will react to a
new product, we can first test it on a sample size that is representative of the targeted
population.
In this case, the sample size refers to the number of individuals from that age group who will
be surveyed.
Determining the appropriate sample size involves using statistical formulas that begin with
choosing a significant benchmark based on the expected outcomes of the research.
Researchers typically have two main approaches to choose from:
Non-response
Non-response refers to the unavailability of sampled units. In a survey, it is likely that it will
not be possible to reach all members of the sample. For example, individuals may be
unavailable because they have moved with no forwarding address. In probability sampling,
non-response reduces sample size, affecting the calculation of sampling error and confidence
intervals.
Nonresponse bias occurs when survey participants are unwilling or unable to respond to a
survey question or an entire survey. Reasons for nonresponse vary from person to person.
In addition to requests for sensitive information and invitation issues, there are several other
causes of nonresponse bias, including poor survey design, wrong target audience, refusals,
failed delivery, and accidental omission.
Poor survey design: Make sure your survey is short and easy to understand to reduce the risk
of nonresponse bias.
Wrong target audience: Before you send out your survey, ensure you’re using the right target
audience. For example, a survey about working hours and wages sent to students and
unemployed individuals will have fewer responses than if it is sent to employed people.
Refusals: Some customers will just say “no” to completing a survey. It could be a bad day or
time for them, or they may just not want to do it. Remember, just because they said “no” today
doesn’t mean they won’t take one of your surveys another time.
Failed delivery: It’s unfortunate that some surveys end up going directly into a spam folder.
You might not even know that your survey wasn’t received, and it will just be recorded as a
nonresponse. Before you send your survey out, we suggest you track respondents to know if
your email was opened, how many clicked through to your survey, and who responded to your
survey.
Accidental omission: On occasion, someone will simply forget to complete your survey. It’s
challenging to prevent this from happening, and hopefully, this is only a small number of your
nonresponses.
Nonresponse bias is almost impossible to eliminate completely, but there are a few ways to
ensure that it is avoided as much as possible. Of course, having a professional, well-structured
and designed survey will help get higher completion rates, but here is a list of ways to tweak
your research process to ensure that your survey has a low nonresponse bias:
• Pretest survey mediums: it is very important to ensure that your survey and its invites
run smoothly through any medium or on any device your potential respondents might
use. People are much more likely to ignore survey requests if loading times are long,
questions do not fit properly on their screens, or they have to work to make the survey
compatible with their device. The best advice is to acknowledge your sample's different
forms of communication software and devices and pre-test your surveys and invites on
each, ensuring your survey runs smoothly for all your respondents.
• Set expectations: Use an email before the survey goes out or an introduction to the
survey when it’s sent to explain what your participant should expect from the survey.
Include the survey goal, the approximate time it will take to complete, and any
information about anonymity or confidentiality that you deem appropriate.
• Leverage customer data: This is the perfect time to review your buyer personas to
help you identify target audiences for your survey. Review customer accounts for those
who have interacted with your brand in the past and may want to participate in
providing feedback. Gain valuable insights and connect with customers that may be at
risk of churn.
• Avoid rushed or short data collection periods: One of the worst things a researcher
can do is limit their data collection time in order to comply with a strict deadline. Your
study’s level of nonresponse bias will climb dramatically if you are not flexible with
the time frames respondents have to answer your survey. Fortunately, flexibility is one
of the main advantages to online surveys since they do not require interviews (phone
or in person) that must be completed at certain times of the day. However, keeping your
survey live for only a few days can still severely limit a potential respondent’s ability
to answer. Instead, it is recommended to extend a survey collection period to at least
two weeks so that participants can choose any day of the week to respond according to
their own busy schedules.
• Provide options for omissions: It’s important to include an option for participants to
opt-out of answering certain questions. You can do this by not requiring answers to all
questions or providing a multiple-choice option that participants can use to omit the
question, such as “prefer not to answer.”
• Avoid double-barrelled questions: Double-barrelled questions are those that mention
more than one issue but only allow for one answer to cover everything. These questions
are confusing and tough to answer. For example, the question "Were the host and wait
staff polite and helpful?" asks the respondent to rate the host and the wait staff with one
answer, when it would be better addressed with two questions.
• Keep questions neutral: Put your ego aside and offer all options in your survey
questions. Rather than ask, “How was our service?” and only offer Good, Great, and
Excellent as choices, use a Likert scale to provide a full range of response options—
without any researcher bias.
• Consider closed-ended questions: Make your survey easy to answer with closed-
ended questions like Likert scales and multiple-choice questions. The survey is easier
and faster to complete with a fixed number of responses.
• Send reminders: Sending a few reminder emails throughout your data collection
period has been shown to effectively gather more completed responses. It is best to send
your first reminder email midway through the collection period and the second near the
end of the collection period. Make sure you do not harass the people on your email list
who have already completed your survey!
• Ensure confidentiality: Any survey that requires information that is personal in nature
should include reassurance to respondents that the data collected will be kept
completely confidential. This is especially the case in surveys that are focused on
sensitive issues. Make certain someone reading your invite understands that the
information they provide will be viewed as part of the whole sample and not individually
scrutinized.
• Use incentives: Many people refuse to respond to surveys because they feel they do
not have the time to spend answering questions. An incentive is usually necessary to
motivate people into taking part in your study. Depending on the length of the survey,
the difficulty in finding the correct respondents (e.g., one-legged, 15th-century spoon
collectors), and the information being asked, the incentive can range from minimal to
substantial in value. Remember, most respondents won't have a vested interest in
your study and must feel that the survey is worth their time!
• Examine timing and distribution methods: When should you send your survey for
the highest number of respondents? How should you distribute your survey most
effectively? Some of this timing and delivery is trial and error, but we can say that
response rates are generally higher on Monday and lowest on Friday. Whether you send
your survey by web link, email, website, social media, or through SurveyMonkey
Audience, it all depends on your target audience and what’s relevant to them.
• Close the feedback loop: Be sure to send a thank you or follow-up email to let
respondents know that their input is appreciated and their responses will be addressed
and applied as you seek to improve your products and services. Participants will be
happy to know that their feedback will have an impact.
• Probability Sampling: The probability sampling method utilizes some form of random
selection. In this method, all eligible individuals in the sample space have a chance of
being selected for the sample. This method is more time consuming and
expensive than the non-probability sampling method. The benefit of using probability
sampling is that it helps ensure that the sample is representative of the
population.
1. Simple Random Sampling
In this case each individual is chosen entirely by chance and each member of the population
has an equal chance, or probability, of being selected. One way of obtaining a random sample
is to give each individual in a population a number, and then use a table of random numbers to
decide which individuals to include. For example, if you have a sampling frame of 1000
individuals, labelled 0 to 999, use groups of three digits from the random number table to pick
your sample. So, if the first three numbers from the random number table were 094, select the
individual labelled “94”, and so on.
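In practice a pseudo-random number generator usually stands in for the printed random number table; here is a hedged sketch under that assumption, reusing the frame of 1,000 labelled individuals from the example above.

```python
import random

sampling_frame = list(range(1000))    # individuals labelled 0 to 999

random.seed(42)                       # fixed seed only so the draw is reproducible
sample = random.sample(sampling_frame, 50)   # simple random sample of 50, drawn without replacement
print(sample[:5])                     # labels of the first five selected individuals
```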
As with all probability sampling methods, simple random sampling allows the sampling error
to be calculated and reduces selection bias. A specific advantage is that it is the most
straightforward method of probability sampling. A disadvantage of simple random sampling is
that you may not select enough individuals with your characteristic of interest, especially if
that characteristic is uncommon. It may also be difficult to define a complete sampling frame
and inconvenient to contact the selected individuals, especially if different forms of contact are required (email,
phone, post) and your sample units are scattered over a wide geographical area.
Pros: Simple random sampling is easy to do and cheap. Designed to ensure that every member
of the population has an equal chance of being selected, it reduces the risk of bias compared to
non-random sampling.
Cons: It offers no control for the researcher and may lead to unrepresentative groupings being
picked by chance.
2. Systematic Sampling
In the systematic sampling method, the items are selected from the target population by
choosing a random starting point and then selecting further items after a fixed sampling
interval. The interval is calculated by dividing the total population size by the desired sample size.
Systematic sampling is similar to simple random sampling, but it is usually slightly easier to
conduct. Every member of the population is listed with a number, but instead of randomly
generating numbers, individuals are chosen at regular intervals.
Pros: Systematic sampling is efficient and straightforward, especially when dealing with
populations that have a clear order. It ensures a uniform selection across the population.
Cons: There’s a potential risk of introducing bias if there’s an unrecognized pattern in the
population that aligns with the sampling interval.
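A minimal sketch of the procedure (the frame size and sample size are assumptions for illustration): the sampling interval is the population size divided by the desired sample size, and selection starts from a random point within the first interval.

```python
import random

population = list(range(1, 1001))       # hypothetical numbered frame of 1,000 members
desired_sample_size = 100
interval = len(population) // desired_sample_size    # k = N / n = 10

random.seed(5)
start = random.randint(0, interval - 1)               # random starting point within the first interval
systematic_sample = population[start::interval]       # then every k-th member
print(len(systematic_sample))                         # 100
```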
3. Stratified Sampling
In the stratified sampling method, the total population is divided into smaller groups to complete
the sampling process. The small groups are formed based on a few characteristics in the
population. After separating the population into smaller groups, the statisticians randomly
select the sample from each group. Stratified sampling involves dividing the population into
subpopulations that may differ in important ways. It allows you to draw more precise conclusions
by ensuring that every subgroup is properly represented in the sample.
To use this sampling method, you divide the population into subgroups (called strata) based on
the relevant characteristic (e.g., gender identity, age range, income bracket, job role).
Pros: Stratified sampling enhances the representation of all identified subgroups within a
population, leading to more accurate results in heterogeneous populations.
Cons: This method requires accurate knowledge about the population’s stratification, and its
design and execution can be more intricate than other methods.
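As a hedged sketch with pandas (the DataFrame, column names, and sampling fraction are assumptions), each stratum is sampled separately so that every subgroup appears in the sample in proportion to its size.

```python
import pandas as pd

# Hypothetical population with a stratification variable
population = pd.DataFrame({
    "person_id": range(1, 10001),
    "income_bracket": ["low"] * 5000 + ["middle"] * 3500 + ["high"] * 1500,
})

# Draw 2% from every stratum, so each subgroup is represented proportionally
stratified_sample = (
    population.groupby("income_bracket", group_keys=False)
              .sample(frac=0.02, random_state=1)
)
print(stratified_sample["income_bracket"].value_counts())   # low ~100, middle ~70, high ~30
```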
4. Cluster Sampling
In the cluster sampling method, clusters or groups of people are formed from the population
set. Each group has similar, significant characteristics, and all clusters have an equal chance of
being part of the sample. This method applies simple random sampling to the clusters of the
population. In a clustered sample, subgroups of the population are used as the sampling unit,
rather than individuals. The population is divided into subgroups, known as clusters, which are
randomly selected to be included in the study.
Cluster sampling also involves dividing the population into subgroups, but each subgroup
should have similar characteristics to the whole sample. Instead of sampling individuals from
each subgroup, you randomly select entire subgroups. This method is good for dealing with
large and dispersed populations, but there is more risk of error in the sample, as there could be
substantial differences between clusters. It’s difficult to guarantee that the sampled clusters are
really representative of the whole population.
Pros: Cluster sampling is economically beneficial and logistically easier when dealing with
vast and geographically dispersed populations.
Cons: Due to potential similarities within clusters, this method can introduce a greater
sampling error compared to other methods.
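A brief sketch of one-stage cluster sampling (the clusters and unit labels are invented): instead of sampling individuals, whole clusters are selected at random and every unit inside the chosen clusters is studied.

```python
import random

# Hypothetical population grouped into clusters (e.g., schools)
clusters = {
    "school_A": ["a1", "a2", "a3"],
    "school_B": ["b1", "b2"],
    "school_C": ["c1", "c2", "c3", "c4"],
    "school_D": ["d1", "d2", "d3"],
}

random.seed(7)
chosen_clusters = random.sample(list(clusters), 2)     # randomly select whole clusters
cluster_sample = [unit for c in chosen_clusters for unit in clusters[c]]   # study every unit in them
print(chosen_clusters, cluster_sample)
```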
• Non-Probability Sampling: In the non-probability sampling method, samples are selected
on the basis of the researcher's judgement or convenience rather than by random selection,
so not every member of the population has a known chance of being included. The main
non-probability sampling methods are as follows:
1. Convenience Sampling
In the convenience sampling method, the samples are selected from the population directly
because they are conveniently available to the researcher. The samples are easy to select, and
the researcher does not choose a sample that represents the entire population. A convenience
sample simply includes the individuals who happen to be most accessible to the researcher.
This is an easy and inexpensive way to gather initial data, but there is no way to tell if the
sample is representative of the population, so it can’t produce generalizable results.
Convenience samples are at risk for both sampling bias and selection bias.
Convenience sampling is perhaps the easiest method of sampling, because participants are
selected based on availability and willingness to take part. Useful results can be obtained, but
the results are prone to significant bias, because those who volunteer to take part may be
different from those who choose not to (volunteer bias), and the sample may not be
representative of other characteristics, such as age or sex. Note: volunteer bias is a risk of all
non-probability sampling methods.
Pros: Convenience sampling is the most straightforward method, requiring minimal planning,
making it quick to implement.
Cons: Due to its non-random nature, the method is highly susceptible to biases, and the results
are often lacking in their application to the real world.
2. Purposive/Judgemental sampling
This type of sampling, also known as judgement sampling, involves the researcher using their
expertise to select a sample that is most useful to the purposes of the research.
It is often used in qualitative research, where the researcher wants to gain detailed knowledge
about a specific phenomenon rather than make statistical inferences, or where the population
is very small and specific. An effective purposive sample must have clear criteria and rationale
for inclusion. Always make sure to describe your inclusion and exclusion criteria and beware
of observer bias affecting your arguments.
In purposive sampling, the samples are selected based only on the researcher's knowledge. As
their knowledge is instrumental in creating the samples, there is a chance of obtaining
highly accurate answers with a minimal margin of error. It is also known as judgmental
sampling or authoritative sampling.
Judgement sampling has the advantage of being time- and cost-effective to perform, whilst
resulting in a range of responses (particularly useful in qualitative research). However, in
addition to volunteer bias, it is also prone to errors of judgement by the researcher, and the
findings, whilst potentially broad, will not necessarily be representative.
Pros: Purposive sampling targets specific criteria or characteristics, making it ideal for studies
that require specialised participants or specific conditions.
Cons: It’s highly subjective and based on researchers’ judgment, which can introduce biases
and limit the study’s real-world application.
3. Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit participants via
other participants. The number of people you have access to “snowballs” as you get in contact
with more people. The downside here is also representativeness, as you have no way of
knowing how representative your sample is due to the reliance on participants recruiting others.
This can lead to sampling bias.
Snowball sampling is also known as a chain-referral sampling technique. In this method, the
samples have traits that are difficult to find. So, each identified member of a population is asked
to find the other sampling units. Those sampling units also belong to the same targeted
population.
This method is commonly used in social sciences when investigating hard-to-reach groups.
Existing subjects are asked to nominate further subjects known to them, so the sample increases
in size like a rolling snowball. For example, when carrying out a survey of risk behaviours
amongst intravenous drug users, participants may be asked to nominate other users to be
interviewed.
Snowball sampling can be effective when a sampling frame is difficult to identify. However,
by selecting friends and acquaintances of subjects already investigated, there is a significant
risk of selection bias (choosing a large number of people with similar characteristics or views
to the initial individual identified).
Pros: Snowball sampling makes it possible to reach hidden or hard-to-access populations for
which no sampling frame exists.
Cons: The method can introduce bias due to the reliance on participant referrals, and the choice
of initial seeds can significantly influence the final sample.
4. Quota Sampling
In the quota sampling method, the researcher forms a sample of individuals chosen to
represent the population on the basis of specific traits or qualities. The researcher chooses
sample subsets that yield a useful collection of data which generalizes to the entire population.
This method of sampling is often used by market researchers. Interviewers are given a quota
of subjects of a specified type to attempt to recruit. For example, an interviewer might be told
to go out and select 20 adult men, 20 adult women, 10 teenage girls and 10 teenage boys so
that they could interview them about their television viewing. Ideally the quotas chosen would
proportionally represent the characteristics of the underlying population.
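A minimal sketch of filling the quotas from that example (the stream of intercepted people is simulated, and the category labels are assumptions): recruitment into each category stops once its quota is met, and selection within each quota remains non-random.

```python
import random

quotas = {"adult_man": 20, "adult_woman": 20, "teen_girl": 10, "teen_boy": 10}
recruited = {group: 0 for group in quotas}

random.seed(3)
categories = list(quotas)
# Simulated stream of people the interviewer happens to intercept
while any(recruited[g] < quotas[g] for g in quotas):
    person = random.choice(categories)            # category of a hypothetical passer-by
    if recruited[person] < quotas[person]:
        recruited[person] += 1                    # recruit only while that quota is still open

print(recruited)   # {'adult_man': 20, 'adult_woman': 20, 'teen_girl': 10, 'teen_boy': 10}
```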
Whilst this has the advantage of being relatively straightforward and potentially representative,
the chosen sample may not be representative of other characteristics that weren’t considered (a
consequence of the non-random nature of sampling).
Pros: Quota sampling ensures certain subgroups are adequately represented, making it great
for when random sampling isn’t feasible but representation is necessary.
Cons: The selection within each quota is non-random and researchers’ discretion can influence
the representation, which both strongly increase the risk of bias.
The following points summarize a few differences between probability sampling methods and
non-probability sampling methods:
• Probability sampling methods are also known as random sampling methods; non-probability
sampling methods are also called non-random sampling methods.
• Probability sampling methods are used for research which is conclusive; non-probability
sampling methods are used for research which is exploratory.
• Probability sampling methods involve a long time to get the data; non-probability sampling
methods are an easy way to collect data quickly.
Sample size determination is the process of choosing the right number of observations or
people from a larger group to use in a sample. The goal of figuring out the sample size is to
ensure that the sample is big enough to give statistically valid results and accurate estimates of
population parameters but small enough to be manageable and cost-effective.
In many research studies, getting information from every member of the population of interest
is not possible or useful. Instead, researchers choose a sample of people or events that is
representative of the whole to study. How accurate and precise the results are can depend a lot
on the size of the sample.
Choosing a statistically sound sample size depends on a number of things, such as the
size of the population, how precise you want your estimates to be, how confident you want to
be in the results, how different the population is likely to be, and how much money and time
you have for the study. Statistics are often used to figure out how big a sample should be for a
certain type of study and research question.
Figuring out the sample size is important in ensuring that research findings and conclusions are
valid and reliable.
Let’s say you are a market researcher in the US and want to send out a survey or questionnaire.
The survey aims to understand your audience’s feelings toward a new cell phone you are about
to launch. You want to know what people in the US think about the new product to predict the
phone’s success or failure before launch.
Hypothetically, you choose the population of New York, which is 8.49 million. You use a
sample size determination formula to select a sample of 500 individuals that fit into
the consumer panel requirement. You can use the responses to help you determine how your
audience will react to the new product.
However, determining a sample size requires more than just throwing your survey at as many
people as possible. If your estimated sample sizes are too big, it could waste resources, time,
and money. A sample size that’s too small doesn’t allow you to gain maximum insights, leading
to inconclusive results.
Before we jump into sample size determination, let’s take a look at the terms you should know:
1. Population size:
Population size is how many people fit your demographic. For example, you want to get
information on doctors residing in North America. Your population size is the total number of
doctors in North America.
Don’t worry! Your population size doesn’t always have to be that big. Smaller population sizes
can still give you accurate results as long as you know who you’re trying to represent.
2. Confidence level:
The confidence level tells you how sure you can be that your data is accurate. It is expressed
as a percentage and aligned to the confidence interval. For example, if your confidence level is
90%, you can be 90% confident that the true value lies within your margin of error.
There’s no way to be 100% accurate when it comes to surveys. Confidence intervals tell you
how far off from the population mean you’re willing to allow your data to fall.
3. Margin of error (confidence interval):
A margin of error describes how close you can reasonably expect a survey result to fall relative
to the real population value.
4. Standard deviation:
Standard deviation is the measure of the dispersion of a data set from its mean. It measures the
absolute variability of a distribution. The higher the dispersion or variability, the greater the
standard deviation and the greater the magnitude of the deviation.
For example, you have already sent out your survey. How much variance do you expect in your
responses? That variation in response is the standard deviation.
With all the necessary terms defined, it’s time to learn how to determine sample size using a
sample calculation formula.
Your confidence level corresponds to a Z-score. This is a constant value needed for this
equation. Here are the z-scores for the most common confidence levels:
• 90% – Z-score = 1.645
• 95% – Z-score = 1.96
• 99% – Z-score = 2.576
If you choose a different confidence level, various online tools can help you find your score.
Here is an example of how the math works, assuming you chose a 90% confidence level, .6
standard deviation, and a margin of error (confidence interval) of +/- 4%.
Sample size = (Z-score)² × (standard deviation)² / (margin of error)²
= ((1.64)² × (0.6)²) / (0.04)²
= (2.68 × 0.36) / 0.0016
= 0.9648 / 0.0016
≈ 603
603 respondents are needed, and that becomes your sample size.
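A sketch of the same calculation in code (the formula form n = (Z × σ / E)², implied by the worked numbers above, is an assumption about how the notes intend it):

```python
import math

def sample_size(z_score: float, std_dev: float, margin_of_error: float) -> int:
    """Required sample size n = (Z * sigma / E) ** 2, rounded up to a whole respondent."""
    return math.ceil((z_score * std_dev / margin_of_error) ** 2)

# 90% confidence (Z ~= 1.64), standard deviation 0.6, margin of error +/- 4%
print(sample_size(1.64, 0.6, 0.04))   # 606 - close to the ~603 above, which rounds Z² to 2.68
```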
Determining the right sample size for your survey is one of the most common questions
researchers ask when they begin a market research study. Luckily, sample size determination
isn’t as hard to calculate as you might remember from an old high school statistics class.
Before calculating your sample size, ensure you have these things in place:
Goals and objectives:
What do you hope to do with the survey? Are you planning on projecting the results onto
a whole demographic or population? Do you want to see what a specific group thinks? Are you
trying to make a big decision or just setting a direction?
Calculating sample size is critical if you’re projecting your survey results on a larger
population. You’ll want to make sure that it’s balanced and reflects the community as a whole.
The sample size isn’t as critical if you’re trying to get a feel for preferences.
For example, you’re surveying homeowners across the US on the cost of cooling their homes
in the summer. A homeowner in the South probably spends much more money cooling their
home in the humid heat than someone in Denver, where the climate is dry and cool.
For the most accurate results, you’ll need to get responses from people in all US areas and
environments. If you only collect responses from one extreme, such as the warm South, your
results will be skewed.
Precision level:
How close do you want the survey results to mimic the true value if everyone responded?
Again, if this survey determines how you’re going to spend millions of dollars, then your
sample size determination should be exact.
The more accurate you need to be, the larger the sample you want to have, and the more your
sample will have to represent the overall population. If your population is small, say, 200
people, you may want to survey the entire population rather than cut it down with a sample.
Confidence level:
Think of confidence from the perspective of risk. How much risk are you willing to take on?
This is where your Confidence Interval numbers become important. How confident do you
want to be — 98% confident, 95% confident?
Understand that the confidence percentage you choose greatly impacts the number of
completions you’ll need for accuracy. This can increase the survey’s length and how many
responses you need, which means increased costs for your survey.
Knowing the actual numbers and amounts behind percentages can help make more sense of
your correct sample size needs vs. survey costs.
For example, you want to be 99% confident. After using the sample size determination formula,
you find you need to collect an additional 1000 respondents.
This, in turn, means you’ll be paying for samples or keeping your survey running for an extra
week or two. You have to determine if the increased accuracy is more important than the cost.
Population variability:
What variability exists in your population? In other words, how similar or different is the
population?
If you are surveying consumers on a broad topic, you may have lots of variations. You’ll need
a larger sample size to get the most accurate picture of the population.
However, if you’re surveying a population with similar characteristics, your variability will be
less, and you can sample fewer people. More variability equals more samples, and less
variability equals fewer samples. If you’re not sure, you can start with 50% variability.
Response rate:
You want everyone to respond to your survey. Unfortunately, every survey comes with targeted
respondents who either never open the study or drop out halfway. Your response rate will
depend on your population’s engagement with your product, service organization, or brand.
The higher the response rate, the higher your population’s engagement level. Your base sample
size is the number of responses you must get for a successful survey.
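As a hedged rule-of-thumb sketch (the figures are assumptions, and the adjustment itself is common practice rather than something these notes spell out), the number of invitations can be scaled up by the expected response rate so that the base sample size is still reached:

```python
import math

base_sample_size = 603          # completed responses needed (from the earlier calculation)
expected_response_rate = 0.20   # assume roughly 20% of invitees complete the survey

invitations_to_send = math.ceil(base_sample_size / expected_response_rate)
print(invitations_to_send)      # 3015 invitations expected to yield ~603 completions
```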
Besides the variability within your population, you need to ensure your sample doesn’t include
people who won’t benefit from the results. One of the biggest mistakes you can make in sample
size determination is forgetting to consider your actual audience.
For example, you don’t want to send a survey asking about the quality of local apartment
amenities to a group of homeowners.
You may start with general demographics and characteristics, but can you narrow those
characteristics down even more? Narrowing down your audience makes getting a more
accurate result from a small sample size easier.
For example, you want to know how people will react to new automobile technology. Your
current population includes anyone who owns a car in a particular market.
However, you know your target audience is people who drive cars that are less than five years
old. You can remove anyone with an older vehicle from your sample because they’re unlikely
to purchase your product.
Once you know what you hope to gain from your survey and what variables exist within your
population, you can decide how to calculate sample size. Using the formula for determining
sample size is a great starting point to get accurate results.
In sample size determination, the statistical analysis plan needs careful consideration of the level
of significance, the effect size, and the sample size.
Researchers must reconcile statistical significance with practical and ethical factors like
practicality and cost. A well-designed study with a sufficient sample size can improve the odds
of obtaining statistically significant results.
To meet the goal of your survey, you may have to try a few methods to increase the response
rate, such as:
• To reach a wider audience, use multiple distribution channels, such as SMS, website,
and email surveys.
• Offer incentives for completing the survey, such as an entry into a prize drawing or a
discount on the respondent’s next order.
• Consider your survey structure and find ways to simplify your questions. The less work
someone has to do to complete the survey, the more likely they will finish it.
• Longer surveys tend to have lower response rates due to the length of time it takes to
complete the survey. In this case, you can reduce the number of questions in your survey
to increase responses.
Practical Consideration