DATA & SAMPLING
IM3105 Research Methods for Business
Sem 1, 2024-2025
Adapted from the Business Research Methods Lecture Notes of Assoc. Prof. Dr. Lê Nguyễn Hậu
Content
1. Sources of data
2. Secondary data
3. Primary & Experimental data
4. Data collection methods
5. Sampling
6. Sample size
1. Sources of data
In search of data sources
RESEARCH OBJECTIVES -> MODEL / HYPOTHESES -> DATA NEEDS -> DATA SOURCES
DATA SOURCES: SECONDARY DATA, PRIMARY DATA, EXPERIMENTAL DATA
2. Secondary data
Overview
Data used in a research project that are not collected directly and
purposefully for it.
Should be considered before collecting primary data.
Secondary data can be:
1. in raw or summarized form;
2. free or fee-based access
2. Secondary data
Characteristics
Advantages: Cost – Time
Disadvantages:
Availability
Appropriateness (detail, measure, timing).
Reliability and accuracy
Usage:
Supporting evidence when formulating a research problem.
Base for sampling design and data collection method
Base for comparing and interpreting research results
2. Secondary data
Where to collect
Internal: documents held by the various functional units of the organization.
External: databases, publications, internet, social media platforms, statistics
books, etc.
Example sources: Q&Me (https://qandme.net/en/report/); Cimigo (www.cimigo.com)
2. Secondary data
Government statistics: General Statistics Office of Vietnam
  Demographic, economic, and social statistics
  https://gso.gov.vn/
Financial institutions: Asian Development Bank / World Bank
  Economic analysis, development projects
  www.adb.org; www.worldbank.org
2. Secondary data
Industry report: Vietnam National Administration of Tourism
  Data and reports related to tourism
  https://vietnamtourism.gov.vn/
Industry report: Vietnam Industry Research and Consultancy
  Industries, market trends, and forecasts
  https://viracresearch.com/home/
3. Primary data
Newly collected for the specific purpose of the research at hand
Used when secondary data are unavailable or inappropriate
Higher value, because the data fit the research purpose
Costs more money and time to collect
Differences between primary & secondary data are not always clear
3. Experimental data
Data collected from experiments.
An experiment is a process in which the researcher manipulates the state or
value of one or more variables and measures their effects on other variables
while strictly controlling extraneous variables.
3. Experimental data
Independent variable (IV): The “cause” in a relationship.
Dependent variable (DV): The “effect” in a relationship.
Test units: the subjects or objects exposed to the IV, whose responses are measured.
Treatments: different states of IV that are manipulated to create an impact on
a group of test units.
Treatment or experimental group: Group of test units under the same
treatment.
Control group: the group of test units not exposed to any manipulation;
normally used as a baseline for comparison with the experimental groups.
Extraneous variables: variables other than the IVs that also influence the test units.
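The terms above can be illustrated with a minimal Python sketch. It assumes random assignment of test units to groups (a common way to limit the influence of extraneous variables, not stated on the slide) and measures the effect as a simple difference in mean DV. The store names and sales figures are hypothetical.

```python
import random
from statistics import mean

def assign_groups(test_units, seed=None):
    """Randomly split test units into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(test_units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def treatment_effect(dv_treatment, dv_control):
    """Difference in mean DV between the treatment and control groups."""
    return mean(dv_treatment) - mean(dv_control)

# Hypothetical example: IV = price discount (yes/no), DV = weekly sales.
stores = ["S1", "S2", "S3", "S4", "S5", "S6"]
treatment_group, control_group = assign_groups(stores, seed=1)

# Hypothetical DV measurements after the experiment.
sales_treatment = [120, 130, 125]
sales_control = [100, 110, 105]
effect = treatment_effect(sales_treatment, sales_control)
print(effect)  # 20
```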
4. Data collection methods
Primary data can be collected via two approaches:
Communication: respondents actively provide the data via direct or indirect
communication.
Observation: gathering data without asking respondents questions. Informants
passively express the data.
4. Data collection methods
Communication approach
Methods:
  Qualitative data: individual/group interview, focus group, case studies
  Quantitative data: survey, experiment
Versatility: Highly versatile. Can ask about feelings, beliefs, attitudes, intentions, etc.
Time and cost: Fast and cost-saving
Accuracy/Reliability: Depends on the problem, the nature of the data, the collection
method, and the honesty of informants.
Convenience for informants: Less convenient
4. Data collection methods
Communication channels
Ranking of channels by criterion:
Criteria                          1st              2nd              3rd
Flexibility in no. of questions   Personal / Chat  Mail / Email     Telephone
Data versatility                  Personal / Chat  Telephone        Mail / Email
Time                              Telephone        Personal / Chat  Mail / Email
Cost                              Mail / Email     Telephone        Personal / Chat
Sample control                    Personal / Chat  Telephone        Mail / Email
Opportunity to explain            Personal / Chat  Telephone        Mail / Email
Convenience for informants        Mail / Email     Telephone        Personal / Chat
4. Data collection methods
Survey with questionnaires
4. Data collection methods
Observation approach
Covers the full range of monitoring of behavioral and non-behavioral activities and
conditions.
Behavioral observation: what they do; what they say; how they say it
Non-behavioral observation: historical records; words; sound records;
photographs; videotape.
Store/plant audits; inventory conditions; financial statements
Analysis of traffic flow
4. Data collection methods
Observation approach
Methods: four combinations of setting and tool use
  Natural setting + No tool
  Natural setting + Tool
  Artificial setting + No tool
  Artificial setting + Tool
Versatility: Limited. Only for observable properties.
Time and cost: Slow and costly
Accuracy/Reliability: Depends on the observation method and tools
Convenience for informants: More convenient
5. Sampling
Overview
What:
The process of selecting a small group from a larger population in order to observe it
or collect data from it.
Why:
Cost saving
Time saving
In many cases, working with a sample is more accurate than studying the whole population
Necessary when the study destroys or changes the attributes of the objects under
investigation.
5. Sampling
Sampling procedure
Define population and elements
Define sampling frame
Determine sample size
Design sampling method
Sample selection
5. Sampling
Two sampling approaches
Probability sampling:
Follows mathematical rules; the researcher cannot influence which elements are selected.
Used when the representativeness of the sample or generalization of findings is
of critical importance
Nonprobability sampling:
Researcher selects elements into the sample based on subjective judgment or
convenience.
Used when time, cost or other factors are more important.
5. Sampling
Two sampling approaches
Probability sampling methods: simple random, systematic random, stratified random, cluster
Nonprobability sampling methods: convenience, judgment, quota, snowball
5. Sampling
Simple random sampling
The purest form of probability sampling.
Each element has a known and equal chance of selection.
Easy to implement with automatic random number generation tools.
However,
requires a list of population elements,
can be time-consuming and expensive,
can require larger sample sizes than other probability methods.
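A minimal Python sketch of simple random sampling, assuming the sampling frame is available as a list. The frame of customer IDs is hypothetical, and `random.Random.sample` from the standard library stands in for the "automatic random number generation tools" mentioned above.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; each element has the same chance n/N of selection."""
    rng = random.Random(seed)   # seed only makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: customer IDs 1..1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, n=50, seed=42)
print(len(sample), len(set(sample)))  # 50 50 (no element is drawn twice)
```

The sketch also shows the main requirement of the method: the complete list of population elements (`frame`) must exist before any element can be drawn.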
5. Sampling
Systematic random sampling
A versatile form of probability sampling.
Every kth element is selected, starting from a randomly chosen element among the first k.
k (skip interval) = population size / sample size
Major advantages:
+ Simplicity and flexibility.
+ Easier to instruct field workers.
5. Sampling
Systematic random sampling
Procedure:
1. Identify, list, and number the elements in the population.
2. Identify the skip interval (k).
3. Identify the random start.
4. Draw a sample by choosing every kth entry.
May be biased when the population has a periodic pattern that coincides with the
skip interval.
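The procedure above can be sketched in Python as follows. This is a minimal illustration, assuming the numbered list of elements is held in a Python list and that the skip interval is rounded down when the population size is not an exact multiple of the sample size (the slide does not say how to handle that case).

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth element after a random start among the first k."""
    k = len(frame) // n              # skip interval, rounded down (assumption)
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)         # 0-based random start within the first k elements
    return [frame[start + i * k] for i in range(n)]

# Hypothetical population of 100 numbered elements, sample of 10, so k = 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=1)
print(sample)
```

If the list were ordered so that, for example, every 10th element shared a special property, this sample would contain either only such elements or none of them, which is the periodicity bias noted above.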
5. Sampling
Stratified random sampling
The sample includes randomly selected elements from each segment (stratum).
Elements are homogeneous within a stratum and heterogeneous across strata.
The number of elements can be allocated among strata proportionately or
disproportionately.
5. Sampling
Stratified random sampling
Procedure:
1. Determine the variables to use for stratification.
2. Determine the proportions of the stratification variables in the population.
3. Select proportionate or disproportionate stratification
4. Divide the sampling frame into separate frames for each stratum.
5. Randomize the elements within each stratum’s sampling frame.
6. Use a random or systematic procedure to draw elements from each stratum.
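A minimal Python sketch of the proportionate option, assuming a single stratification variable and simple random selection within each stratum. The student frame and the `stratum_of` helper are hypothetical. Rounding each stratum's share can make the total differ slightly from the requested sample size, which a real design would need to handle.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, n, seed=None):
    """Split the frame by stratum, then draw randomly in proportion to stratum size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:                        # one separate frame per stratum
        strata[stratum_of(element)].append(element)
    total = len(frame)
    sample = {}
    for name, members in strata.items():
        n_h = round(n * len(members) / total)    # proportionate allocation
        sample[name] = rng.sample(members, n_h)  # random draw within the stratum
    return sample

# Hypothetical frame: 600 juniors and 400 seniors, overall sample of 50.
frame = [("junior", i) for i in range(600)] + [("senior", i) for i in range(400)]
sample = proportionate_stratified_sample(frame, lambda e: e[0], n=50, seed=7)
sizes = {name: len(members) for name, members in sample.items()}
print(sizes)  # {'junior': 30, 'senior': 20}
```

A disproportionate design would replace the `n_h` line with stratum sizes chosen by the researcher, for example to oversample a small but important stratum.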
5. Sampling
Cluster random sampling
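The population is divided into groups of elements (clusters), then some groups are
randomly selected.
Elements are heterogeneous within a cluster, while clusters are homogeneous with one another.
Different from stratified random sampling, which draws elements from every stratum.
Different from simple random sampling, which selects individual elements rather than groups.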
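A minimal Python sketch of cluster sampling, assuming a one-stage design in which every element of a selected cluster enters the sample (the slide does not say whether elements are subsampled within clusters). The district and household names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters; keep all elements of the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # draw cluster names, not elements
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 10 districts (clusters) of 20 households each.
clusters = {
    f"district_{d}": [f"household_{d}_{h}" for h in range(20)]
    for d in range(10)
}
selected = cluster_sample(clusters, n_clusters=3, seed=5)
print(sorted(selected), sum(len(v) for v in selected.values()))
```

Only a list of clusters is needed to make the draw, not a list of every household, which is the practical contrast with simple random sampling.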
5. Sampling
Convenience sampling
A nonprobability sampling technique
Least reliable design, but the cheapest and easiest to conduct.
Researchers choose whoever is most convenient to reach.
May still be a useful procedure.
5. Sampling
Judgment sampling
Researchers select sample members to conform to some criterion.
Appropriate in the early stages of an exploratory study.
Or when researchers wish to select a biased group for screening purposes.
Ex: Companies often try out new product ideas on their employees.
The reason is that the firm’s employees will be more favorably disposed toward a
new product idea than the public. If the product does not pass this group, it does
not have prospects for success in the general market.
5. Sampling
Quota sampling
Aims to improve sample representativeness (within nonprobability sampling).
Controls certain characteristics that describe the population.
In most cases, researchers use more than one control variable.
Each control variable should:
• Have a distribution in the population that we can estimate
• Be pertinent to the topic studied.
Ex: If we believe that responses to a question vary depending on the gender of
respondents, we should seek proportional responses from both men and women,
using gender as a control variable for quota sampling.
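A minimal Python sketch of filling quotas with one control variable. It assumes respondents arrive in whatever order is convenient (no random selection) and are accepted until the quota for their group is full. The respondent stream and quota sizes are hypothetical.

```python
def fill_quotas(respondents, quotas, group_of):
    """Accept respondents in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    accepted = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:      # quota for this group still open
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):      # all quotas filled
            break
    return accepted

# Hypothetical quotas mirroring an estimated 50/50 gender split in the population.
quotas = {"male": 2, "female": 2}
stream = [("A", "male"), ("B", "male"), ("C", "male"),
          ("D", "female"), ("E", "female"), ("F", "female")]
accepted = fill_quotas(stream, quotas, group_of=lambda p: p[1])
print([name for name, _ in accepted])  # ['A', 'B', 'D', 'E']
```

Respondent C is turned away because the male quota is already full, which shows how the quota controls the sample's composition even though selection within each group is not random.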
5. Sampling
Snowball sampling
Used when respondents are difficult to identify and are best located through referral
networks.
Appropriate for some qualitative studies.
Initially, individual(s) are discovered and selected.
Then they refer the researcher to others who possess similar characteristics.
Ex: used to study drug cultures, teenage gang activities, …
5. Sampling
On the accuracy of two sampling approaches
“There is no guarantee that the results obtained with a probability sample
will be more accurate than those obtained with a nonprobability
sample; what the former allows the researcher to do is to measure the
amount of sampling error likely to occur in his/her sample. This provides
a measure of the accuracy of the sample result. With nonprobability
sampling no such error measures exists.”
(Kinnear & Taylor, 1987, p.207).
6. Sample size
Steps to determine the sample size:
1. Precision level: set the allowable error e
(the tolerated difference between sample and population estimates).
2. Confidence level: 95% is commonly accepted (significance level p ≤ 0.05).
3. Determine the Z score corresponding to the confidence level.
4. Estimate the population variance.
5. Select the appropriate formula.
6. Calculate the sample size.
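The steps above can be sketched in Python. The slide does not name a formula, so this sketch assumes two common textbook formulas for large populations: n = (Z·σ / e)² for estimating a mean and n = Z²·p(1 − p) / e² for estimating a proportion. The input values are hypothetical.

```python
import math
from statistics import NormalDist

def z_score(confidence):
    """Two-sided Z score for a confidence level, e.g. 0.95 gives about 1.96."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_mean(sigma, e, confidence=0.95):
    """n = (Z * sigma / e)^2, rounded up (assumed formula; sigma is an estimate)."""
    z = z_score(confidence)
    return math.ceil((z * sigma / e) ** 2)

def sample_size_for_proportion(p, e, confidence=0.95):
    """n = Z^2 * p * (1 - p) / e^2, rounded up (assumed formula)."""
    z = z_score(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

# Hypothetical inputs: estimated standard deviation 10, allowable error 2.
print(sample_size_for_mean(sigma=10, e=2))          # 97
# Most conservative proportion (p = 0.5), allowable error of 5 percentage points.
print(sample_size_for_proportion(p=0.5, e=0.05))    # 385
```

Halving the allowable error roughly quadruples the required sample size, because e enters both formulas squared.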
6. Sample size
Practical issues in determining the sample size
Investigation of one or many variables.
Response rate.
Data analysis method to be used.
Total error includes sampling and non-sampling error.
Time and resource constraints.
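As one illustration of the response-rate issue, a minimal Python sketch that inflates the number of people to contact so that the expected number of completed responses still reaches the required sample size. Dividing by the expected response rate is a common adjustment, not a method stated on the slide, and the 40% rate is hypothetical.

```python
import math

def contacts_needed(required_n, response_rate):
    """Number of people to contact so that expected responses reach required_n."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(required_n / response_rate)

# Hypothetical: 385 completed responses needed, 40% expected response rate.
print(contacts_needed(385, 0.40))  # 963
```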
6. Sample size
Rules of thumb
Sample sizes larger than 30 and smaller than 500 are appropriate for most research.
Where samples are to be broken into subsamples (males/females,
juniors/seniors, …), a minimum size of 30 for each subgroup is necessary.
In multivariate analysis, the sample size should be several times (preferably 10
times or more) as large as the number of variables in the analysis.
(Roscoe, 1975)
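The rules of thumb can be combined into a quick lower-bound check, sketched below in Python. Taking the largest of the three requirements is one interpretation of how the rules interact, and it assumes subgroups of roughly equal size. The function name and example values are hypothetical.

```python
def rule_of_thumb_minimum(n_subgroups=1, n_variables=0):
    """Smallest sample size that satisfies all three rules of thumb at once."""
    overall_minimum = 30                # overall sample size of at least 30
    by_subgroups = 30 * n_subgroups     # at least 30 per subgroup
    by_variables = 10 * n_variables     # preferably 10 times the number of variables
    return max(overall_minimum, by_subgroups, by_variables)

# Hypothetical study: 2 subgroups (males/females), 8 variables in the analysis.
print(rule_of_thumb_minimum(n_subgroups=2, n_variables=8))  # 80
```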