Unit-III Data Collection
Types of data – Primary Vs Secondary data –
Methods of primary data collection – Survey Vs
Observation – Experiments – Construction of
questionnaire and instrument – Validation of
questionnaire – Sampling plan – Sample size –
determinants of optimal sample size – sampling
techniques – Probability Vs Non–probability sampling
methods.
TYPES OF DATA
Primary Data Secondary Data
PRIMARY DATA
Primary data are those which are
collected afresh and for the first time
and thus happen to be original in
character.
SECONDARY DATA
Secondary data are those which have
already been collected by someone
else and which have already been
passed through the statistical process.
METHODS OF PRIMARY DATA
COLLECTION
In experimental research, primary data are
collected in the course of conducting
experiments.
In case of descriptive research the
primary data can be obtained through
observation or through direct
communication with respondents.
Experiment
An experiment refers to an investigation in
which a factor or variable under test is
isolated and its effect(s) measured.
In an experiment the investigator measures
the effects of an experiment which he
conducts intentionally.
There are several methods of collecting
primary data, particularly in surveys and
descriptive research. Important ones are:
(i) Observation method,
(ii) Survey method
(a) Interview method,
(b) Questionnaire method,
(c) Schedule method
Observation Method
Observation involves recording the
behavioral patterns of people, objects,
and events in a systematic manner to
obtain information about the phenomenon
of interest.
Under the observation method, the
information is sought by way of the
investigator’s own direct observation
without asking the respondent.
For instance, in a study of consumer
behaviour, the investigator, instead of
asking which brand of wrist watch the
respondent uses, may simply look at the
watch.
While using this method, the researcher
should keep in mind things like:
What should be observed?
How should the observations be recorded?
How can the accuracy of observation be
ensured?
Methods of observation
1. Structured and unstructured
observation:
Structured observation is used when the
research problem has been formulated
precisely and the researcher is told what
to observe in the area of study.
Unstructured observation implies that
the researchers are free to observe
whatever they feel is relevant and
judicious.
2. Disguised observation:
The subjects or informants do not know
that they are being observed.
This is preferred as it is feared that people
may behave differently when they know
somebody is observing them.
3. Observation under natural setting and
laboratory setting:
Observations in field studies are in their
natural setting and are studied in extremely
realistic conditions.
Observational studies in laboratory setting
enable prompt collection of data.
4. Direct and indirect observation:
In direct observation, the behavior of a
person is observed as it occurs.
Indirect observation implies that some
record of past behavior is observed.
5. Human and mechanical observation:
Most marketing research is based on
human observation.
In some cases, mechanical devices such
as eye cameras and audiometers are used
for observation.
Surveys
Survey refers to the method of
securing information concerning a
phenomenon under study from all or a
selected number of respondents of the
concerned universe.
TYPES OF SURVEY
1. Consumer survey – to assess a
particular market and the attitude of
consumers
2. Trade survey – interviewing people in the
distribution channels and trade associations
3. Consumer panel – Pre-recruited group of
consumers who have agreed to participate in
market research studies
4. Retail audit – Gets information from the
Interview Method
Interview is a meaningful interaction between
interviewer and interviewee.
Interview is a conversation between two or more
people where questions are asked by the interviewer
to obtain information from the interviewee.
This method can be used through personal interviews
and, if possible, through telephone interviews.
(a) Personal interviews:
The personal interview method requires a person, known
as the interviewer, to ask questions, generally in
face-to-face contact with the other person or persons.
This sort of interview may be in the form of direct
personal investigation or it may be indirect oral
investigation.
(b) Telephone interviews:
This method of collecting information consists of
contacting respondents by telephone.
The chief merits of such a system are:
1. It is flexible in comparison to mailing method.
2. It is faster than other methods i.e., a quick way of
obtaining information.
3. Replies can be recorded without causing
embarrassment to respondents.
Questionnaire
The term questionnaire usually refers to
a self-administered process where the
respondent himself reads the question
and records his answers without the
assistance of the interviewer.
The questionnaire method
This is the simplest and most often used method of
primary data collection.
It has a pre-determined set of questions in a
sequential format.
It is designed to suit the respondent’s understanding
and command of language.
It can be used to collect useful data from a large
population in a short period of time.
Construction of questionnaire
and instrument
• Determine what information is wanted
• Determine the type of questionnaire to use
Personal interview
Mail interview
Telephone interview
• Determine the content of individual
questions
• Determine the type of question to use
Open-ended questions
Closed-ended questions
Dichotomous
Multiple response
Scales
Type of questions
Question content
Open-ended
Closed-ended: dichotomous, multiple responses, scales
Open ended questions:
What is your age?
How would you evaluate the work done by the present government?
How much orange juice does this bottle contain?
What is your reaction to this new soft drink?
Which is your favorite TV serial?
What training programme have you last attended?
With whom in your work group do you interact after office hours?
Closed ended questions
1. Dichotomous questions
Are you diabetic? Yes / No
Have you read the new book by Dan Brown? Yes / No
What kind of petrol do you use in your car? Normal / Premium
What kind of cola do you drink? Normal / Diet
Your working hours in the organization are: Fixed / Flexible
2. Multiple choice questions
How much do you spend on grocery products
(average in one month)?
- Less than Rs. 2500/-
- Between Rs. 2500/- and Rs. 5000/-
- More than Rs. 5000/-
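The question types above can be sketched as a small data structure. This is a minimal illustration, not something taken from the source: the `Question` class, its `kind` property and its `accepts` method are hypothetical names, and scale questions are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """One questionnaire item; an empty option list means open-ended."""
    text: str
    options: list[str] = field(default_factory=list)

    @property
    def kind(self) -> str:
        # Classify by the number of fixed answer options offered.
        if not self.options:
            return "open-ended"
        return "dichotomous" if len(self.options) == 2 else "multiple choice"

    def accepts(self, answer: str) -> bool:
        # Open-ended: any non-blank answer is accepted.
        # Closed-ended: the answer must be one of the listed options.
        if not self.options:
            return bool(answer.strip())
        return answer in self.options


favorite = Question("Which is your favorite TV serial?")
diabetic = Question("Are you diabetic?", ["Yes", "No"])
grocery = Question(
    "How much do you spend on grocery products (average in one month)?",
    ["Less than Rs. 2500", "Rs. 2500 to Rs. 5000", "More than Rs. 5000"],
)

for q in (favorite, diabetic, grocery):
    print(q.kind, "-", q.text)
```

Storing the options with the question makes it easy to reject answers that fall outside a closed-ended list at data-entry time.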
Deciding on wording of questions
Use simple words
Avoid ambiguity in questioning
Decide on question sequence
Basic information
Classification information
Identification information
Decide on length of questionnaire
Pre-test
Final draft
Validation of questionnaire
1. Content validation:
It refers to face validity
Comparing the questionnaire with other similar
questionnaires
2. Sampling validation:
A large sample size can ensure low sampling
errors
3. Empirical validity:
It examines the survey results by comparison with
other studies
SCHEDULES
Schedule is an instrument in research
It contains questions and blank tables which are to be
filled in by the investigators themselves after getting
information from the respondents.
According to Goode and Hatt, “Schedule is that name
usually applied to a set of questions which are asked
and filled in by an interviewer in a face-to-face
situation with another person”.
Criteria for questionnaire design
The spelt out research objectives need to
be converted into specific questions
It must be designed to engage the
respondent and encourage meaningful
response
The questions should be designed in
simple language and be self-explanatory
COLLECTION OF SECONDARY
DATA
Secondary data means data that are already available
i.e., they refer to the data which have already been
collected and analyzed by someone else.
Secondary data may either be published data or
unpublished data. Usually published data are available
in:
(a) various publications of the central, state or local
governments
(b) various publications of foreign governments or of
international bodies and their subsidiary organizations
(c) technical and trade journals
(d) books, magazines and newspapers
(e) reports and publications of various associations
connected with business and industry, banks, stock
exchanges
(f) reports prepared by research scholars,
universities, economists, etc. in different fields and
(g) public records and statistics, historical
documents, and other sources of published
information.
(h) Libraries
(i) Advertising Agencies
SAMPLING PLAN
Sampling Concepts
Population
Population refers to any group of people or objects
that form the subject of study in a particular survey
and are similar in one or more ways. It may be finite
or infinite
Ex:
Population of books in the National Library
Population of Nationalized Banks in India
Sample
It is a subset of the population. It
comprises only some elements of the
population.
Sampling
It is the process of obtaining information
about an entire population by examining only
a part of it.
Element (Sampling unit): An element comprises
a single member of the population.
Sampling frame: Sampling frame comprises all
the elements of a population with proper
identification that is available to us for selection at
any stage of sampling.
Census (or complete enumeration): An
examination of each and every element of the
population is called census or complete
enumeration.
Advantages of Sample over
Census
Sample saves time and cost.
A decision-maker may not have enough time to wait
until all the information is available.
There are situations where a sample is the only option.
The study of a sample instead of complete enumeration
may, at times, produce more reliable results.
A census is appropriate when the population size is small.
Sampling vs Non-Sampling Error
Sampling error: This error arises when a sample is not
representative of the population.
Non-sampling error: This error arises not because a
sample is unrepresentative of the population but because
of other reasons. Some of these reasons are listed below:
Plain lying by the respondent.
The error can arise while transferring the data from the
questionnaire to the spreadsheet on the computer.
There can be errors at the time of coding,
tabulation and computation.
Population of the study is not properly
defined
Respondent may refuse to be part of
the study.
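In contrast to the non-sampling causes listed above, sampling error arises purely from studying a part of the population instead of all of it. The sketch below illustrates this with a synthetic population; the population values, the seed and the function name `sampling_error` are assumptions made only for this illustration.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed only so that the sketch is repeatable

# Synthetic population of 10,000 measurements (illustrative values only).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = mean(population)


def sampling_error(n: int) -> float:
    """Difference between the mean of a random sample of size n and the population mean."""
    sample = rng.sample(population, n)  # simple random sample without replacement
    return mean(sample) - true_mean


# The error varies from draw to draw; it vanishes for a census (n = population size).
for n in (25, 400, len(population)):
    print(n, round(sampling_error(n), 3))
```

Larger samples tend to give smaller errors, and a complete enumeration has no sampling error at all, although it can still suffer from every non-sampling error in the list.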
SAMPLING PLAN
Define the Universe
The target population must be defined precisely.
It involves translating the problem definition into a
precise statement of who should and should not
be included in the sample.
Sample frame
A list containing all sampling units of a population
is known as sampling frame.
It involves a definite location, a boundary
Specifying the sampling units
The sampling unit is the basic unit containing the
elements of the population to be sampled
Eg: city blocks, households, a business
organization
For example, Revlon wanted to assess consumer response
to a new line of lipsticks and wanted to sample
females over 18 years of age. It may be possible to
sample females over 18 directly, in which case a
sampling unit would be the same as an element.
Alternatively, the sampling unit might be households.
In the latter case, households would be sampled and all
females over 18 in each selected household would be
interviewed.
Selection of sample design
Probability Sampling Design - In a probability
sampling design, each and every element of the
population has a known chance of being selected in
the sample.
Non-probability Sampling - the elements of the
population do not have any known chance of
being selected in the sample.
Determination of sample size
The sample size has a direct
relationship with the degree of accuracy desired
in the investigation.
Select the sample
Execute the actual sampling process.
It is the actual selection of the sample
elements.
SAMPLE SIZE
The sample size of a statistical sample is the number
of observations that constitute it.
Determination of Sample Size
The size of the population has little influence on the required
size of the sample, provided the population is large relative
to the sample
Methods of determining the sample size in practice:
Researchers may arbitrarily decide the size of the sample
without giving any explicit consideration to the accuracy of
the sample results or the cost of sampling.
The sample size may be fixed by the total budget allocated
for the field survey in the project proposal.
Researchers may decide on the sample size based on
what was done by other researchers in similar
studies.
Confidence interval approach for determining the size
of the sample
The following points are taken into account for
determining the sample size in this approach.
The variability of the population: The higher the variability, as
measured by the population standard deviation, the larger
the sample size will be.
The confidence attached to the estimate: The higher
the confidence the researcher wants for the
estimate, the larger the sample size will be.
The allowable error or margin of error: The greater
the precision the researcher seeks, the larger
the sample size will be.
Sample size for estimating population mean -
The formula for determining sample size is given
as:
$$ n = \frac{Z^{2}\sigma^{2}}{e^{2}} $$
Where
n = Sample size
σ = Population standard deviation
e = Margin of error
Z = The standard normal value for the given confidence level
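A minimal sketch of this calculation, assuming the standard confidence interval formula n = Z²σ²/e² with the result rounded up to the next whole observation. The function name and the example figures (σ = 15, e = 2, 95% confidence) are illustrative assumptions, not values from the source.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """Sample size for estimating a mean, assuming n = Z^2 * sigma^2 / e^2."""
    if sigma <= 0 or margin <= 0 or not 0 < confidence < 1:
        raise ValueError("sigma and margin must be positive; confidence must lie in (0, 1)")
    # Two-sided critical value of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up so that the achieved margin of error does not exceed the target.
    return math.ceil((z * sigma / margin) ** 2)


n = sample_size_for_mean(sigma=15, margin=2, confidence=0.95)
print(n)  # 217
```

Doubling σ or halving e roughly quadruples n, which matches the three points listed for the confidence interval approach.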
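Sample size for estimating population proportion –
1. When population proportion p is known
$$ n = \frac{Z^{2}\,p\,(1-p)}{e^{2}} $$
2. When population proportion p is not known
$$ n = \frac{Z^{2}}{4e^{2}} $$
(taking p = 0.5, the value that maximizes p(1 − p))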
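Both cases can be sketched with one function, assuming the standard formula n = Z²p(1 − p)/e² and the conservative choice p = 0.5 when p is not known. The function name and the example inputs (a 5 percentage point margin, p = 0.2) are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=None):
    """Sample size for estimating a proportion, assuming n = Z^2 * p * (1 - p) / e^2."""
    if p is None:
        p = 0.5  # unknown p: use the value that maximizes p * (1 - p)
    if not 0 < p < 1 or not 0 < margin < 1 or not 0 < confidence < 1:
        raise ValueError("p, margin and confidence must all lie in (0, 1)")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


n_unknown = sample_size_for_proportion(margin=0.05)        # p not known
n_known = sample_size_for_proportion(margin=0.05, p=0.2)   # p known
print(n_unknown, n_known)  # 385 246
```

Using p = 0.5 gives the largest sample size for a given margin, so it is a safe choice when no prior estimate of p is available.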
Sampling Design
Probability Sampling Design - Probability sampling
designs are used in conclusive research. In a probability
sampling design, each and every element of the
population has a known chance of being selected in the
sample.
Types of Probability Sampling Design
1. Simple random sampling
Every element in the population has a known and
equal chance of being selected.
Ex: there are 1000 elements in the population, and
we need a sample of 100.
Suppose we were to drop pieces of paper in a hat,
each bearing the name of one of the elements, and
draw 100 of those from the hat with our eyes
closed.
We know that the first piece drawn will have a 1/1000
chance of being drawn, the next one a 1/999 chance
of being drawn and so on.
This sampling design has the least bias and offers the most
generalizability.
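The hat draw can be sketched with Python's standard `random.sample`, which selects without replacement. The element IDs 1 to 1000 and the fixed seed are assumptions made only so that the example is concrete and repeatable.

```python
import random

population = list(range(1, 1001))  # element IDs 1..1000 (illustrative)
rng = random.Random(42)            # fixed seed only for repeatability

# Draw 100 distinct elements; every element is equally likely to be included.
sample = rng.sample(population, k=100)

print(len(sample), len(set(sample)))  # 100 100
```

Although the chance on each successive draw changes (1/1000, then 1/999, and so on), every element has the same overall chance of ending up in the sample, namely 100/1000.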
2. Systematic (methodical, orderly) sampling
It involves drawing every ‘n’th element in the
population starting with a randomly chosen element
between 1 and n.
Ex: if we want a sample of 35 households from a
total population of 260 houses in a particular
locality, then we could sample every 7th house
starting from a random number from 1 to 7. Let us
say that the random number is 7; then the houses
numbered 7, 14, 21, 28, and so on would be sampled
until 35 houses have been selected.
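A minimal sketch of the household example, assuming the sampling interval is the whole-number part of population size divided by sample size (260 // 35 = 7) and that selection stops once 35 houses are reached. The function name is hypothetical, and stopping early is one convention among several.

```python
import random


def systematic_sample(units, sample_size, rng=None):
    """Take every k-th unit after a random start between 1 and k (1-based)."""
    if not 0 < sample_size <= len(units):
        raise ValueError("sample_size must be between 1 and the number of units")
    rng = rng or random.Random()
    interval = len(units) // sample_size   # k, e.g. 260 // 35 == 7
    start = rng.randint(1, interval)       # random start, inclusive of both ends
    picked = units[start - 1::interval]    # positions start, start + k, start + 2k, ...
    return picked[:sample_size]            # stop once the target size is reached


houses = list(range(1, 261))               # houses numbered 1..260
sample = systematic_sample(houses, 35, random.Random(0))
print(sample[:4], len(sample))
```

With this convention the houses near the end of the list can never be selected, which is worth keeping in mind if the list order is related to the variable being studied.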
3. Stratified random sampling
There may be identifiable subgroups of elements within
the population that may be expected to have different
parameters on a variable of interest to the researcher.
Ex: For an HRM director interested in assessing the
extent of training that the employees in the system
feel they need, the entire organization will form the
population for the study.
But the extent, quality, and intensity of training
desired by middle level managers, lower-level
managers, first line supervisors, computer analysts,
clerical workers and so on will be different for each
group.
Knowledge of the kinds of differences in needs that
exist for the different groups will help the director to
develop useful and meaningful training programs for
each group in the organization.
Stratified sampling, as the name implies, involves a
process of stratification or segregation, followed by
random selection of subjects from each stratum.
The population is first divided into mutually exclusive
groups that are relevant, appropriate and meaningful in
the context of the study.
Some examples: studying customers on the basis of
life stages, income levels and the like to study buying
pattern and stratifying companies according to size,
industry, and profit and so forth to study stock market
reactions.
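Stratification followed by random selection within each stratum can be sketched as below. The sketch assumes proportionate allocation (the same sampling fraction in every stratum); the employee counts, the 10% fraction and the function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, rng=None):
    """Group units into strata, then draw a simple random sample from each stratum."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # mutually exclusive groups
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical organization: (job level, employee number) pairs.
employees = (
    [("middle manager", i) for i in range(30)]
    + [("supervisor", i) for i in range(70)]
    + [("clerical", i) for i in range(200)]
)
sample = stratified_sample(employees, stratum_of=lambda e: e[0],
                           fraction=0.1, rng=random.Random(1))
print({name: len(chosen) for name, chosen in sample.items()})
# {'middle manager': 3, 'supervisor': 7, 'clerical': 20}
```

Every job level is guaranteed to appear in the sample, which a simple random sample of 30 employees would not ensure.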
4. Cluster sampling
Ad hoc organizational committees drawn from various
departments to offer inputs to the company president to
enable him to make decisions on product development,
budget allocations, marketing strategies and the like
are good examples of clusters.
Cluster samples offer more heterogeneity within groups
and more homogeneity among groups.
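A minimal sketch of one-stage cluster sampling, assuming whole clusters are drawn at random and every member of a drawn cluster is then studied. The committee names and member labels are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, rng=None):
    """Randomly choose whole clusters and keep all members of each chosen cluster."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names, not members
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical committees, each drawing members from several departments.
committees = {
    "product development": ["A1", "A2", "A3"],
    "budget allocation": ["B1", "B2"],
    "marketing strategy": ["C1", "C2", "C3", "C4"],
    "quality review": ["D1", "D2", "D3"],
}
sample = cluster_sample(committees, n_clusters=2, rng=random.Random(3))
print(sample)
```

The random draw happens at the level of the cluster, so only a list of clusters is needed as the sampling frame, not a list of every individual element.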
Non-probability Sampling Designs - In case
of non-probability sampling design, the
elements of the population do not have any
known chance of being selected in the
sample.
Types of Non-Probability Sampling Design
1. Convenience sampling
It refers to the collection of information from
members of the population who are conveniently
available to provide it.
Ex: Pepsi Challenge contest was administered on a
convenience sampling basis.
Such a contest, with the purpose of determining
whether people prefer one product to another product,
might be held at a shopping mall visited by many
shoppers. Those inclined to take the test might form the
sample for the study of how many people prefer Pepsi
over Coke. Such a sample is a convenience sample.
2. Purposive sampling
Instead of obtaining information from those who are readily
or conveniently available, it might sometimes become
necessary to obtain information from specific target
groups.
The sampling here is confined to specific types of
people who can provide the desired information.
This type of sampling design is called purposive
sampling.
Purposive Sampling is of two types:
Judgment sampling
Judgment sampling involves the choice of
subjects who are most advantageously placed or in the best
position to provide the required information.
For instance, if a researcher wants to find out what it
takes for women managers to make it to the top,
the only people who can give first hand information
are the women who have risen to the positions of
presidents, vice presidents and important top
level executives in work organizations.
Quota sampling
Quota sampling is the second type of purposive sampling.
Generally, the quota fixed for each subgroup is based on
the total number of members in each group in the
population.
For instance, it may be surmised that the work attitude of
blue-collar workers in an organization is quite different
from that of white-collar workers.
If there are 60% blue-collar workers and
40% white-collar workers in this
organization, and if a total of 30 people are
to be interviewed to find the answer to the
research question, then a quota of 18
blue-collar workers and 12 white-collar workers
will form the sample, because these numbers
represent 60% and 40% of the sample size.
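The quota arithmetic can be sketched as a proportional split of the total number of interviews. The function name is hypothetical, and handing any leftover interviews to the groups with the largest remainders is an assumption of this sketch, needed only when the percentages do not divide evenly.

```python
def quota_allocation(percent_shares, total):
    """Split `total` interviews across subgroups in proportion to their population shares."""
    if sum(percent_shares.values()) != 100:
        raise ValueError("percentage shares must add up to 100")
    quotas, remainders = {}, {}
    for group, pct in percent_shares.items():
        # Integer arithmetic: whole interviews plus the remainder out of 100.
        quotas[group], remainders[group] = divmod(pct * total, 100)
    leftover = total - sum(quotas.values())
    # Give any leftover interviews to the groups with the largest remainders.
    for group in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[group] += 1
    return quotas


quotas = quota_allocation({"blue-collar": 60, "white-collar": 40}, total=30)
print(quotas)  # {'blue-collar': 18, 'white-collar': 12}
```

The function only fixes how many people to interview in each subgroup; in quota sampling the individuals who fill each quota are then chosen non-randomly.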
3. Snowball sampling
Snowball sampling or chain sampling is a non-probability
sampling technique where existing study subjects recruit
future subjects from among their acquaintances.
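The referral chain can be sketched as a breadth-first walk over an acquaintance network, starting from a few seed subjects. The names, the `referrals` mapping and the function name are hypothetical; real studies decide for themselves how many referrals to follow and when to stop.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size):
    """Recruit subjects by following referrals from the seed subjects, wave by wave."""
    queue = deque(dict.fromkeys(seeds))  # de-duplicated seeds, original order kept
    seen = set(queue)
    sample = []
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:      # each person is recruited at most once
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical acquaintance network: who each subject would refer.
referrals = {
    "Asha": ["Ben", "Chen"],
    "Ben": ["Dev"],
    "Chen": ["Asha", "Eli"],
    "Dev": [],
    "Eli": ["Fay"],
}
sample = snowball_sample(["Asha"], referrals, max_size=5)
print(sample)  # ['Asha', 'Ben', 'Chen', 'Dev', 'Eli']
```

Because recruitment depends entirely on who knows whom, the chance of any given person being included is unknown, which is why this is a non-probability design.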
END OF CHAPTER