Fink - 2010 - Survey Research Methods
A Fink, University of California, Los Angeles, Los Angeles, CA, USA; The Langley Research Institute, Pacific
Palisades, CA, USA
© 2010 Elsevier Ltd. All rights reserved.
Surveys are a prominent part of life in many major industrialized nations, particularly the United States. US elections are always accompanied by a flood of polls. The US Census, which is administered to the entire nation every 10 years, is a survey. However, most people encounter surveys more frequently in their daily lives in medical, educational, and social settings.

Survey Objectives

A survey's objectives are general statements of the survey's outcomes and provide the direction for selecting questions (Case study 1).

When planning a survey, the survey researcher must define all potentially imprecise or ambiguous terms in the objectives (Fink, 2004a). For the objectives in Case study 1, the imprecise terms are needs, educational services, characteristics, and benefits. No standard definition exists for any of them. What are needs, for example, and of the very long list that the survey researcher can create, which are so important that they should be included on the survey? Definitions can come from the literature and from consultation with knowledgeable individuals.

Survey objectives may be independent of an existing study or related to it. For instance, suppose a school district decides to investigate the causes of a measurable increase in smoking among students between 12 and 16 years of age. The district could then ask the survey research department to design and implement a survey of students who smoke to find out why they do. On the other hand, the school district may be part of a study to prevent smoking in students, and a component of that study may be a survey of students' smoking habits.

A survey's objectives can be derived from reviews of the literature and other surveys. The literature refers to all published and unpublished public reports on a topic. Systematic reviews of the literature describe current knowledge and reveal gaps. For instance, a review of the literature can provide information on best practices in teaching reading but may not provide sufficient information about the ease of implementing such programs in classrooms. A survey of reading teachers may provide data on implementation.

Survey objectives can also come from experts. Experts are individuals who are knowledgeable about a topic, will be affected by the survey's outcomes, or are influential in implementing its findings. Experts can be asked about objectives by mail or telephone or brought together in meetings. Two types of meetings that are sometimes used to help survey researchers identify objectives, research questions, and research hypotheses are focus groups (Krueger and Casey, 2000) and consensus panels (Jones, 1995).

Case study 1
Illustrative objectives for a survey of vocational educational needs

Objective 1: Identify the most common needs for educational services for three professions: teaching, nursing, and computer programming.
Sample question: How proficient would you say you are in performing the following job-related activities?

Objective 2: Compare the educational needs of men and women.
Question: Are you male or female?

Objective 3: After participation in a vocational education program, identify the characteristics of participants who receive the most benefits.
Questions about the characteristics of survey participants: What is your occupation? What was your household income last year?
Questions about benefits: To what extent has this program helped you improve your job skills? How long did you wait to get a job in your preferred profession?

Straightforward Questions and Responses

Survey questions take two primary forms. When they require respondents to use their own words, they are called open ended. When the answers or responses are preselected for the respondent, the question is termed closed or forced choice. Both types of questions have advantages and limitations.

An open question is useful when the intricacies of an issue are still unknown, for eliciting unanticipated answers, and for describing the world as the respondent sees it – rather than as the questioner does. Some respondents also prefer to state their views in their own words. Sometimes, when left to their own devices, respondents provide quotable material. The disadvantage is that unless you are a trained anthropologist or qualitative researcher, responses to open questions are often difficult to compare and interpret.

Some respondents prefer closed questions because they are either unwilling or unable to express themselves while being surveyed. Closed questions are more difficult to write than open ones because the answers or response choices must be known in advance; however, the results lend themselves more readily to statistical analysis and interpretation, which is particularly important in large surveys because of the number of responses and respondents. Moreover, as the respondent's expectations are more clearly spelled out in closed questions (or the survey researcher's interpretations of them), the answers have a better chance of being reliable or consistent over time.
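The point that closed-question results lend themselves readily to statistical analysis can be made concrete with a short sketch. The question, response choices, and responses below are hypothetical illustrations, not data from the article:

```python
from collections import Counter

# A closed (forced-choice) question: the response choices are preselected.
# Choices and responses here are made up for illustration.
choices = ["very satisfied", "satisfied", "dissatisfied"]
responses = ["satisfied", "very satisfied", "satisfied",
             "dissatisfied", "satisfied", "very satisfied"]

# Because every answer is one of a small set of known choices,
# tallying and percentaging are mechanical.
tally = Counter(responses)
percentages = {c: 100 * tally[c] / len(responses) for c in choices}
print(percentages)
```

An open question, by contrast, would yield six free-text answers that must be read, coded, and interpreted before any such tally is possible.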
Education Research Methodology: Quantitative Methods and Research
With probability sampling, the sample is designed to be representative of the study or target population, and every member of the target population has a known, nonzero probability of being included in the sample. Probability sampling implies the use of random selection. Random sampling eliminates subjectivity in choosing a sample; it is a fair way of getting a sample.

The second type of sampling is nonprobability or convenience sampling. Nonprobability samples are chosen based on judgment regarding the characteristics of the target population and the needs of the survey. With nonprobability sampling, some members of the eligible target population have a chance of being chosen and others do not. By chance, the survey's findings may not be applicable to the target group at all. True probability sampling is extraordinarily complex. Nevertheless, many survey researchers aim to come as close to a probability sample as possible. One approach is simple random sampling.

In simple random sampling, every subject or unit has an equal chance of being selected. Members of the target population are selected one at a time and independently. Once they have been selected, they are not eligible for a second chance and are not returned to the pool. Because of this equality of opportunity, random samples are considered relatively unbiased. Typical ways of selecting a simple random sample are using a table of random numbers or a computer-generated list of random numbers and applying it to lists of prospective participants.

The advantage of simple random sampling is that the survey researcher can get an unbiased sample without much technical difficulty. Unfortunately, random sampling may not pick up all the elements of interest in a population. Suppose a researcher is conducting a survey of teacher satisfaction. Consider also that the researcher has evidence from a previous study that older and younger teachers usually differ substantially in their satisfaction. If the researcher chooses a simple random sample for the new survey, he or she might not pick up a large enough proportion of younger teachers to detect any differences that matter. In fact, with really bad luck, by chance, the entire sample may consist of older teachers. To be sure that the sample consists of adequate proportions of people with certain characteristics, the researcher can use stratified random sampling.

A stratified random sample is one in which the population is divided into subgroups or strata, and a random sample is then selected from each subgroup. For example, suppose a survey researcher wants to determine the effectiveness of a program to care for the health of homeless families. The researcher plans to survey a sample of 1800 of the 3000 family members who have participated in the program. The researcher also intends to divide the family members into groups according to their general health status (as indicated by scores on a 32-item test) and age. Health status and age are the strata. The strata or subgroups are chosen because evidence is available that they are related to the outcome – in this case, care for the health needs of homeless families. The justification for the selection of the strata can come from the literature and expert opinion.

Stratified random sampling is more complicated than simple random sampling. The strata must be identified and justified, and using many subgroups can lead to large, unwieldy, and expensive surveys.

Systematic sampling is another sampling method used by survey researchers. Suppose a researcher has a list with the names of 3000 students, from which a sample of 500 is to be selected. Dividing 3000 by 500 yields 6, which means that one of every six persons will be in the sample. To sample systematically from the list, a random start is needed. To obtain this, a die can be tossed. Suppose a toss comes up with the number 5. This means the fifth name on the list would be selected first, then the 11th, 17th, 23rd, and so on until 500 names are selected.

To obtain a valid sample, the researcher must obtain a list of all eligible participants or members of the population. This is called the sampling frame. Systematic sampling should not be used if repetition is a natural component of the sampling frame. For example, if the frame is a list of names, systematic sampling can result in the loss of names that appear infrequently (e.g., names beginning with X). If the data are arranged by months and the interval is 12, the same months will be selected for each year. Infrequently appearing names and ordered data (January is always month 1 and December month 12) prevent each sampling unit (names or months) from having an equal chance of selection. If systematic sampling is used without the guarantee that all units have an equal chance of selection, the resultant sample will not be a probability sample. When the sampling frame has no inherently recurring order, or you can reorder the list or adjust the sampling intervals, systematic sampling resembles simple random sampling.

A cluster is a naturally occurring unit (e.g., a school, which has many classrooms, students, and teachers). Other clusters are universities, hospitals, cities, states, and so on. The clusters are randomly selected (called cluster sampling), and all members of the selected cluster are included in the sample. For example, suppose that California's counties are trying out a new program to improve physical education for teens. A researcher who wanted to use cluster sampling could consider each county as a cluster and select and assign counties at random to the new physical education program or to the traditional one. The programs in the selected counties would then be the focus of the survey.

Cluster sampling is used in large surveys. It differs from stratified sampling in that with cluster sampling you start with a naturally occurring constituency. The researcher then selects from among the clusters and either surveys all members of the selection or randomly selects from among them. The resulting sample may not be representative of areas not covered by the cluster, nor does one cluster necessarily represent another.
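The selection mechanics of simple random, systematic, and stratified sampling can be sketched in a few lines of Python. The frame of 3000 student names, the even/odd strata, and the fixed seed are illustrative assumptions; only the systematic-sampling numbers (interval 6, die toss of 5, picks 5, 11, 17, ...) come from the example above:

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

# A hypothetical sampling frame: a list of 3000 student names.
frame = [f"student_{i}" for i in range(1, 3001)]

# Simple random sampling: every unit has an equal chance, and
# selected units are not returned to the pool.
simple = random.sample(frame, 500)

# Systematic sampling with the numbers from the text: 3000 / 500 = 6,
# so every sixth name is taken. A die toss supplies the random start;
# a toss of 5 selects the 5th, 11th, 17th, ... names.
interval = len(frame) // 500   # 6
start = 5                      # the die toss
systematic = frame[start - 1::interval][:500]

# Stratified random sampling: divide the frame into strata (here, an
# arbitrary even/odd split stands in for real strata such as age) and
# draw a random sample from each stratum.
strata = {"stratum_a": frame[0::2], "stratum_b": frame[1::2]}
stratified = [name for group in strata.values()
              for name in random.sample(group, 250)]
```

Note that cluster sampling would differ only in the unit drawn: one would `random.sample` from a list of clusters (schools, counties) and then keep every member of the chosen clusters.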
Nonprobability samples do not guarantee that all eligible units have an equal chance of being included in a sample. Their main advantage is that they are relatively convenient, economical, and appropriate for many surveys. Their main disadvantage is that they are vulnerable to selection biases. Convenience sampling is a type of nonprobability sampling in which the researcher surveys individuals who are ready and available. For example, a survey that relies on people in a shopping mall is using a convenience sample.

Snowball sampling relies on previously identified members of a group to identify other members of the population. As newly identified members name others, the sample snowballs. This technique is used when a population listing is unavailable and cannot be compiled. For example, teenage gang members and undocumented workers might be asked to participate in snowball sampling because no membership list is available.

Quota sampling divides the population being studied into subgroups, such as male and female and younger and older. You then estimate the proportion of people in each subgroup (e.g., younger and older males and younger and older females) and survey respondents until each subgroup's quota is filled.

Sample Size

The size of the sample refers to the number of units that will be surveyed to get precise and reliable findings. The units can be people (e.g., boys and girls over and under 16 years of age), places (e.g., counties, hospitals, and schools), or things (e.g., medical or school records). The number of needed units is influenced by a number of factors, including the purpose of the study, population size, the risk of selecting a bad sample, and the allowable sampling error (Kraemer and Thiemann, 1987; Cohen, 1988). When you increase the sample size, you increase the survey's cost. Larger samples mean increased costs to provide participants with financial or other incentives and to follow up with nonresponders.

The most appropriate way to produce the right sample size is to use statistical calculations. These can be relatively complex, depending on the needs of the survey. Some surveys have just one sample, and others have several. Formulas for calculating survey samples can be found on the Internet; type in the words "sample size calculator."

Response Rate

All surveys hope for a high response rate. The response rate is the number that responds (numerator) divided by the number of eligible respondents (denominator). Practically all surveys are accompanied by a loss of information because of nonresponse. These nonresponses may introduce error into the survey's results because of differences between respondents and nonrespondents.

Reliable and Valid Survey Instruments

A reliable survey instrument is consistent; a valid one is correct (Litwin, 2004). For example, an instrument is reliable if each time you use it (and assuming no intervention), you get the same information. But it may not be correct! Reliability, or the consistency of information, can be seriously imperiled by poorly worded and imprecise questions and directions. If an instrument is unreliable, it is also invalid because inconsistent data are incorrect. Valid survey instruments serve the purpose they were intended to serve and provide correct information. For example, if a survey's objective is to find out about mental health, the results should be consistent with other measures of mental health and inconsistent with measures of mental instability. Valid instruments are always reliable, too.

Reliability

A reliable survey instrument is one that is relatively free from measurement error. This error causes individuals to obtain scores that are different from their true scores, which can only be obtained from perfect measures. What causes this error? In some cases, the error results from the measure itself – it may be difficult to understand or poorly administered. For example, a self-administered questionnaire on the value of preventive healthcare might produce unreliable results if its reading level is too high for the teen mothers who are to use it. If the reading level is on target but the directions are unclear, the measure will be unreliable. Of course, the survey researcher can simplify the language and clarify the directions and still find measurement error. This is because measurement error can also come directly from the examinees. For example, if teen mothers are asked to complete a questionnaire and they are especially anxious or fatigued, their obtained scores could differ from their true scores.

Four kinds of reliability are often discussed: stability, internal consistency, and inter- and intrarater reliability.
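As a concrete illustration of the kind of calculation a "sample size calculator" performs, the sketch below uses one commonly used formula (Cochran's approximation for estimating a proportion, with a finite-population correction); it also computes a response rate as defined above. The 95% confidence level, 5% margin of error, population of 3000, and 1800 responders are illustrative assumptions, not prescriptions from the article:

```python
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """One common sample-size formula: Cochran's approximation for a
    proportion, followed by a finite-population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)        # finite-population correction
    return math.ceil(n)

def response_rate(responders, eligible):
    """Response rate: number responding (numerator) divided by the
    number of eligible respondents (denominator)."""
    return responders / eligible

needed = sample_size(3000)         # 95% confidence, +/-5% sampling error
rate = response_rate(1800, 3000)   # e.g., 1800 completed of 3000 eligible
print(needed, rate)                # 341 0.6
```

Larger allowable error or a smaller population shrinks the required sample, which is why the purpose of the study and the tolerable risk must be settled before the calculation is run.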
Test–Retest Reliability or Stability

A measure is stable if the correlation between scores from one time to another is high. Suppose a survey of students' attitudes was administered to the same group of students at School A in April and again in October. If the survey was reliable and no special program or intervention was introduced, then, on average, we would expect attitudes to remain the same. The major conceptual difficulty in establishing test–retest reliability is in determining how much time is permissible between the first and second administration. If too much time elapses, external events might influence responses for the second administration; if too little time passes, the respondents may remember and simply repeat their answers from the first administration.

When testing alternate-form reliability, the different forms may be administered at separate time points to the same population. Alternatively, if the sample is large enough, it can be divided in half and each alternate form administered to half the group. This technique, called the split-halves method, is generally accepted as being as good as administering the different forms to the same sample at different time points. When using the split-halves method, you must make sure to select the half-samples randomly.

Internal Consistency or Homogeneity

A survey's internal consistency or homogeneity refers to the extent to which all the items or questions assess the same skill, characteristic, or quality. Cronbach's coefficient alpha, the average of all the correlations between each item and the total score, is often calculated to determine the extent of homogeneity. For example, suppose a survey researcher created a questionnaire to find out about students' satisfaction with Textbook A. An analysis of homogeneity will tell the extent to which all items on the questionnaire focus on satisfaction.

Equivalence or alternate-form reliability is a type of internal consistency. If two items measure the same concepts at the same level of difficulty, they are equivalent. Suppose students were asked a question about their views toward technology before participating in a new computer skills class and again 2 months after completing it. Unless the survey researcher was certain that the items on the two surveys were equal, more favorable views on technology after the second administration could reflect the survey's language level, for example, rather than improved views.

Some variables do not have a single dimension. Student satisfaction, for example, may consist of satisfaction with school in general, their school in particular, teachers, classes, extracurricular activities, and so on. If you are unsure of the number of dimensions expressed in an instrument, a factor analysis can be performed. This statistical procedure identifies factors or relationships among the items or questions.

Inter- and Intrarater Reliability

Interrater reliability refers to the extent to which two or more individuals agree. Suppose two individuals were sent to a clinic to observe waiting times, the appearance of the waiting and examination rooms, and the general atmosphere. If the observers agreed perfectly on all items, then interrater reliability would be perfect. Interrater reliability is enhanced by training data collectors, providing them with a guide for recording their observations, monitoring the quality of the data collection over time to see that people are not burning out, and offering a chance to discuss difficult issues or problems. Intrarater reliability refers to a single individual's consistency of measurement, and this, too, can be enhanced by training, monitoring, and continuous education.

Validity

Validity refers to the degree to which a survey instrument assesses what it purports to measure. For example, a survey of student attitudes toward technological careers would be an invalid measure of attitudes if it only asked about students' knowledge of the newest advances in space technology. Similarly, an attitude survey will not be considered valid unless you can demonstrate that people who are identified as having a good attitude on the basis of their responses to the survey are different in some observable way from people who are identified as dissatisfied.

Four types of validity are often discussed: content, face, criterion, and construct.

Content Validity

Content validity refers to the extent to which a measure thoroughly and appropriately assesses the skills or characteristics it is intended to measure. For example, a survey researcher who is interested in developing a measure of mental health has to first define the concept ("What is mental health?" "How is health distinguished from disease?") and then write items that adequately contain all aspects of the definition. Because of the complexity of the task, the literature is often consulted either for a model or for a conceptual framework from which a definition can be derived. It is not uncommon in establishing content validity to see a statement like "We used XYZ cognitive theory to select items on mental health, and we adapted the ABC role model paradigm for questions about social relations."

Face Validity

Face validity refers to how a measure appears on the surface: Does it seem to ask all the needed questions? Does it use the appropriate language and language level to do so? Face validity, unlike content validity, does not rely on established theory for support.
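The two coefficients most often reported for the kinds of reliability discussed above can be sketched as follows. Test–retest stability is summarized with a Pearson correlation between the two administrations; for coefficient alpha the code uses the standard variance-based computational formula. The April/October scores and the three-item data are tiny made-up examples; real analyses would use many more respondents:

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation, used here to summarize test-retest stability."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def cronbach_alpha(items):
    """Cronbach's alpha from item-score columns, via the usual
    variance-based formula: k/(k-1) * (1 - sum(item var)/var(total))."""
    k = len(items)
    totals = [sum(col) for col in zip(*items)]  # each respondent's total score
    item_var = sum(statistics.pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_var / statistics.pvariance(totals))

# Hypothetical data: four students surveyed in April and again in October.
april = [3, 4, 2, 5]
october = [3, 5, 2, 5]
stability = pearson_r(april, october)

# Hypothetical data: three items, each a column of four respondents' scores.
items = [[3, 4, 2, 5], [3, 4, 2, 5], [2, 4, 2, 4]]
alpha = cronbach_alpha(items)
```

A `stability` near 1 would suggest the attitudes held steady between administrations; an `alpha` near 1 would suggest the three items tap the same underlying characteristic.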
Litwin, M. (2004). How to Assess and Interpret Survey Psychometrics. Thousand Oaks, CA: Sage.
Sudman, S. and Bradburn, N. M. (1982). Asking Questions. San Francisco, CA: Jossey-Bass.

Relevant Websites

http://www.bmj.com – BMJ Publishing Group Ltd.
http://www.eric.gov – Education Resources Information Center.
http://nces.ed.gov – The National Center for Education Statistics.
http://scientific.thomson.com – The Thomson Scientific Database.
http://www.census.gov – US Census Bureau.