Chapter 11
Stage 3: Measurement Questions
Copyright 2022 © McGraw Hill LLC. All rights reserved. No reproduction or distribution without the prior written consent of McGraw Hill LLC.
Industry Thought
Leadership
David F. Harris
president, Insight and Measurement
author, The Complete Guide to Writing
Questionnaires: How to Get Better
Information for Better Decisions
Exhibits
Nominal
Classification (mutually exclusive and collectively exhaustive categories), but no order, distance, or natural origin.
• Count (frequency distribution); mode as central tendency; no measure of dispersion.
• Used with other variables to discern patterns, reveal relationships.
Ordinal
Classification and order, but no distance or natural origin.
• Determination of greater or lesser value.
• Count (frequency distribution); median as central tendency; nonparametric statistics.
Interval
Classification, order, and distance (equal intervals), but no natural origin.
• Determination of equality of intervals or differences.
• Count (frequency distribution); mean or median as measure of central tendency; measure of dispersion is standard deviation or interquartile range; parametric tests.
Ratio
Classification, order, distance, and natural origin.
• Determination of equality of ratios.
• Any of the above statistical operations, plus multiplication and division; mean as central tendency; coefficients of variation as measure of dispersion.
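The pairing of scale type and statistic listed above can be sketched in code. This is a minimal illustration, not part of the exhibit: the function name `summarize`, the sample data, and the choice of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, median, mode, pstdev


def summarize(values, scale):
    """Return the summary statistics the exhibit pairs with each scale type.

    `summarize` and its labels are hypothetical names for this sketch.
    """
    if scale == "nominal":
        # Mode as central tendency; no measure of dispersion.
        return {"mode": mode(values)}
    if scale == "ordinal":
        # Median as central tendency.
        return {"median": median(values)}
    if scale == "interval":
        # Mean plus standard deviation (population form, an assumption).
        return {"mean": mean(values), "std_dev": pstdev(values)}
    if scale == "ratio":
        # Division is meaningful, so the coefficient of variation applies.
        m = mean(values)
        return {"mean": m, "coef_variation": pstdev(values) / m}
    raise ValueError(f"unknown scale type: {scale}")


if __name__ == "__main__":
    print(summarize(["red", "blue", "red"], "nominal"))
    print(summarize([1, 2, 3, 4, 5], "ordinal"))
    print(summarize([2, 4, 4, 4, 5, 5, 7, 9], "ratio"))
```

Each branch returns only the statistics named for that scale; a higher scale could also report the lower scales' statistics, since the exhibit notes that ratio data supports any of the earlier operations.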
Validity
Validity is the extent to which a measurement question or scale actually measures what we wish to measure (our investigative questions). If it has external validity, the data generated by the question can be generalized across persons, settings, and times. If it has internal validity, it has content, criterion-related, and construct validity.
• Content validity is the degree to which the measurement instrument provides adequate coverage of the investigative questions.
• Criterion-related validity reflects that the measurement instrument can be used to predict or estimate some property or object.
• Construct validity reflects that the measurement instrument used in the study for the purpose used compares favorably to other instruments purporting to measure the same thing (convergent validity).
Practicality
Practicality means that the measurement instrument can be used economically (within budget), is convenient (easy to administer), and is interpretable (all information necessary to interpret the data is known about the instrument and its process).
Peacock
• Desire to be perceived as smarter, wealthier, happier, or better than others.
• Example: Respondents who claim to shop at Harrods in London (twice as many as those who do).
Pleaser
• Desire to help by providing answers they think the researchers want to hear, to please or avoid offending or being socially stigmatized.
• Example: Respondents give a politically correct or assumed correct answer about the degree to which they revere their elders, respect their spouse, etc.
Gamer
• Adaptation of answers to play the system.
• Example: Participants who fake membership in a specific demographic to participate in a high-remuneration study, claiming that they drive an expensive car when they don’t or that they have cancer when they don’t.
Disengager
• Don’t want to think deeply about a subject.
• Example: Participants who falsify ad recall or purchase behavior (didn’t recall or didn’t buy) when they actually did.
Self-delusionist
• Participants who lie to themselves.
• Example: Respondents who falsify behavior, such as the level at which they recycle.
Unconscious decision maker
• Participants who are dominated by irrational decision making.
• Example: Respondents who cannot predict with any certainty their future behavior.
Ignoramus
• Participant who never knew or doesn’t remember an answer.
• Example: Respondents who provide false information, such as being unable to identify on a map where they live or to remember what they ate for supper the previous evening.
Step 1 Collect a large number of statements that meet the following criteria:
• Each statement is relevant to the attitude being studied.
• Each statement reflects a favorable or unfavorable position on that attitude.
Step 2 Select people similar to study participants (participant stand-ins) to read each statement.
Step 3 Participant stand-ins indicate their level of agreement with each statement, using a 5-point
scale. A scale value of 1 indicates a strongly unfavorable attitude (strongly disagree). A value of 5
indicates a strongly favorable attitude (strongly agree). The other intensities, 2 (disagree), 3
(neither agree nor disagree), and 4 (agree), are mid-range attitudes (see Exhibit 11-3).
• To ensure consistent results, the assigned numerical values are reversed if the statement is
worded negatively. The number 1 is always strongly unfavorable and 5 is always strongly
favorable.
Step 4 Add each participant stand-in’s responses to secure a total score.
Step 5 Array these total scores from highest to lowest; then select some portion (generally defined
as the top and bottom 10 to 25 percent of the distribution) to represent the highest and lowest
total scores.
• The two extreme groups represent people with the most favorable and least favorable
attitudes toward the attitude being studied. These extremes are the two criterion groups by
which individual Likert statements (items) are evaluated.
• Discard the middle group’s scores (50 to 80 percent of participant stand-ins), as they are not
highly discriminatory on the attitude.
Step 6 Calculate the mean scores for each scale item among the low scorers and
high scorers.
Step 7 Test the mean scores for statistical significance by computing a t value for
each statement.
Step 8 Rank order the statements by their t values from highest to lowest.
Step 9 Select 20–25 statements (items) with the highest t values (statistically
significant difference between mean scores) to include in the final question
using the Likert scale.
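Steps 4 through 8 can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not the procedure's official implementation: the slides do not say which t statistic to use, so the unequal-variance (Welch) form is assumed here, and the function names `item_t_values` and `rank_items` and the eight rows of sample responses are hypothetical.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def item_t_values(responses, fraction=0.25):
    """Compute one t value per item from the two criterion groups.

    responses: one list of item scores (1-5, already reverse-scored where
    needed) per participant stand-in. `fraction` is the share of stand-ins
    taken from each extreme (the slides suggest 10 to 25 percent).
    """
    # Steps 4-5: total each stand-in's scores, array from highest to lowest.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(2, int(len(ranked) * fraction))  # need 2+ people for a variance
    high, low = ranked[:k], ranked[-k:]      # middle group is discarded

    t_values = []
    for i in range(len(responses[0])):
        h = [person[i] for person in high]
        l = [person[i] for person in low]
        diff = mean(h) - mean(l)             # Step 6: group means per item
        se = sqrt(variance(h) / k + variance(l) / k)
        if se > 0:
            t_values.append(diff / se)       # Step 7: t value (Welch form)
        else:
            # No spread in either group: t is undefined, so flag it.
            t_values.append(0.0 if diff == 0 else copysign(inf, diff))
    return t_values


def rank_items(t_values):
    """Step 8: item indexes ordered from highest to lowest t value."""
    return sorted(range(len(t_values)), key=lambda i: t_values[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: 8 stand-ins rating 3 statements.
    responses = [
        [5, 5, 3], [5, 4, 3], [4, 4, 2], [3, 3, 3],
        [3, 2, 4], [2, 2, 3], [1, 2, 3], [2, 1, 2],
    ]
    t = item_t_values(responses)
    print(t, rank_items(t))
```

With real data, Step 9 would keep the 20 to 25 items at the front of the list returned by `rank_items`.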
Subcategories of Evaluation
Meek Goodness
• Clean–Dirty
• Kind–Cruel
• Sociable–Unsociable
• Light–Dark
• Altruistic–Egotistical
• Grateful–Ungrateful
• Beautiful–Ugly
• Harmonious–Dissonant
Dynamic Goodness
• Successful–Unsuccessful
• High–Low
• Meaningful–Meaningless
• Important–Unimportant
• Progressive–Regressive
Dependable Goodness
• True–False
• Reputable–Disreputable
• Believing–Skeptical
• Wise–Foolish
• Healthy–Sick
• Clean–Dirty
Hedonistic Goodness
• Pleasurable–Painful
• Beautiful–Ugly
• Sociable–Unsociable
• Meaningful–Meaningless
Exhibit 11-13: Adapting SD Scales for Retail
Store Image Study
Step 1 Select the variable; it is chosen by judgment and reflects the nature of the investigative
question.
Step 2 Identify possible nouns, noun phrases, adjectives, or visual stimuli to represent the variable.
Step 3 Select bipolar word pairs, phrase pairs, or visual pairs appropriate to assess the object or
property. If the traditional Osgood adjectives are used, several criteria guide your selection:
• Choose adjectives that allow connotative perceptions to be expressed.
• Choose three bipolar pairs for each dimension: evaluation, potency, and activity. (Scores on
the individual items can be averaged, by factor, to improve reliability.)
• Choose pairs that will be stable across participants and variables. (One pair that fails this
test is “large–small”; it may describe a property when judging a physical object such as an
automobile but may be used connotatively with abstract concepts such as product quality.)
• Choose pairs that are linear between polar opposites and pass through the origin. (A pair
that fails this test is “rugged–delicate,” which is nonlinear as both adjectives have favorable
meanings.)
Step 4 Create the scoring system and assign a positive value to each point on the scale. (Most SD
scales have 7 points with values of 7, 6, 5, 4, 3, 2, and 1. A “0” point is arbitrary.)
Step 5 Randomly select half the pairs and reverse score them to minimize the halo effect.
Step 6 Order the bipolar pairs so all representing a single dimension (e.g., evaluation) are not together
in the final measurement question.
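The scoring described in Steps 3 through 5 (7-point values, reverse scoring for half the pairs, averaging by factor) can be sketched as follows. This is a minimal illustration under assumptions: the adjective pairs, the `reversed` flags, the ratings, and the function name `factor_scores` are all hypothetical, and a raw value of 7 is assumed to mean the left-hand pole as printed.

```python
from statistics import mean

SCALE_MAX = 7  # most SD scales have 7 points, valued 7 down to 1


def factor_scores(ratings, items):
    """Average item scores by dimension after undoing reverse scoring.

    ratings: {pair: raw value 1-7, where 7 = left-hand pole as printed}
    items:   {pair: (dimension, reversed)}; reversed=True means the
             favorable pole was printed on the right-hand side.
    """
    by_dimension = {}
    for pair, raw in ratings.items():
        dimension, is_reversed = items[pair]
        # A reversed pair maps 1 -> 7, 2 -> 6, ..., 7 -> 1.
        score = SCALE_MAX + 1 - raw if is_reversed else raw
        by_dimension.setdefault(dimension, []).append(score)
    return {dim: mean(scores) for dim, scores in by_dimension.items()}


if __name__ == "__main__":
    # Hypothetical pairs, two per dimension to keep the example short
    # (the slide recommends three per dimension).
    items = {
        "good-bad": ("evaluation", False),
        "unpleasant-pleasant": ("evaluation", True),
        "strong-weak": ("potency", False),
        "light-heavy": ("potency", True),
        "active-passive": ("activity", False),
        "slow-fast": ("activity", True),
    }
    ratings = {
        "good-bad": 6, "unpleasant-pleasant": 2,
        "strong-weak": 5, "light-heavy": 4,
        "active-passive": 3, "slow-fast": 6,
    }
    print(factor_scores(ratings, items))
```

Keeping the `reversed` flag beside each pair lets the questionnaire print the pairs in any order (Step 6) without affecting the scoring.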
Paired-comparison data may be treated in several ways. If there is substantial consistency, we will find that
if A is preferred to B, and B to C, then A will be consistently preferred to C. This condition of transitivity need
not always be true but should occur most of the time. When it does, take the total number of preferences
among the comparisons as the score for that stimulus. Assume a manager is considering five distinct
packaging designs. She would like to know how heavy users would rank these designs. One option would
be to ask a sample of the heavy-users segment to pair-compare the packaging designs. With a rough
comparison of the total preferences for each option, it is apparent that B is the most popular.
Designs
A B C D E
A — 164* 138 50 70
B 36 — 54 14 30
C 62 146 — 32 50
D 150 186 168 — 118
E 130 170 150 82 —
Total 378 666 510 178 268
Rank order 3 1 2 5 4
* Interpret this cell as 164 of 200 customers preferred suggested design B (column) to design A (row).
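The column totals and rank order in the table can be recomputed directly from the cell counts. The sketch below is illustrative only; the helper names `column_totals` and `rank_order` are hypothetical, and the matrix simply transcribes the exhibit, with each cell read as the number of customers who preferred the column design to the row design.

```python
DESIGNS = ["A", "B", "C", "D", "E"]

# Rows and columns follow the exhibit; None marks the diagonal.
PREFS = [
    [None, 164, 138, 50, 70],
    [36, None, 54, 14, 30],
    [62, 146, None, 32, 50],
    [150, 186, 168, None, 118],
    [130, 170, 150, 82, None],
]


def column_totals(matrix):
    """Total preferences received by each design (sum down each column)."""
    size = len(matrix)
    return [
        sum(matrix[i][j] for i in range(size) if matrix[i][j] is not None)
        for j in range(size)
    ]


def rank_order(totals):
    """Rank 1 goes to the largest total, rank 2 to the next, and so on."""
    order = sorted(range(len(totals)), key=lambda j: totals[j], reverse=True)
    ranks = [0] * len(totals)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


if __name__ == "__main__":
    totals = column_totals(PREFS)
    for name, total, rank in zip(DESIGNS, totals, rank_order(totals)):
        print(name, total, rank)
```

A quick consistency check on such data is that each pair of mirrored cells sums to the sample size (200 here), since every customer picks one design in each pairing.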
Exhibit 11-19: Ideal Scalogram Response
Pattern
Case Item 2 Item 4 Item 1 Item 3 Participant Score
1 X X X X 4
2 — X X X 3
3 — — X X 2
4 — — — X 1
5 — — — — 0
* X = agree; — = disagree.
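The ideal pattern above has a simple property: with items ordered as in the exhibit (Item 2, Item 4, Item 1, Item 3), once a participant agrees with one item, they agree with every item to its right. The sketch below checks for that property; the function names `participant_score` and `is_ideal_pattern` are hypothetical, and agreement is encoded as `True` purely for illustration.

```python
def participant_score(pattern):
    """Score = number of items the participant agrees with."""
    return sum(pattern)


def is_ideal_pattern(pattern):
    """True if agreement never switches back to disagreement, left to right.

    pattern: list of booleans in the exhibit's item order
    (Item 2, Item 4, Item 1, Item 3); True = agree (X), False = disagree.
    """
    return all(left <= right for left, right in zip(pattern, pattern[1:]))


if __name__ == "__main__":
    X, N = True, False
    cases = [
        [X, X, X, X],
        [N, X, X, X],
        [N, N, X, X],
        [N, N, N, X],
        [N, N, N, N],
    ]
    for case in cases:
        print(participant_score(case), is_ideal_pattern(case))
    # A pattern that breaks the cumulative ordering:
    print(is_ideal_pattern([X, N, X, X]))
```

In an ideal scalogram the score alone identifies which items were endorsed, which is why a pattern such as agree, disagree, agree, agree does not qualify.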
Intra-store boutiques?
What scale?
Mobile-optimized research
PRE-CIIM: Management Dilemma
Ads crafted: used cultural perceptions of designers.
Solicit ad submissions
Activate survey.
Cultural Pillars
Cultural Acknowledgement
Cultural Respect
Positive Reflections
Cultural Value
Authentic Portrayal
Celebrated Culture:
Cultural Pride
Compare pricing
Visit website
Visit store
PicProfile: Gen Z
34
Free-response
Question Structure
Structured
Communication Skill
Participant Motivation
Types of information
• Willingly shared, conscious-level.
• Reluctantly shared, conscious-level.
• Knowable, limited-conscious-level.
• Subconscious-level.
Measure characteristics
of participants
Use participants as
judges
Response Types
Ranking Questions
Categorization Questions
Sorting Questions
Unidimensional
Multidimensional
Errors:
• Error of Central Tendency.
• Error of Leniency.
• Error of Strictness.
Remedies:
• Adjust strength of descriptive adjectives.
• Space intermediate descriptive phrases farther apart.
• Provide smaller differences in meaning between terms near the ends of the scale.
• Use more scale points.
Errors:
• Primacy Effect.
• Recency Effect.
Remedy:
• Reverse order of alternatives periodically or randomly.
Rating questions
Ranking Questions
Categorization Questions
Sorting Questions
What newspaper do you read most often for financial news?
• East City Gazette.
• West City Tribune.
• Regional newspaper.
• National newspaper.
• Other (specify:_________).
Collect Statements
by Item
Analysis
Rating questions
Ranking Questions
Categorization Questions
Sorting Questions
Paired-comparison
Forced ranking
Comparative
Forced Ranking Question
Rating questions
Ranking Questions
Categorization Questions
Sorting Questions
Between 7-11
Sort cards into piles Structured sort
Unstructured Sort
Question Coverage
Question wording
Question Frame of
Reference
Response
Alternatives
Question Coverage
Question wording
Response Alternatives
Personalization
Shared vocabulary
Leading questions
Double-barreled questions
Unsupported assumptions
Find or Craft Measurement Questions 3
Question Coverage
Question wording
Question Frame of Reference
Response Alternatives
Role
Behavior time frame
Behavior cycle
Behavior frequency
Memory Decay
Find or Craft Measurement Questions 4
Question Coverage
Question wording
Question Frame of Reference
Response Alternatives
Free-response
Structured
90% of responses
Recency Effect
Primacy Effect
Central Tendency
Exhibit 11-20: Summary of Issues Related to
Measurement Questions 4
Participant Research
Surrogates Colleagues
Unanswerable questions
Difficult-to-answer Questions
Key Terms 1