
Measurement of Constructs and Testing Validity & Reliability of Measures

Presented by:
Group 3: 22606, 22616, 22621
Measurement of Constructs

Definition

 Theoretical propositions consist of relationships between abstract constructs.
 These constructs need to be measured accurately, correctly, and in a scientific manner.
 Measurement refers to careful, deliberate observations of the real world in empirical research.
 Examples of constructs: a person’s height, weight, or a firm’s size (easy to measure); creativity or prejudice (hard to measure).
 Now let’s examine the related processes of conceptualization and operationalization for creating such measures of constructs.
Conceptualization

 Conceptualization is the mental process by which fuzzy and imprecise constructs (concepts) and their constituent components are defined in concrete and precise terms.
 For instance, we often use the word “prejudice” and the word conjures a certain
image in our mind; however, we may struggle if we were asked to define exactly
what the term meant. If someone says bad things about other racial groups, is that
racial prejudice? If women earn less than men for the same job, is that gender
prejudice?
 Operationalization refers to the process of developing indicators or items for
measuring constructs.
 For instance, if an unobservable theoretical construct such as socioeconomic status
is defined as the level of family income, it can be operationalized using an
indicator that asks respondents the question: what is your annual family income?
Levels of Measurement (Rating Scales)

| Scale    | Central Tendency                           | Statistics                                     | Transformations                                      |
|----------|--------------------------------------------|------------------------------------------------|------------------------------------------------------|
| Nominal  | Mode                                       | Chi-square                                     | One-to-one (equality)                                |
| Ordinal  | Median                                     | Percentile, nonparametric statistics           | Monotonic increasing (order)                         |
| Interval | Arithmetic mean, range, standard deviation | Correlation, regression, analysis of variance  | Positive linear (affine)                             |
| Ratio    | Geometric mean, harmonic mean              | Coefficient of variation                       | Positive similarities (multiplicative, logarithmic)  |
Generic Types of Measurement Scales

 Nominal
 Ordinal
 Interval
 Ratio
Rating Scales for Social Science Research
 Binary Scales
 Likert Scales
 Semantic Differential Scales
 Guttman Scales
Binary Scales

Have you ever written a letter to a public official?                                Yes  No
Have you ever signed a political petition?                                          Yes  No
Have you ever donated money to a political cause?                                   Yes  No
Have you ever donated money to a candidate running for public office?               Yes  No
Have you ever written a political letter to the editor of a newspaper or magazine?  Yes  No
Have you ever persuaded someone to change his/her voting plans?                     Yes  No

 A six-item binary scale for measuring political activism.
 Binary scales are nominal scales consisting of binary items that assume one of two possible values, such as yes or no, true or false, and so on.
Likert Scale

I feel good about my job                                          1 2 3 4 5
I get along well with others at work                              1 2 3 4 5
I’m proud of my relationship with my supervisor at work           1 2 3 4 5
I can tell that other people at work are glad to have me there    1 2 3 4 5
I can tell that my coworkers respect me                           1 2 3 4 5
I feel like I make a useful contribution at work                  1 2 3 4 5
(1: Strongly Disagree, 5: Strongly Agree)

 A six-item Likert scale for measuring employment self-esteem.
 This scale consists of Likert items: simply worded statements to which respondents indicate their extent of agreement or disagreement on a five- or seven-point scale ranging from “strongly disagree” to “strongly agree”. A scoring sketch follows below.
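As a minimal illustration (not from the slides, with made-up responses), the Python sketch below combines answers to the six items above into a single composite employment self-esteem score per respondent:

```python
# Minimal sketch: scoring a Likert scale (hypothetical responses).
# Items are coded 1 (strongly disagree) to 5 (strongly agree).
import numpy as np

# rows = respondents, columns = the six employment self-esteem items
responses = np.array([
    [4, 5, 3, 4, 4, 5],
    [2, 3, 2, 3, 2, 2],
    [5, 5, 4, 5, 4, 4],
])

# Composite score per respondent = mean of the item responses
composite = responses.mean(axis=1)
print(composite.round(2))  # e.g. [4.17 2.33 4.5 ]
```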
Semantic Differential Scale

Response options: Very much | Somewhat | Neither | Somewhat | Very much,
anchored at each end by polar-opposite adjective pairs:

Good        vs. Bad
Useful      vs. Useless
Caring      vs. Uncaring
Interesting vs. Boring

 A semantic differential scale for measuring attitude toward national health insurance.
 This is a composite (multi-item) scale where respondents are asked to indicate their opinions or feelings toward a single statement using pairs of adjectives framed as polar opposites.
Guttman Scale

Do you mind immigrants being citizens of your country?          Yes  No
Do you mind immigrants living in your own neighborhood?         Yes  No
Would you mind living next door to an immigrant?                Yes  No
Would you mind having an immigrant as your close friend?        Yes  No
Would you mind if someone in your family married an immigrant?  Yes  No

 A five-item Guttman scale for measuring attitude toward immigrants.
 This composite scale uses a series of items arranged in increasing order of intensity of the construct of interest, from least intense to most intense; a sketch of checking this cumulative structure follows below.
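As an illustration (not from the slides; item ordering and coding are assumptions), the Python sketch below checks whether hypothetical yes/no response patterns respect the cumulative ordering a Guttman scale assumes:

```python
# Minimal sketch: checking the cumulative (Guttman) structure of responses.
# Columns are items ordered from least to most intense; 1 = affirmative answer.
import numpy as np

responses = np.array([
    [1, 1, 1, 0, 0],  # consistent: affirms up to a point, then stops
    [1, 1, 0, 0, 0],  # consistent
    [1, 0, 1, 0, 0],  # inconsistent: a 0 followed by a 1
])

def is_cumulative(row):
    # A perfect Guttman pattern is a run of 1s followed by a run of 0s,
    # so the row must never step up from 0 back to 1.
    return not np.any(np.diff(row) > 0)

for row in responses:
    print(row, "cumulative" if is_cumulative(row) else "violates Guttman ordering")
```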
Scaling

 The process of creating the indicators is called scaling.
 Stevens (1946) said, “Scaling is the assignment of objects to numbers according to a rule.”
 Scales can be unidimensional or multidimensional, based on whether the underlying construct is unidimensional (e.g., weight, wind speed, firm size) or multidimensional (e.g., academic aptitude, intelligence).
Unidimensional and Multidimensional Scaling

 A unidimensional scale measures a construct along a single dimension, ranging from high to low. Note that such a scale may include multiple items, but all of the items attempt to measure the same underlying dimension.
 Multidimensional scales employ different items or tests to measure each dimension of the construct separately, and then combine the scores on each dimension into an overall measure of the multidimensional construct.
 Example: academic aptitude can be measured using two separate tests of students’ mathematical and verbal ability, whose scores are then combined into an overall measure of academic aptitude, as sketched below.
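A minimal Python sketch of the aptitude example (hypothetical scores): each dimension is standardized first so that math and verbal scores are on a comparable scale before being averaged.

```python
# Minimal sketch: combining two dimensions into a multidimensional measure.
import numpy as np

math_scores = np.array([620.0, 540.0, 700.0, 580.0])
verbal_scores = np.array([640.0, 520.0, 600.0, 680.0])

def zscore(x):
    # Standardize so both dimensions have mean 0 and standard deviation 1
    return (x - x.mean()) / x.std(ddof=1)

# Overall "academic aptitude" = mean of the standardized dimension scores
aptitude = (zscore(math_scores) + zscore(verbal_scores)) / 2
print(aptitude.round(2))
```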
Indexes and Typologies

 An index is a composite score derived by aggregating measures of multiple constructs (called components) using a set of rules and formulas. It differs from a scale: a scale also aggregates measures, but its items measure dimensions of a single construct, whereas an index combines measures of different constructs.
 Scales and indexes generate ordinal measures of unidimensional constructs. However, researchers sometimes wish to summarize measures of two or more constructs to create a set of categories or types called a typology. Unlike scales or indexes, typologies are multidimensional but include only nominal variables.
Testing Validity and Reliability of Measures
Reliability

 Reliability is the degree to which the measure of a construct is consistent or dependable.
 A test is considered reliable if the results of multiple administrations at different time intervals are the same or compatible.
 Reliability implies consistency but not accuracy.
 Reliability is easier to achieve with quantitative measures than with “observation”, which is a qualitative measurement technique.
Examples

 Guessing a person’s weight versus measuring it with a weighing scale.
 Measuring employee morale by observation versus counting the grievances filed by employees.
 If we score 95% on a test the first time and 96% the next time, our results are reliable. Even if there is a minor difference in the outcomes, as long as it is within the error margin, our results are reliable.
Sources of Unreliability

 The observer’s or researcher’s subjectivity.
 Asking imprecise or ambiguous questions.
 Asking questions about issues that respondents are not very familiar with or do not care about.
Types of Reliability

 Inter-rater reliability (inter-observer reliability)
 Test-retest reliability (stability)
 Parallel forms reliability
 Split-half reliability
 Internal consistency reliability
Inter-rater reliability

 Inter-rater reliability (also called inter-observer reliability) measures the degree of agreement between different people observing or assessing the same thing.
 We use it when data are collected by researchers assigning ratings, scores, or categories to one or more variables, and it can help mitigate observer bias. A computational sketch follows the example below.
For Example
In an observational study where a team of researchers collect data on classroom behavior,
inter-rater reliability is important: all the researchers should agree on how to categorize or
rate different types of behavior.
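As a hedged illustration (not part of the slides), the Python sketch below computes percent agreement and Cohen's kappa, a standard inter-rater statistic, for two hypothetical raters categorizing classroom behavior:

```python
# Minimal sketch: inter-rater reliability via percent agreement and
# Cohen's kappa (hypothetical ratings from two observers).
import numpy as np

rater_a = np.array(["on-task", "off-task", "on-task", "disruptive", "on-task", "off-task"])
rater_b = np.array(["on-task", "off-task", "on-task", "on-task",    "on-task", "off-task"])

# Observed agreement: proportion of cases the raters label identically
p_o = np.mean(rater_a == rater_b)

# Expected chance agreement, from each rater's marginal category proportions
categories = np.union1d(rater_a, rater_b)
p_e = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in categories)

kappa = (p_o - p_e) / (1 - p_e)
print(f"percent agreement = {p_o:.2f}, Cohen's kappa = {kappa:.2f}")
```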
Test-retest reliability

 Test-retest reliability measures the consistency of results when we repeat the same test on the same sample at a different point in time.
 We use it when measuring something that we expect to stay constant in our sample.
For Example:
A test of color blindness for trainee pilot applicants should have high test-retest reliability,
because color blindness is a trait that does not change over time.
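A minimal Python sketch (hypothetical scores) of the usual computation: test-retest reliability is estimated as the correlation between the two administrations.

```python
# Minimal sketch: test-retest reliability as the correlation between
# two administrations of the same test (hypothetical scores).
import numpy as np

time_1 = np.array([88, 92, 75, 81, 95, 70])
time_2 = np.array([90, 91, 78, 80, 93, 72])

r = np.corrcoef(time_1, time_2)[0, 1]  # Pearson correlation
print(f"test-retest reliability r = {r:.2f}")
```

Parallel forms reliability (next slide) is computed the same way, except the two score vectors come from two equivalent versions of the test rather than two administrations of the same one.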
Parallel Forms Reliability

 Parallel forms reliability measures the correlation between two equivalent versions of a test.
 We use it when we have two different assessment tools or sets of questions designed to measure the same thing.
For Example
If the same students take two different versions of a reading comprehension test, they
should get similar results in both tests.
Split-Half Reliability

 In this approach, we randomly split a set of measures into two halves. After administering the entire set to the respondents, we calculate the correlation between the two sets of responses, as sketched below.
 The longer the instrument, the more likely it is that the two halves will be similar (since random errors are minimized as more items are added); hence, this technique tends to systematically overestimate the reliability of longer instruments.
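The Python sketch below (hypothetical 6-item data) illustrates the procedure, including the standard Spearman-Brown correction, which is not mentioned on the slide but is commonly applied to adjust the half-test correlation to the full test length:

```python
# Minimal sketch: split-half reliability with the Spearman-Brown correction
# (hypothetical 6-item responses; rows = respondents).
import numpy as np

rng = np.random.default_rng(0)
items = np.array([
    [4, 5, 4, 5, 4, 5],
    [2, 2, 3, 2, 2, 3],
    [3, 4, 3, 3, 4, 3],
    [5, 4, 5, 5, 4, 4],
    [1, 2, 1, 2, 2, 1],
])

# Randomly split the six items into two halves; sum each half per respondent
order = rng.permutation(items.shape[1])
half_1 = items[:, order[:3]].sum(axis=1)
half_2 = items[:, order[3:]].sum(axis=1)

# Correlate the halves, then correct for full test length (Spearman-Brown)
r_half = np.corrcoef(half_1, half_2)[0, 1]
r_full = 2 * r_half / (1 + r_half)
print(f"split-half r = {r_half:.2f}, corrected reliability = {r_full:.2f}")
```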
Internal Consistency Reliability

 Internal consistency assesses the correlation between multiple items in a test that are
intended to measure the same construct.
 We can calculate internal consistency without repeating the test or involving other
researchers, so it’s a good way of assessing reliability when we only have one data set.
 This reliability can be estimated in terms of average inter-item correlation, average
item-to-total correlation, or more commonly, Cronbach’s alpha.
For Example
 To measure customer satisfaction with an online store, we can create a questionnaire
with a set of statements that respondents must agree or disagree with. Internal
consistency tells us whether the statements are all reliable indicators of customer
satisfaction.
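A minimal sketch of the Cronbach's alpha computation mentioned above, in Python with a hypothetical response matrix for the customer-satisfaction questionnaire:

```python
# Minimal sketch: Cronbach's alpha for a multi-item scale
# (hypothetical responses; rows = respondents, columns = items).
import numpy as np

items = np.array([
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [2, 1, 2, 2],
    [5, 5, 4, 4],
], dtype=float)

k = items.shape[1]                          # number of items
item_vars = items.var(axis=0, ddof=1)       # variance of each item
total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score

# alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```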
Validity

 Validity is the extent to which a test measures what it claims to measure.
 It is vital for a test to be valid in order for the results to be accurately applied and interpreted.
Types of Validity
 Face validity: It is the extent to which the measurement method appears "on
its face" to measure the construct of interest.
 For example, the frequency of one’s attendance at religious services seems
to make sense as an indication of a person’s religiosity without a lot of
explanation. Hence this indicator has face validity.
 Content Validity: It is the extent to which the method covers the entire
range of relevant behaviors, thoughts, and feelings that define the construct
being measured.
 For example, a course exam has good content validity if it covers all the material that students are supposed to learn, and poor content validity if it does not.
 Convergent Validity: It involves assessing the degree of relatedness between two scales that measure similar constructs.
 For example, comparing language tests: calculating the correlation between IELTS test scores and GRE verbal test scores.
 Discriminant Validity: It is the extent to which people’s scores are not correlated with other variables that reflect distinct constructs.
 For example, calculating the correlation between scores on a math exam and knowledge of English literature (the correlation should be near zero).
 Confirmatory factor analysis (CFA) is a way to assess both the convergent and discriminant validity of several scales simultaneously; a simpler correlation-based check is sketched below.
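CFA itself requires a dedicated library, but a minimal correlation-based sketch (Python, simulated data) conveys the logic: convergent validity expects a high correlation between measures of similar constructs, discriminant validity a near-zero one.

```python
# Minimal sketch: convergent vs. discriminant validity via correlations
# (simulated composite scores, constructed to show the expected pattern).
import numpy as np

rng = np.random.default_rng(1)
n = 50

ielts = rng.normal(7, 1, n)                    # language proficiency measure A
gre_verbal = ielts * 10 + rng.normal(0, 3, n)  # similar construct: should correlate
math_exam = rng.normal(70, 10, n)              # distinct construct: should not

convergent_r = np.corrcoef(ielts, gre_verbal)[0, 1]
discriminant_r = np.corrcoef(ielts, math_exam)[0, 1]
print(f"convergent r = {convergent_r:.2f} (expect high)")
print(f"discriminant r = {discriminant_r:.2f} (expect near zero)")
```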
 Criterion Validity: It is the extent to which people’s scores are correlated with other variables or criteria that reflect the same construct.
 For example: an occupational aptitude test should correlate positively with work performance.
 Types of Criterion Validity
 Predictive Validity: When the criterion is something that will happen or be assessed in the future, criterion validity is called predictive validity.
 For example: a pre-hire assessment that predicts job performance after a year.
 Concurrent Validity: It offers a way of establishing a test’s validity by comparing it to another, similar test that is known to be valid and is administered at about the same time. If the two tests correlate, the new test is believed to also be valid.
 For example: a new measure of self-esteem should correlate positively with an old, established measure.
Integrated Approach to Measure Validation
