
Measurement of Constructs and Testing Validity & Reliability of Measures

Presented by:
Group 3: 22606, 22616, 22621
Measurement of Constructs

Definition

 Theoretical propositions consist of relationships between abstract constructs.
 These constructs need to be measured accurately, correctly, and in a scientific manner.
 Measurement refers to careful, deliberate observations of the real world in empirical research.
 Examples of constructs: a person’s height, weight, or a firm’s size (easy to measure); creativity or prejudice (hard to measure).
 Now let’s examine the related processes of conceptualization and operationalization for creating such measures of constructs.
Conceptualization

 Conceptualization is the mental process by which fuzzy and imprecise constructs (concepts) and their constituent components are defined in concrete and precise terms.
 For instance, we often use the word “prejudice” and the word conjures a certain
image in our mind; however, we may struggle if we were asked to define exactly
what the term meant. If someone says bad things about other racial groups, is that
racial prejudice? If women earn less than men for the same job, is that gender
prejudice?
 Operationalization refers to the process of developing indicators or items for
measuring constructs.
 For instance, if an unobservable theoretical construct such as socioeconomic status
is defined as the level of family income, it can be operationalized using an
indicator that asks respondents the question: what is your annual family income?
Levels of Measurement (Rating Scales)

| Scale    | Central Tendency                           | Statistics                                     | Transformations                                      |
|----------|--------------------------------------------|------------------------------------------------|------------------------------------------------------|
| Nominal  | Mode                                       | Chi-square                                     | One-to-one (equality)                                |
| Ordinal  | Median                                     | Percentile, nonparametric statistics           | Monotonic increasing (order)                         |
| Interval | Arithmetic mean, range, standard deviation | Correlation, regression, analysis of variance  | Positive linear (affine)                             |
| Ratio    | Geometric mean, harmonic mean              | Coefficient of variation                       | Positive similarities (multiplicative, logarithmic)  |
Generic Types of Measurement Scales

 Nominal
 Ordinal
 Interval
 Ratio
Rating Scales for Social Science Research
 Binary Scales
 Likert Scales
 Semantic Differential Scales
 Guttman Scales
Binary Scales

Have you ever written a letter to a public official?                                Yes  No
Have you ever signed a political petition?                                          Yes  No
Have you ever donated money to a political cause?                                   Yes  No
Have you ever donated money to a candidate running for public office?               Yes  No
Have you ever written a political letter to the editor of a newspaper or magazine?  Yes  No
Have you ever persuaded someone to change his/her voting plans?                     Yes  No

 A six-item binary scale for measuring political activism.
 Binary scales are nominal scales consisting of binary items that assume one of two possible values, such as yes or no, true or false, and so on.
Likert Scale

I feel good about my job                                          1 2 3 4 5
I get along well with others at work                              1 2 3 4 5
I’m proud of my relationship with my supervisor at work           1 2 3 4 5
I can tell that other people at work are glad to have me there    1 2 3 4 5
I can tell that my coworkers respect me                           1 2 3 4 5
I feel like I make a useful contribution at work                  1 2 3 4 5
(1: Strongly Disagree, 5: Strongly Agree)

 A six-item Likert scale for measuring employment self-esteem.
 This scale consists of Likert items: simply worded statements to which respondents indicate their extent of agreement or disagreement on a five- or seven-point scale ranging from “strongly disagree” to “strongly agree”. A scoring sketch follows below.
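As a minimal illustration (not from the slides, with made-up responses), the Python sketch below combines answers to the six items above into a single composite employment self-esteem score per respondent:

```python
# Minimal sketch: scoring a Likert scale (hypothetical responses).
# Items are coded 1 (strongly disagree) to 5 (strongly agree).
import numpy as np

# rows = respondents, columns = the six employment self-esteem items
responses = np.array([
    [4, 5, 3, 4, 4, 5],
    [2, 3, 2, 3, 2, 2],
    [5, 5, 4, 5, 4, 4],
])

# Composite score per respondent = mean of the item responses
composite = responses.mean(axis=1)
print(composite.round(2))  # e.g. [4.17 2.33 4.5 ]
```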
Semantic Differential Scale

Response options: Very much | Somewhat | Neither | Somewhat | Very much,
anchored at each end by polar-opposite adjective pairs:

Good        vs. Bad
Useful      vs. Useless
Caring      vs. Uncaring
Interesting vs. Boring

 A semantic differential scale for measuring attitude toward national health insurance.
 This is a composite (multi-item) scale where respondents are asked to indicate their opinions or feelings toward a single statement using pairs of adjectives framed as polar opposites.
Guttman Scale

Do you mind immigrants being citizens of your country?          Yes  No
Do you mind immigrants living in your own neighborhood?         Yes  No
Would you mind living next door to an immigrant?                Yes  No
Would you mind having an immigrant as your close friend?        Yes  No
Would you mind if someone in your family married an immigrant?  Yes  No

 A five-item Guttman scale for measuring attitude toward immigrants.
 This composite scale uses a series of items arranged in increasing order of intensity of the construct of interest, from least intense to most intense; a sketch of checking this cumulative structure follows below.
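As an illustration (not from the slides; item ordering and coding are assumptions), the Python sketch below checks whether hypothetical yes/no response patterns respect the cumulative ordering a Guttman scale assumes:

```python
# Minimal sketch: checking the cumulative (Guttman) structure of responses.
# Columns are items ordered from least to most intense; 1 = affirmative answer.
import numpy as np

responses = np.array([
    [1, 1, 1, 0, 0],  # consistent: affirms up to a point, then stops
    [1, 1, 0, 0, 0],  # consistent
    [1, 0, 1, 0, 0],  # inconsistent: a 0 followed by a 1
])

def is_cumulative(row):
    # A perfect Guttman pattern is a run of 1s followed by a run of 0s,
    # so the row must never step up from 0 back to 1.
    return not np.any(np.diff(row) > 0)

for row in responses:
    print(row, "cumulative" if is_cumulative(row) else "violates Guttman ordering")
```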
Scaling

 The process of creating the indicators is called scaling.
 Stevens (1946) said, “Scaling is the assignment of objects to numbers according to a rule.”
 Scales can be unidimensional or multidimensional, based on whether the underlying construct is unidimensional (e.g., weight, wind speed, firm size) or multidimensional (e.g., academic aptitude, intelligence).
Unidimensional and Multidimensional Scaling

 A unidimensional scale measures a construct along a single dimension, ranging from high to low. Note that such a scale may include multiple items, but all of the items attempt to measure the same underlying dimension.
 Multidimensional scales employ different items or tests to measure each dimension of the construct separately, and then combine the scores on each dimension into an overall measure of the multidimensional construct.
 Example: academic aptitude can be measured using two separate tests of students’ mathematical and verbal ability, whose scores are then combined into an overall measure of academic aptitude, as sketched below.
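A minimal Python sketch of the aptitude example (hypothetical scores): each dimension is standardized first so that math and verbal scores are on a comparable scale before being averaged.

```python
# Minimal sketch: combining two dimensions into a multidimensional measure.
import numpy as np

math_scores = np.array([620.0, 540.0, 700.0, 580.0])
verbal_scores = np.array([640.0, 520.0, 600.0, 680.0])

def zscore(x):
    # Standardize so both dimensions have mean 0 and standard deviation 1
    return (x - x.mean()) / x.std(ddof=1)

# Overall "academic aptitude" = mean of the standardized dimension scores
aptitude = (zscore(math_scores) + zscore(verbal_scores)) / 2
print(aptitude.round(2))
```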
Indexes and Typologies

 An index is a composite score derived by aggregating measures of multiple constructs (called components) using a set of rules and formulas. It differs from a scale: a scale also aggregates measures, but its items measure dimensions of a single construct, whereas an index combines measures of different constructs.
 Scales and indexes generate ordinal measures of unidimensional constructs. However, researchers sometimes wish to summarize measures of two or more constructs to create a set of categories or types called a typology. Unlike scales or indexes, typologies are multidimensional but include only nominal variables.
Testing Validity and Reliability of Measures
Reliability

 Reliability is the degree to which the measure of a construct is consistent or dependable.
 A test is considered reliable if the results of multiple administrations at different time intervals are the same or compatible.
 Reliability implies consistency but not accuracy.
 Reliability is easier to achieve with quantitative measures than with “observation”, which is a qualitative measurement technique.
Examples

 Guessing a person’s weight versus measuring it with a weighing scale.
 Measuring employee morale by observation versus counting the grievances filed by employees.
 If we score 95% on a test the first time and 96% the next time, our results are reliable. Even if there is a minor difference in the outcomes, as long as it is within the error margin, our results are reliable.
Sources of Unreliability

 The observer’s or researcher’s subjectivity.
 Asking imprecise or ambiguous questions.
 Asking questions about issues that respondents are not very familiar with or do not care about.
Types of Reliability

 Inter-rater reliability (inter-observer reliability)
 Test-retest reliability (stability)
 Parallel forms reliability
 Split-half reliability
 Internal consistency reliability
Inter-rater reliability

 Inter-rater reliability (also called inter-observer reliability) measures the degree of agreement between different people observing or assessing the same thing.
 We use it when data are collected by researchers assigning ratings, scores, or categories to one or more variables, and it can help mitigate observer bias. A computational sketch follows the example below.
For Example
In an observational study where a team of researchers collect data on classroom behavior,
inter-rater reliability is important: all the researchers should agree on how to categorize or
rate different types of behavior.
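As a hedged illustration (not part of the slides), the Python sketch below computes percent agreement and Cohen's kappa, a standard inter-rater statistic, for two hypothetical raters categorizing classroom behavior:

```python
# Minimal sketch: inter-rater reliability via percent agreement and
# Cohen's kappa (hypothetical ratings from two observers).
import numpy as np

rater_a = np.array(["on-task", "off-task", "on-task", "disruptive", "on-task", "off-task"])
rater_b = np.array(["on-task", "off-task", "on-task", "on-task",    "on-task", "off-task"])

# Observed agreement: proportion of cases the raters label identically
p_o = np.mean(rater_a == rater_b)

# Expected chance agreement, from each rater's marginal category proportions
categories = np.union1d(rater_a, rater_b)
p_e = sum(np.mean(rater_a == c) * np.mean(rater_b == c) for c in categories)

kappa = (p_o - p_e) / (1 - p_e)
print(f"percent agreement = {p_o:.2f}, Cohen's kappa = {kappa:.2f}")
```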
Test-retest reliability

 Test-retest reliability measures the consistency of results when we repeat the same test on the same sample at a different point in time.
 We use it when measuring something that we expect to stay constant in our sample.
For Example:
A test of color blindness for trainee pilot applicants should have high test-retest reliability,
because color blindness is a trait that does not change over time.
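A minimal Python sketch (hypothetical scores) of the usual computation: test-retest reliability is estimated as the correlation between the two administrations.

```python
# Minimal sketch: test-retest reliability as the correlation between
# two administrations of the same test (hypothetical scores).
import numpy as np

time_1 = np.array([88, 92, 75, 81, 95, 70])
time_2 = np.array([90, 91, 78, 80, 93, 72])

r = np.corrcoef(time_1, time_2)[0, 1]  # Pearson correlation
print(f"test-retest reliability r = {r:.2f}")
```

Parallel forms reliability (next slide) is computed the same way, except the two score vectors come from two equivalent versions of the test rather than two administrations of the same one.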
Parallel Forms Reliability

 Parallel forms reliability measures the correlation between two equivalent versions of a test.
 We use it when we have two different assessment tools or sets of questions designed to measure the same thing.
For Example
If the same students take two different versions of a reading comprehension test, they
should get similar results in both tests.
Split-Half Reliability

 In this approach, we randomly split a set of measures into two halves. After administering the entire set to the respondents, we calculate the correlation between the two sets of responses, as sketched below.
 The longer the instrument, the more likely it is that the two halves will be similar (since random errors are minimized as more items are added); hence, this technique tends to systematically overestimate the reliability of longer instruments.
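The Python sketch below (hypothetical 6-item data) illustrates the procedure, including the standard Spearman-Brown correction, which is not mentioned on the slide but is commonly applied to adjust the half-test correlation to the full test length:

```python
# Minimal sketch: split-half reliability with the Spearman-Brown correction
# (hypothetical 6-item responses; rows = respondents).
import numpy as np

rng = np.random.default_rng(0)
items = np.array([
    [4, 5, 4, 5, 4, 5],
    [2, 2, 3, 2, 2, 3],
    [3, 4, 3, 3, 4, 3],
    [5, 4, 5, 5, 4, 4],
    [1, 2, 1, 2, 2, 1],
])

# Randomly split the six items into two halves; sum each half per respondent
order = rng.permutation(items.shape[1])
half_1 = items[:, order[:3]].sum(axis=1)
half_2 = items[:, order[3:]].sum(axis=1)

# Correlate the halves, then correct for full test length (Spearman-Brown)
r_half = np.corrcoef(half_1, half_2)[0, 1]
r_full = 2 * r_half / (1 + r_half)
print(f"split-half r = {r_half:.2f}, corrected reliability = {r_full:.2f}")
```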
Internal Consistency Reliability

 Internal consistency assesses the correlation between multiple items in a test that are
intended to measure the same construct.
 We can calculate internal consistency without repeating the test or involving other
researchers, so it’s a good way of assessing reliability when we only have one data set.
 This reliability can be estimated in terms of average inter-item correlation, average
item-to-total correlation, or more commonly, Cronbach’s alpha.
For Example
 To measure customer satisfaction with an online store, we can create a questionnaire
with a set of statements that respondents must agree or disagree with. Internal
consistency tells us whether the statements are all reliable indicators of customer
satisfaction.
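A minimal sketch of the Cronbach's alpha computation mentioned above, in Python with a hypothetical response matrix for the customer-satisfaction questionnaire:

```python
# Minimal sketch: Cronbach's alpha for a multi-item scale
# (hypothetical responses; rows = respondents, columns = items).
import numpy as np

items = np.array([
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [2, 1, 2, 2],
    [5, 5, 4, 4],
], dtype=float)

k = items.shape[1]                          # number of items
item_vars = items.var(axis=0, ddof=1)       # variance of each item
total_var = items.sum(axis=1).var(ddof=1)   # variance of the total score

# alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```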
Validity

 Validity is the extent to which a test measures what it claims to measure.
 It is vital for a test to be valid in order for the results to be accurately applied and interpreted.
Types of Validity
 Face validity: It is the extent to which the measurement method appears "on
its face" to measure the construct of interest.
 For example, the frequency of one’s attendance at religious services seems
to make sense as an indication of a person’s religiosity without a lot of
explanation. Hence this indicator has face validity.
 Content Validity: It is the extent to which the method covers the entire
range of relevant behaviors, thoughts, and feelings that define the construct
being measured.
 For example, a course exam has good content validity if it covers all the material that students are supposed to learn, and poor content validity if it does not.
 Convergent Validity: It involves assessing the degree of relatedness between two scales that measure similar constructs.
 For example, comparing language tests: calculating the correlation between IELTS test scores and GRE verbal test scores.
 Discriminant Validity: It is the extent to which people’s scores are not correlated with other variables that reflect distinct constructs.
 For example, calculating the correlation between scores on a math exam and knowledge of English literature (the correlation should be near zero).
 Confirmatory factor analysis (CFA) is a way to assess both the convergent and discriminant validity of several scales simultaneously; a simpler correlation-based check is sketched below.
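CFA itself requires a dedicated library, but a minimal correlation-based sketch (Python, simulated data) conveys the logic: convergent validity expects a high correlation between measures of similar constructs, discriminant validity a near-zero one.

```python
# Minimal sketch: convergent vs. discriminant validity via correlations
# (simulated composite scores, constructed to show the expected pattern).
import numpy as np

rng = np.random.default_rng(1)
n = 50

ielts = rng.normal(7, 1, n)                    # language proficiency measure A
gre_verbal = ielts * 10 + rng.normal(0, 3, n)  # similar construct: should correlate
math_exam = rng.normal(70, 10, n)              # distinct construct: should not

convergent_r = np.corrcoef(ielts, gre_verbal)[0, 1]
discriminant_r = np.corrcoef(ielts, math_exam)[0, 1]
print(f"convergent r = {convergent_r:.2f} (expect high)")
print(f"discriminant r = {discriminant_r:.2f} (expect near zero)")
```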
 Criterion Validity: It is the extent to which people’s scores are correlated with other variables or criteria that reflect the same construct.
 For example: an occupational aptitude test should correlate positively with work performance.
 Types of Criterion Validity
 Predictive Validity: When the criterion is something that will happen or be assessed in the future, criterion validity is called predictive validity.
 For example: a pre-hire assessment that predicts job performance after a year.
 Concurrent Validity: It offers a way of establishing a test’s validity by comparing it to another, similar test that is known to be valid and is administered at about the same time. If the two tests correlate, the new test is believed to also be valid.
 For example: a new measure of self-esteem should correlate positively with an old, established measure.
Integrated Approach to Measure Validation
