
Validity & Reliability

Dr. P. Routray
Context
 The main objective of a questionnaire in research is to obtain relevant information in the most reliable and valid manner.
 Thus the accuracy and consistency of a survey/questionnaire form a significant aspect of research methodology, known as validity and reliability.
Types of Validity
 Face Validity
 Content Validity
 Construct Validity
   Convergent Validity
   Discriminant Validity
 Criterion Validity
   Predictive Validity
   Concurrent Validity
   Postdictive Validity
VALIDITY
 Research validity in surveys relates to the extent to which the survey measures the right elements, that is, the elements that need to be measured.
 In simple terms, validity refers to how well an instrument measures what it is intended to measure.
Face Validity
 Face validity is a subjective judgement on the operationalization of a construct.
 Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories.
 To examine face validity, a dichotomous scale can be used with the categorical options "Yes" and "No", which indicate a favourable and an unfavourable item respectively. The collected data are then analysed using Cohen's Kappa Index (CKI) to determine the face validity of the instrument.
 Face validity is considered by some as a basic and very minimum index of content validity.
 The value of kappa can range from 0 to 1. A score of 0 means that agreement among raters is no better than chance, whereas a score of 1 means complete agreement between the raters. DM et al. (1975) recommended a minimally acceptable kappa of 0.60 for inter-rater agreement.
In the 2 × 2 agreement table for two raters:
A => the total number of instances that both raters said were correct. The raters are in agreement.
B => the total number of instances that Rater 1 said were incorrect, but Rater 2 said were correct. This is a disagreement.
C => the total number of instances that Rater 2 said were incorrect, but Rater 1 said were correct. This is also a disagreement.
D => the total number of instances that both raters said were incorrect. The raters are in agreement.
The observed agreement is P_o = (A + D)/(A + B + C + D), and kappa compares it with the agreement expected by chance: kappa = (P_o − P_e)/(1 − P_e).
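A minimal Python sketch of this computation, using the table cells defined above; the counts in the usage example are hypothetical.

```python
# Minimal sketch: Cohen's kappa from the 2x2 agreement table above.
# a, b, c, d follow the definitions of A, B, C, D on this slide;
# the example counts are hypothetical.

def cohens_kappa(a: int, b: int, c: int, d: int) -> float:
    n = a + b + c + d
    p_observed = (a + d) / n  # (A + D)/(A + B + C + D)
    # Chance agreement from each rater's marginal proportions.
    p_chance = ((a + c) / n) * ((a + b) / n) + ((b + d) / n) * ((c + d) / n)
    return (p_observed - p_chance) / (1 - p_chance)

# Example: 20 items rated "Yes"/"No" by two experts.
print(round(cohens_kappa(a=12, b=2, c=1, d=5), 2))  # ~0.66, above the 0.60 threshold
```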
Content validity
 Content validity is defined as "the degree to which items in an instrument reflect the content universe to which the instrument will be generalized" (Straub, Boudreau et al., 2004).
 It is highly recommended to assess content validity when a new instrument is developed.
 In general, content validity involves evaluating a new survey instrument to ensure that it includes all the items that are essential to a particular construct domain and eliminates undesirable items (Lewis et al., 1995; Boudreau et al., 2001).
Steps involved in Content Validity
 An exhaustive literature review is conducted to extract the related items.
 A content validity survey is generated; each item is assessed on a three-point scale (not necessary, useful but not essential, and essential).
 The survey is sent to experts in the same field as the research.
 The content validity ratio (CVR) is then calculated for each item using Lawshe's (1975) method, as sketched below.
 Items that are not significant at the critical level are eliminated.
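A minimal Python sketch of Lawshe's CVR under the definitions above; the panel counts in the example are hypothetical.

```python
# Minimal sketch of Lawshe's (1975) content validity ratio:
# CVR = (n_e - N/2) / (N/2), where n_e is the number of experts
# rating the item "essential" and N is the panel size.
# CVR ranges from -1 to +1; the example counts are hypothetical.

def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    half = n_panelists / 2
    return (n_essential - half) / half

# Example: 9 of 10 experts rate an item "essential".
cvr = content_validity_ratio(9, 10)
print(cvr)  # 0.8; items below the critical CVR for the panel size are eliminated
```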
Construct validity
 Construct validity refers to how well you translated or transformed a concept, idea, or behaviour (that is, a construct) into a functioning and operating reality: the operationalization.
 Construct validity testifies to how well the results obtained from the use of the measure fit the theories around which the test is designed.
 Construct validity has two components: convergent and discriminant (divergent) validity.
 Discriminant validity is the extent to which latent variable A discriminates from other latent variables (e.g., B, C, D).
 Discriminant validity means that a latent variable is able to account for more variance in the observed variables associated with it than (a) measurement error or similar external, unmeasured influences, or (b) other constructs within the conceptual framework. If this is not the case, then the validity of the individual indicators and of the construct is questionable.
 In brief, discriminant validity (or divergent validity) tests that constructs that should have no relationship do, in fact, not have any relationship.
Construct validity
 Construct validity refers to how well a test or tool measures the construct that it was designed to measure. There are two types of construct validity: convergent and discriminant validity.
 Convergent validity tests that constructs that are expected to be related are, in fact, related. Convergent validity is established when the scores obtained with two different instruments measuring the same concept are highly correlated.
 To verify construct validity (discriminant and convergent validity), a factor analysis can be conducted using principal component analysis (PCA) with the varimax rotation method. Items loading above 0.40, the minimum recommended value in research, are considered for further analysis.
 Also, items cross-loading above 0.40 should be deleted. The factor analysis results will then satisfy the criteria of construct validity, including both discriminant validity (loadings of at least 0.40, no cross-loading of items above 0.40) and convergent validity (eigenvalues of at least 1, loadings of at least 0.40, items loading on their posited constructs).
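A minimal sketch of this loading-based screening, assuming the third-party factor_analyzer package for the rotated factor solution; the response data, factor count, and cut-off usage are hypothetical illustrations of the 0.40 rule above.

```python
# Minimal sketch: screen survey items by rotated factor loadings,
# keeping items that load >= 0.40 on exactly one factor (i.e. no
# cross-loading >= 0.40). Assumes the factor_analyzer package;
# the response data are hypothetical.
import numpy as np
from factor_analyzer import FactorAnalyzer

def screen_items(data: np.ndarray, n_factors: int, cutoff: float = 0.40) -> list:
    fa = FactorAnalyzer(n_factors=n_factors, rotation="varimax", method="principal")
    fa.fit(data)
    loadings = np.abs(fa.loadings_)  # items x factors
    # Keep items with exactly one loading at or above the cut-off.
    return [i for i, row in enumerate(loadings) if (row >= cutoff).sum() == 1]

# Usage: 200 hypothetical respondents x 12 survey items.
rng = np.random.default_rng(0)
responses = rng.normal(size=(200, 12))
print(screen_items(responses, n_factors=3))
```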
Criterion-related validity
 Criterion-related validity is established when the measure differentiates individuals on a criterion it is expected to predict. A test has this type of validity if it is useful for predicting performance or behaviour in another situation (past, present, or future).
 Criterion validity reflects the use of a criterion, a well-established measurement procedure, to create a new measurement procedure to measure the construct you are interested in. The criterion and the new measurement procedure must be theoretically related. The measurement procedures could include a range of research methods (e.g., surveys, structured observation, or structured interviews), provided that they yield quantitative data.
 This can be done by establishing postdictive validity, concurrent validity, or predictive validity, as explained below.
 Postdictive validity: established if the test is a valid measure of something that happened before.
 Concurrent validity: established when the same construct is measured by two different instruments at the same time and the results are consistent.
 Predictive validity: indicates the ability of the measuring instrument (e.g. a job knowledge test) to differentiate among individuals with reference to a future criterion (e.g. job performance). A strong, consistent relationship indicates high predictive validity.
Case in Point
For market researchers, criterion validity is crucial and can make or break a product. One famous example is when Coca-Cola decided to change the flavor of its trademark drink. Diligently, they researched whether people liked the new flavor, performing taste tests and giving out questionnaires. People loved the new flavor, so Coca-Cola rushed New Coke into production, where it was a titanic flop.
The mistake that Coke made was to forget about criterion validity and to omit one important question from the survey. People were not asked whether they preferred the new flavor to the old, a failure to establish concurrent validity. The old Coke, known to be popular, was the perfect benchmark, but it was never used. A simple blind taste test, asking people which of the two flavors they preferred, would have saved Coca-Cola millions of dollars.
Ultimately, the predictive validity was also poor, because the good survey results did not correlate with the poor sales. By then, it was too late!
External vs Internal Validity
 External validity refers to the extent of generalizability of
the results of a causal study to other settings, people, or
events.
 Internal validity refers to the degree of our confidence in
the causal effects (i.e., that variable X causes variable Y).
 Field experiments have more external validity (i.e., the
results are more generalizable to other similar
organizational settings), but less internal validity (i.e., we
cannot be certain of the extent to which variable X alone
causes variable Y).
 Note that in the lab experiment, the reverse is true. The
internal validity is high but the external validity is rather
low.
Factors affecting internal validity
 The seven major threats to internal validity are the effects of history, maturation, testing, instrumentation, selection, statistical regression, and mortality.
 Certain events or factors that have an impact on the independent variable–dependent variable relationship might unexpectedly occur while the experiment is in progress, and this history of events would confound the cause-and-effect relationship between the two variables, thus affecting the internal validity.
 Cause-and-effect inferences can also be contaminated by the effects of the passage of time, another uncontrollable variable. Such contamination is called maturation effects.
 Frequently, to test the effects of a treatment, subjects are given what is called a pretest (say, a short questionnaire eliciting their feelings and attitudes). That is, first a measure of the dependent variable is taken (the pretest), then the treatment is given, and after that a second test, called the posttest, is administered. The difference between the posttest and the pretest scores is then attributed to the treatment. The testing threat arises because exposure to the pretest itself can influence responses on the posttest, independently of the treatment.
 Instrumentation effects are yet another source of threat to internal validity. These might arise because of a change in the measuring instrument between pretest and posttest, and not because of the treatment's differential impact at the end.
 The threat to internal validity could also come from improper or unmatched selection of subjects for the experimental and control groups.
 The effects of statistical regression are brought about when the members chosen for the experimental group have extreme scores on the dependent variable to begin with.
 Another confounding factor in the cause-and-effect relationship is the mortality or attrition of members in the experimental or control group (or both) as the experiment progresses. When the group composition changes over time across the groups, comparison between the groups becomes difficult, because those who dropped out of the experiment may confound the results.
Factors affecting External Validity
 Whereas internal validity raises questions about whether it is the treatment alone or some additional extraneous factor that causes the effects, external validity raises issues about the generalizability of the findings to other settings.
 Thus, subject selection and its interaction with the treatment can also pose a threat to external validity.
 Maximum external validity can be obtained by ensuring that, as far as possible, the lab experimental conditions are as close to and compatible with the real-world situation.
Validity in Lab Experiment
 Internal validity refers to the confidence we place in the cause-and-effect relationship. In other words, it addresses the question, "To what extent does the research design permit us to say that independent variable A causes a change in dependent variable B?"
 As Kidder and Judd (1986) note, in research with high internal validity, we are relatively better able to argue that the relationship is causal, whereas in studies with low internal validity, causality cannot be inferred at all. In lab experiments, where cause-and-effect relationships are substantiated, internal validity can be said to be high.
 To what extent would the results found in the lab setting be transferable or generalizable to actual organizational or field settings? In other words, if we do find a cause-and-effect relationship after conducting a lab experiment, can we then confidently say that the same cause-and-effect relationship will also hold true in the organizational setting?
Reliability
 The reliability of a measure is an indication of the stability and consistency with which the instrument measures the concept, and helps to assess the "goodness" of a measure.
 The ability of a measure to remain the same over time, despite uncontrollable testing conditions or the state of the respondents themselves, is indicative of its stability and low vulnerability to changes in the situation. Two tests of stability are test–retest reliability and parallel-form reliability.
 The internal consistency of measures is indicative of the homogeneity of the items in the measure that tap the construct. Inter-item consistency reliability and split-half reliability are two tests of consistency.
Reliability
Reliability is stability and consistency
 across time (test–retest reliability)
 across instruments (parallel-form reliability)
 across researchers (inter-rater reliability)
 across items (internal consistency)
Test-Retest Reliability
 Used to assess the consistency of a measure from one time to another.
 This approach assumes that there is no substantial change in the construct being measured between the two occasions. The amount of time allowed between measures is critical.
 The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation.
 This is because the two observations are related over time: the closer in time they are, the more similar the factors that contribute to error.
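In practice the test–retest estimate is usually computed as the correlation between the two administrations; a minimal Python sketch with hypothetical scores:

```python
# Minimal sketch: test-retest reliability as the Pearson correlation
# between the same respondents' scores at time 1 and time 2.
# The scores are hypothetical.
import numpy as np

time1 = np.array([22, 30, 27, 18, 25, 29])  # scores at first administration
time2 = np.array([24, 31, 26, 17, 27, 28])  # same respondents, some time later
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 2))  # closer to 1 indicates a more stable measure
```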
Parallel-Forms Reliability
 Used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
 In parallel-forms reliability you first have to create two parallel forms. One way to accomplish this is to create a large set of questions that address the same construct and then randomly divide the questions into two sets.
 This approach makes the assumption that the randomly divided halves are parallel or equivalent.
Inter-Rater or Inter-Observer Reliability
 Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
 Inter-rater reliability is especially relevant when the data are obtained through observations, projective tests, or unstructured interviews, all of which are liable to be subjectively interpreted.
 There are two major ways to estimate inter-rater reliability:
 when the measure is on a category scale (e.g., Cohen's kappa, as above);
 when the measure is a continuous one (e.g., a correlation between the raters' scores).
Internal Consistency Reliability
 Used to assess the consistency of results across items within a test.
 In internal consistency reliability estimation, a single measurement instrument is administered to a group of people on one occasion to estimate reliability.
 In effect, we judge the reliability of the instrument by estimating how well the items that reflect the same construct yield similar results.
 We look at how consistent the results are for different items for the same construct within the measure.
 There is a wide variety of internal consistency measures that can be used, such as:
 Average Inter-item Correlation
 Average Item-total Correlation
 Split-Half Reliability
 Cronbach's Alpha (α)
Average Inter-item Correlation
 The average inter-item correlation uses all of the items on our instrument that are designed to measure the same construct.
 First, the correlation between each pair of items is computed, as illustrated in the figure.
 The average inter-item correlation is simply the average (mean) of all these correlations.
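A minimal Python sketch with hypothetical responses:

```python
# Minimal sketch: average inter-item correlation = mean of the
# off-diagonal entries of the item correlation matrix.
# Responses are hypothetical (respondents x items).
import numpy as np

responses = np.array([[4, 5, 4, 4],
                      [3, 3, 2, 3],
                      [5, 5, 5, 4],
                      [2, 2, 3, 2],
                      [4, 4, 4, 5]])
corr = np.corrcoef(responses, rowvar=False)        # items x items matrix
off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
print(round(off_diag.mean(), 2))                   # average inter-item correlation
```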
Average Item-total Correlation
 This approach also uses the inter-item correlations.
 In addition, a total score across the six items is computed and used as a seventh variable in the analysis.
 The figure shows the six item-to-total correlations at the bottom of the correlation matrix. They range from .82 to .88 in this sample analysis, with an average of .85.
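A minimal Python sketch of the same idea, using a hypothetical four-item example rather than the six-item figure:

```python
# Minimal sketch: item-to-total correlations, using the total score
# as an extra variable. Responses are hypothetical (respondents x items).
import numpy as np

responses = np.array([[4, 5, 4, 4],
                      [3, 3, 2, 3],
                      [5, 5, 5, 4],
                      [2, 2, 3, 2],
                      [4, 4, 4, 5]])
total = responses.sum(axis=1)                      # total score per respondent
item_total = [np.corrcoef(responses[:, j], total)[0, 1]
              for j in range(responses.shape[1])]
print([round(r, 2) for r in item_total])           # one correlation per item
print(round(float(np.mean(item_total)), 2))        # average item-total correlation
```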
Split-Half Reliability
 In split-half reliability we randomly divide all items that purport to measure the same construct into two sets.
 We administer the entire instrument to a sample of people and calculate the total score for each randomly divided half.
 The split-half reliability estimate, as shown in the figure, is simply the correlation between these two total scores. In the example it is .87.
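A minimal Python sketch with hypothetical responses:

```python
# Minimal sketch: split-half reliability. Items are randomly divided
# into two halves; the estimate is the correlation between the two
# half total scores. Responses are hypothetical (respondents x items).
import numpy as np

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(50, 10))      # 50 respondents, 10 items
order = rng.permutation(responses.shape[1])        # random item order
half1 = responses[:, order[:5]].sum(axis=1)        # total score, first half
half2 = responses[:, order[5:]].sum(axis=1)        # total score, second half
print(round(np.corrcoef(half1, half2)[0, 1], 2))   # split-half estimate
```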
Cronbach's Alpha (α)
 Cronbach's alpha is mathematically equivalent to the average of all possible split-half estimates, although that is not how we compute it; a sketch of the usual computation follows.
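A minimal Python sketch of the standard variance-based formula, alpha = k/(k−1) · (1 − Σ item variances / variance of total score), with hypothetical responses:

```python
# Minimal sketch of Cronbach's alpha:
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)),
# where k is the number of items. Responses are hypothetical.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Usage: 5 respondents answering 4 Likert-type items.
scores = np.array([[4, 5, 4, 4],
                   [3, 3, 2, 3],
                   [5, 5, 5, 4],
                   [2, 2, 3, 2],
                   [4, 4, 4, 5]])
print(round(cronbach_alpha(scores), 2))
```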
Comparison of Reliability Estimators
 Each of the reliability estimators has certain advantages and disadvantages.
 Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation.
 However, it requires multiple raters or observers.
 As an alternative, one could look at the correlation of ratings of the same single observer repeated on two different occasions.
Validity and Reliability of Secondary Data
 For data that do not measure latent constructs, factor analysis is not required to determine validity.
 Reliability is required to be measured when the same construct is measured through different indicators.