DEVELOPMENT
OF
ASSESSMENT
TOOLS
by
Nilo M. Eder, Ph.D.
Standards for and
Characteristics of Good
Assessment Instruments
Standards in Evaluation
Why the need for standards?
1. The performance that the student must
meet is the standard of acceptability
specified in the instructional objective;
i.e., it is the basis for deciding whether
students can proceed to new instructional
objectives and materials
Standards in Evaluation
Why the need for standards?
2. All students must meet the standard of
acceptability not in a single objective but
in all objectives
3. The use of specified standards alters
some conventional procedures in test
construction.
Types of Standards
Norm-referenced standard
A norm-referenced test locates a student's performance
level relative to the levels of others on the same test
Criterion-referenced standard
The purpose is to know what the student can do
rather than how he compares with others
Individual mastery is the primary issue
Objective-referenced standard
Items are constructed to be direct reflections of a
specified objective
Other Types of Standards
Absolute Standard
Based on mastery or perfect student
performance and the only acceptable
performance level is 100%
Relative Standard
It compares the performance of one student
with that of the other students in his group
Characteristics of a Good Test
Validity
Criterion-related/empirical validation
Predictive validity
Concurrent validity
Content validation
Refer to the table of specifications
Construct validation
A construct is a mental operation (e.g., loyalty, dominance)
Reliability
Test-retest method
Equivalent forms
Split-half
Characteristics of a Good Test
Objectivity
free from subjective judgments
Efficiency
One that yields a large number of
responses per unit of time
Teacher-Made
Tests
Planning the Test
Principal Objectives (in developing a
framework for tests):
Our test should assess all essential
objectives in our instruction
The test content should appropriately reflect
the relative emphasis given various topics in
the course
Our tests should allow us to apply the test
results in making appropriate decisions
Rules in Planning a Test
Analyze the instructional objectives of the
course
Make adequate provision for evaluating all
important outcomes of instruction
The test should reflect the approximate
typical emphasis of the course
The nature of the test should reflect its
purpose
The nature of the test must reflect the
conditions under which it will be administered
Steps in planning a test
Identify test objectives/lesson outcomes
Decide on the type of test to be
prepared
Prepare table of specifications (TOS)
Construct the draft test items
Try-out and validate
Sample Table of Specification
for a 50-item test
Course Content   No. of Recitation   Percent   No. of Items
                 Periods
Content 1               10             25.0          12
Content 2               15             37.5          19
Content 3                8             20.0          10
Content 4                7             17.5           9
Total                   40            100.0          50
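The allocation above is simple proportional arithmetic: each content area's share of the 40 recitation periods determines its percent weight, and that percent of the 50 items, rounded, becomes its item count. A minimal Python sketch of the computation (the dictionary and variable names are my own, for illustration):

```python
# Allocating items in a Table of Specifications in proportion to class time.
# The values are taken from the sample 50-item TOS above.
periods = {"Content 1": 10, "Content 2": 15, "Content 3": 8, "Content 4": 7}
total_items = 50
total_periods = sum(periods.values())  # 40 recitation periods in all

allocation = {}
for content, p in periods.items():
    percent = p / total_periods * 100               # share of class time
    allocation[content] = round(total_items * percent / 100)
    print(f"{content}: {percent:.1f}% -> {allocation[content]} items")
```

On other inputs, rounding can make the item counts sum to slightly more or less than the intended total; a common fix is to adjust the count of the largest content area until the column balances.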
Sample Table of Specification for a
50-item test
                   Item numbers, with emphasis on competency
Competency       Remembering   Understanding   Applying   Higher level   Total
Competency 1     1, 2          11, 13          18, 19     26-31           12
Competency 2     3, 4, 5, 6    12, 14          20, 21     32-42           19
Competency 3     7, 8          15, 16          22, 23     43-46           10
Competency 4     9, 10         17              24, 25     47-50            9
Total            10            7               8          25              50
General Rules in Test
Construction
Avoid replication of the textbook
The test item should be aimed at a specific
objective
Begin writing test items well ahead of the
time when they will be used, and allow time
for test revision
Consider the difficulty level of the item in
relation to the purpose of testing.
Do not allow items to be interdependent
Teacher-Made
Test
  Objective
    Supply-Type
    Fixed-Response
      Alternate-Response: Yes-No, True-False, Right-Wrong
      Matching-Type
      Multiple-Choice
  Essay
Figure 4. Classification of teacher-made tests
True-False Test
Advantages
Applicability to a wide range of subject matter
Adaptability for use in situations where the
measurement of acquisition of factual, non-
interpretive information is desired
Objectivity in scoring
Ease of administration
Wide sampling of knowledge tested per unit time,
thus it has greater efficiency
True-False Test
Limitations/Weaknesses
Negative suggestion effect
Guessing factor
Low discriminating power
Rarely as reliable as the multiple-choice test
Difficulty of framing absolutely true or absolutely
false statements
Seldom applicable to measurement of complex
understanding and higher order mental process
True-False Test
Variations
Underlining a word or clause in a statement
Requiring examinees to correct false statements
Basing true-false items on specific stimulus
material provided for the student
Cluster variety
A number of true-false statements are constructed
around a broad area
Controlled correction type
Cluster Variety
Directions: The statements below pertain to the
location of various provinces in relation to
Iloilo. If a statement is correct, write TRUE;
otherwise, write FALSE.
1. The province of Iloilo is bounded by the
following:
a. the province of Antique on the North.
b. the province of Capiz on the West.
c. China Sea on the East.
Rules in Constructing
True-False Test
1. Avoid specific determiners (e.g. all, always,
never, generally)
2. Avoid a disproportionate number of either
true or false statements
3. Avoid the exact wording from the textbook
4. Avoid trick statements
5. Limit each statement to the exact point to
be tested
6. Avoid double negatives
Rules in Constructing
True-False Test
7. Avoid ambiguous statements
Example: Science education in the Philippines is
still backward.
8. Avoid unfamiliar, figurative, or literary
language
9. Avoid long statements or complex sentence
structure
10. Avoid qualitative language whenever
possible (e.g. few, many, large, old)
Rules in Constructing
True-False Test
11. Commands cannot be true or false
12. If the statement is to test the truth or falsity
of the reason, the main clause should be
true, and the reason either true or false.
Ex. 1. As it ages, pure copper turns brown
(false), because it oxidizes (true).
2. As it ages, pure copper turns green (true),
because it oxidizes (true)
3. As it ages, pure copper turns green (true),
because it attracts green algae (false).
Rules in Constructing
True-False Tests
13. Do not establish a pattern for the answers.
14. Require the simplest possible method of
indicating the response.
15. Arrange the statements in groups.
16. Use true-false items only for points that lend
themselves unambiguously to this kind of
item; that is, do not make the test
exclusively true-false.
17. Inform students if a correction for guessing
will be applied.
Writing Directions for True-
False Test
The directions should include the
following items:
How they will respond to the item
Where they will write their answers
Whether a correction for guessing will
be applied
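The slides mention telling students whether a correction for guessing will be applied, but do not give a formula. One classic formula (an assumption here, not taken from the slides) is S = R − W / (n − 1), where R is the number right, W the number wrong, and n the number of choices per item; for true-false items n = 2, so each wrong answer cancels one right answer. A sketch:

```python
def corrected_score(right: int, wrong: int, n_choices: int) -> float:
    """Classic correction-for-guessing formula: S = R - W / (n - 1).

    Omitted items count as neither right nor wrong, so they are not
    penalized. For a true-false test, n_choices is 2.
    """
    return right - wrong / (n_choices - 1)

# 50-item true-false test: 40 right, 6 wrong, 4 omitted.
print(corrected_score(40, 6, 2))  # 34.0
```

Because omissions are not penalized while wrong answers are, students should be told in the directions that blind guessing will, on average, not help them.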
Multiple-Choice Test
Parts:
Stem
Choices or alternatives
Distractors/decoys or foils
Keyed response
Multiple-Choice Test:
Weaknesses
Needs thorough knowledge of the course
content, an awareness of the methodology of
item writing, skill in the use of language, and
thorough knowledge of the level of development
of the students
Needs time to write good multiple-choice items
Cannot be used to measure the student’s ability
to organize or to clearly express his/her answers
according to acceptable language usage rules
Multiple-Choice Test:
Strengths
Most flexible and versatile of all selection-type
tests
Can measure objectives at all levels of cognition
Adaptable to all subject matter and grade levels
High efficiency: a large number of items can be
answered in a normal examination period
Can be scored accurately, rapidly and
objectively even by others who are not qualified
to teach the subject
Methods of Designing Multiple-
Choice type
According to answer required
Best answer type
Correct answer type
Based on the manner the stem is
formulated
Question form
Declarative statement
Rules for Writing Multiple-Choice
Items
1. The stem should contain the problem, the
central issue of the item, or the frame of
reference when selecting the correct response
2. Arrange choices in chronological order, in
series of magnitudes, alphabetically, etc.
3. Make all distractors plausible and attractive
responses to the item
Rules in Writing Multiple-Choice
Items
4. Do not make the correct answer obvious
by making it unnecessarily different from
the other choices.
5. All alternatives of a given item should be
homogeneous in content, form and
grammatical structure.
6. Write at least four, preferably five, choices
per item.
Rules in Writing Multiple-Choice
Test
7. “None of these” and “all of these” should be
used with care.
8. In a best answer type, make sure that one and
only one is clearly the best answer.
9. Express the responses to a multiple-choice test
item so that grammatical consistency is
maintained.
10. Avoid double negatives.
11. Make sure that the complete item is on the
same page.
Varieties of Multiple-Choice Type
Allowing more than one correct response
to an item
Incomplete-response variety
Combined-response variety
Degree-of-certainty variety
Choices listed on top, and the items
become a series of statements
Matching Type
It is a specialized form of the multiple-choice
type which occurs in clusters.
A cluster of matching-type test must contain:
– An introductory statement
– A set of related stems called premises
– A list of alternates to be shared by all the
premises in the cluster
Strengths of Matching-Type Test
Can be efficiently and effectively used to
measure lower levels of the cognitive domain
and may also be used for higher level of
cognition (however, difficult to develop)
Can be scored rapidly, accurately and
objectively
Weaknesses of Matching-Type Test
It is difficult to develop matching-type item that can
measure higher levels of cognitive domain
It is often difficult to find enough important and
homogeneous ideas to form the premises of the item
The construction of a homogeneous set of matching-
type items often places an overemphasis on a
rather small portion of the content area to be tested.
Rules in Constructing the Introductory Statement
(Matching Type)
It must set a general frame of reference for
responding to the items in the cluster.
It must clearly indicate to the examinee how he is to
proceed in selecting his responses.
It must tell the examinee where he will record his
responses.
It must inform the examinee on whether or not an
answer could be given more than once.
Rules in Constructing the Premise
1. It should specify in more detail the frame of
reference suggested in the introductory statement.
2. It should present the specific problem to be solved.
3. It should be expressed clearly and concisely.
4. Errors of language usage should be avoided.
Rules in Constructing the Premise
5. All superfluous and unnecessarily difficult words
should be eliminated.
6. Highly technical terms should be excluded unless
they are essential to the concept being measured.
7. Homogeneity of premises should be strictly
followed.
8. The number of premises in the cluster should rarely
exceed six or seven.
9. The total cluster – the introductory statement,
premises and alternates – must appear on the same
page of the test
Rules in Constructing the Alternates
Each of the alternates must be grammatically
appropriate to each premise of the cluster.
They should all be equally appealing.
The number of alternates should be greater
than the number of premises, unless the
alternate can be used more than once.
Varieties of Matching Type Test
Imperfect matching type – an alternate can
be used more than once
Multi-matching variety – three or more
columns
Simple Recall Type
(Supply Type)
Advantages:
Minimizes guessing
Measures retention of specific points and demands
accurate information
Can measure higher level of cognitive skills
Limitations:
May adversely affect study habits – focus on
memorization of facts
Scoring is not as objective as in other objective tests
It takes longer time to answer
Rules in Constructing Supply Type
Test
1. Avoid indefinite statements
2. Do not over-mutilate your statements
3. Omit only the key words or phrases
rather than the trivial ones.
4. Make the blanks of uniform length.
5. Place the blanks near the end of the
statement rather than at the beginning.
Rules in Constructing Supply Type
Test
6. Avoid the use of extraneous hints designed to
help the student identify the correct answer.
7. Always indicate the units in which the answer is
to be expressed for those that could have
several possible answers
8. Avoid lifting directly from the textbook.
9. Avoid grammatical clues to the correct answer.
Essay Test
Its most important feature is the
freedom of response
Types:
Extended-response type
Restricted-response type
Before constructing Essay test,
ask the following:
How do we relate the essay item to one or
more cells of the table of specifications?
How can we adequately sample the pertinent
subject matter?
How do we adapt the essay item to fit the
academic background of the students?
How do we determine the amount of freedom
of response to be allowed?
How do we establish a suitable time allotment
for the student’s response?
General Considerations in
preparing good essay test
1. Give adequate time and thought to
the preparation of essay questions.
2. The question should be written so that
it will elicit the type of behavior you
want to measure.
3. Establish a framework within which
the student will operate when he
answers the question
General considerations in
preparing good essay tests
4. Decide in advance what factors will be
included in an essay response.
5. Do not provide optional questions in an
essay test.
6. Adapt the length of responses and the
complexity of the question and answer
to the maturity of the students.
7. Prepare a scoring key.
Rules in constructing Essay
Test Items
DON’TS:
1. Don’t begin an essay question with
“discuss” when the question fails to
provide a basis for or the elimination
or the focus of the discussion.
Example:
Discuss the general body plan of animals.
DON’TS:
2. Don’t ask for an expression of opinion
when your intent is to measure student
learning or the ability to present
evidence for or against.
Example:
What do you think of the present
administration?
DON’TS
3. Don’t introduce essay questions with “write
all you know about,” “in your opinion,” “what
do you think,” and so on.
4. Don’t ask a comparison without clearly
specifying the basis or bases of which the
comparison will be made.
Example:
Compare gases, liquids and solids.
DO’S:
1. Do use content or organizational
requirements which demand that the
student use the material covered
rather than merely reproduce it.
Example:
What are the characteristics of gases?
DO’S:
2. When dealing with controversial issues
ask for and evaluate the presentation of
evidence for or against a position and
not the position itself.
Example 1 (poor)
What is your position regarding the use
of Filipino in teaching the sciences?
DO’S:
Example 2: (improved)
A DepEd order has set the stage for the use of
Filipino in the teaching of the sciences. How would
you react to this order based on:
a. the readiness of the Philippine educational system
to use Filipino.
b. possible effects of this shift in the use of a
medium of instruction on the students.
c. chances of attracting foreign nationals to study in
the Philippines.
Scoring the Essay Test
1. Score only one question at a time for all
papers.
2. Try to score all responses to a particular
question without interruption.
3. Score the papers anonymously.
4. Score only on the factor(s) you decided will
be considered.
5. Decide on the scoring system and use it
consistently.
6. Score the papers yourself
Improving Test Items
Item analysis
– Level of difficulty
– Item-discriminating power
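The two indices named above have standard textbook definitions: the difficulty index is the proportion of examinees who answered the item correctly, and a common discrimination index is the difference between the proportions correct in the upper and lower scoring groups (often the top and bottom 27% of examinees). A sketch under those definitions (the function names are my own):

```python
def difficulty_index(correct: int, examinees: int) -> float:
    """Proportion of examinees who answered the item correctly (0.0-1.0).

    Values near 1.0 indicate an easy item; values near 0.0, a hard one.
    """
    return correct / examinees

def discrimination_index(upper_correct: int, lower_correct: int,
                         group_size: int) -> float:
    """D = p_upper - p_lower for equal-sized upper and lower groups.

    A positive D means high scorers answered the item correctly more
    often than low scorers, i.e. the item discriminates as intended.
    """
    return (upper_correct - lower_correct) / group_size

# 30 examinees, 18 correct; of the top 10 scorers 9 were correct,
# and of the bottom 10 scorers only 3 were.
print(difficulty_index(18, 30))        # 0.6
print(discrimination_index(9, 3, 10))  # 0.6
```

Items with near-zero or negative discrimination are the first candidates for revision or removal when improving a test.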
Sample Rubric for Assessing
Projects/Outputs
Levels       Points   Indicators
Exemplary       6     Work/project is exceptional and impressive. A distinctive and
                      sophisticated application of knowledge and skills is evident.
Strong          5     Work/project exceeds the standard; thorough and effective
                      application of knowledge and skills is evident.
Proficient      4     Work/project meets the standard; it is acceptable and displays
                      the application of essential knowledge and skills.
Developing      3     Work/project does not yet meet the standard; shows basic but
                      inconsistent application of knowledge and skills; work needs
                      further development.
Emerging        2     Work/project shows partial application of knowledge and
                      skills; lacks depth or is incomplete and needs considerable
                      development; errors and omissions are present.
Learning        1     No work presented.
Rubric for Assessing Assignments
A. Content
Points Indicators
5 81-100% of the task required is correctly answered with supporting
evidences/explanations.
4 61-80% of the task required is correctly answered with supporting
evidences/explanations
3 41-60% of the task required is correctly answered with supporting
evidences/explanations
2 21-40% of the task required is correctly answered with supporting
evidences/explanations
1 1-20% of the task required is correctly answered with supporting
evidences/explanations
0 No correct answer or no task was accomplished
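The Content rubric above is a step function from the percent of the task correctly answered to points. A small sketch of that mapping (the function name is my own):

```python
def content_points(percent_correct: float) -> int:
    """Map percent of the task correctly answered to Content rubric points.

    Bands follow the table above: 81-100% -> 5, 61-80% -> 4, 41-60% -> 3,
    21-40% -> 2, 1-20% -> 1, and 0 (nothing correct) -> 0.
    """
    for points, lower_bound in ((5, 81), (4, 61), (3, 41), (2, 21), (1, 1)):
        if percent_correct >= lower_bound:
            return points
    return 0

print(content_points(85))  # 5
print(content_points(40))  # 2
```

Encoding a rubric this way makes the band boundaries explicit and easy to audit, which helps keep scoring consistent across papers.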
Rubric for Assessing Assignments
B. Organization
Points Indicators
5 Answers/ideas are clearly stated and in logical sequence
4 Answers/ideas are clearly stated but not more than 25% errors in
sequencing
3 Up to 75% of the answers/ideas are clearly stated and with not more
than 50% errors in sequencing
2 Up to 50% of the answers/ideas are clearly stated and with not more
than 75% errors in sequencing
1 Less than 50% of the answers are clearly stated and with major
errors in sequencing
0 No answer at all
Rubric for Assessing Assignments
C. Sources of Information (Optional)
Points Indicators
5 With 9-10 sources of information
4 With 7-8 sources of information
3 With 5-6 sources of information
2 With 3-4 sources of information
1 With 1-2 sources of information
0 None at all
Sample Rubric for Assessing
Participation (English)
Ratings   Criteria
90-100    Clear language. Demonstrates a complete understanding of the
          subject matter. Language is precise and varied. Speaks in clear,
          correct English appropriate to the situation.
80-89     Clear language. Demonstrates a good understanding of the key
          concepts, but explanations could be more detailed. Adequate
          vocabulary is used fairly well.
70-79     Adequate oral skills, but sometimes indicates confused thinking
          about a concept.
60-69     Vocabulary is marginal. Inadequate and incomplete explanations,
          indicating poor understanding of the subject matter.
Sample rubric for Assessing
Participation (Group)
Points   Understanding of Concept                     Teamwork & Cooperation
5        5 correct answers within the time limit      All members of the group
                                                      participated
4        4 correct answers within the time limit;     4 out of 5 members
         5 correct answers beyond the time limit      participated actively
3        2-3 correct answers within the time limit;   3 out of 5 members of the
         3-4 correct answers beyond the time limit    group participated actively
2        0-1 correct answer within the time limit;    2 out of 5 members of the
         1-2 correct answers beyond the time limit    group participated actively
1        No correct answer beyond the time limit      Only the leader performed
                                                      the task
THANK YOU
FOR LISTENING!