Assessment
Definition
• Assessment is a systematic approach to
collecting information and making
inferences about the ability of a student or
the quality or success of a teaching course
on the basis of various sources of
evidence.
• Assessment may be done by test, interview,
questionnaire, observation, etc.
• Students may be tested at the beginning and
again at the end of a course to assess the quality
of the teaching on the course.
• The term “testing” is often associated with
large-scale standardized tests, whereas the term
“assessment” is used in a much wider sense to
cover a variety of approaches, of which testing
is only one.
• Assessment of student learning is a process that:
• Provides data/information you need on your
students’ learning.
• Engages you and others in analyzing and using
this data/information to confirm and improve
teaching and learning.
• Produces evidence that students are learning the
outcomes you intended.
• Guides you in making educational and institutional
improvements.
• Evaluates whether changes made improve/impact
student learning.
• Documents the learning and your efforts.
• There are two types of assessment:
1- Formative Assessment or Assessment During
Instruction:
• It is assessment during the course of instruction rather
than after it is completed.
• Its emphasis is on assessment for learning rather than
assessment of learning.
• A formative test is a test that is given during a
course of instruction and that informs both the
student and the teacher how well the student is
doing.
• A formative test shows whether the student
needs extra work or attention.
• This ongoing observation and monitoring of students’
learning while you teach informs you about what to do
next.
• Formative assessment helps you set your teaching at a
level that challenges students and stretches their
thinking. It also helps you to detect which students
need your individual attention.
The importance of feedback
• Providing effective feedback is an essential aspect of
formative assessment and has always been an integral
aspect of good teaching.
• The idea is to not only continually assess students as they
learn but to provide informative feedback so that students’
focus is appropriate.
• The important aspects of feedback in formative
assessment are that it should be immediate, specific,
and individualized.
2- Summative Assessment or Post-instruction
Assessment:
• It is assessment after instruction is finished, with the
purpose of documenting student performance.
• A summative test is a test given at the end of a course
of instruction that measures or “sums up” how much a
student has learned from the course. A summative test
is usually a graded test, i.e., it is marked according to a
scale or set of grades.
• Summative assessment provides information
about:
- How well your students have mastered the
material.
- Whether students are ready for the next unit.
- What grades they should be given.
- What comments you should make to parents.
- How you should adapt your instruction.
Assessment and motivation
• Assessments that are challenging but fair
should increase students’ enthusiasm for learning.
• Assessments that are too difficult will lower students’ self-esteem and
self-efficacy, as well as raise their anxiety.
• Assessing students with measures that are too easy will bore them
and not motivate them to study hard enough.
Characteristics of high quality assessment
1- Validity:
• The degree to which a test measures what it is supposed to measure,
or can be used successfully for the purposes for which it is
intended.
• The most important source of information for validity
in the classroom is content-related
evidence, or the extent to which the assessment
reflects what you have been teaching.
2- Reliability:
• Reliability is the extent to which a test produces consistent, reproducible scores.
• A test is said to be reliable if it gives the same results when it is given
on different occasions or when it is used by different people.
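One common way to put a number on "the same results on different occasions" is to correlate the scores from two sittings of the same test (test-retest reliability). The sketch below is illustrative only: the scores are invented, and the helper name `pearson_r` is an assumption, not part of the original notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same five students on two occasions
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 19, 13]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests the test ranks students consistently across occasions; a value close to 0 suggests the scores are not reproducible. The same calculation can be applied to the marks given by two different scorers.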
3- Fairness:
• Assessment is fair when all students have an equal
opportunity to learn and demonstrate their knowledge and
skill.
• Assessment is fair when it reflects the learning targets,
content, and instruction.
• Assessment bias includes offensiveness and unfair
penalization.
• Assessment is offensive to a subgroup of students when
negative stereotypes of that subgroup are included in the
test.
• An assessment also may be biased if it unfairly
penalizes a student based on the student’s group
membership, such as ethnicity, socioeconomic status,
gender, religion, and disability.
4- Practicality:
• Practicality refers to the logistical, practical, and
administrative issues involved in assessment.
• It refers to the extent to which the demands of test
specifications can be met within the limits of existing
resources such as human resources, material
resources and time.
5- Authenticity:
• Authentic assessment uses a format that is consistent with
how ability is evaluated in the real world and evaluates skills
and abilities that have value and meaning outside the
classroom, for example on the job.
• Authentic assessment involves those activities or tasks that
people actually do in the real world. Indeed, authentic is
often treated as a synonym for realistic.
Norm-referenced vs Criterion-referenced
assessment:
• Norm-referenced: the assessment compares a student’s
performance with that of their peers.
• Criterion-referenced: the assessment compares a student’s
performance with a fixed criterion, such as mastery of the course
content, rather than with other students.
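The difference between the two interpretations can be seen by reading the same score in both ways. This is a minimal sketch: the class scores, the cutoff of 75, and the function names are hypothetical values chosen for illustration.

```python
def percentile_rank(score, peer_scores):
    """Norm-referenced reading: percentage of peers who scored below this score."""
    below = sum(1 for s in peer_scores if s < score)
    return 100 * below / len(peer_scores)

def meets_criterion(score, cutoff):
    """Criterion-referenced reading: compare the score with a fixed standard."""
    return score >= cutoff

class_scores = [55, 62, 70, 70, 81, 90]  # hypothetical class results
MASTERY_CUTOFF = 75                      # hypothetical mastery standard
student_score = 70

rank = percentile_rank(student_score, class_scores)
mastered = meets_criterion(student_score, MASTERY_CUTOFF)
print(f"percentile rank: {rank:.1f}, mastery reached: {mastered}")
```

The norm-referenced result changes if the peer group changes, while the criterion-referenced result depends only on the student's own score and the chosen standard.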
Current trends in assessment
Using at least some performance-based assessment:
Performance assessments require students to create
answers or products that demonstrate their knowledge or
skill.
• Examples of performance assessment include writing an essay,
conducting an experiment, carrying out a project, solving a real-world
problem, and creating a portfolio.
TRADITIONAL TESTS
• They are basically paper-and-pencil tests.
• Traditional modes of assessment are thought not to capture important
information about test takers’ abilities in L2 and are also not thought
to reflect real-life conditions.
Traditional Assessment
• Became very popular in schools in the 1980s, when grades were based
almost entirely on the results of multiple-choice items on
classroom tests that were similar to standardized tests.
• Research shows that in some cases there is very little relationship
between such test grades and any measure of students' ability to
apply knowledge.
• These types of tests emphasize memory over more complex skills
with a focus on grades and ranking.
• The formats of these items tend to be very similar. The conditions
under which the tests are given are also standardized (amount of
time, no talking).
I- Selected-Response Items:
• Selected-response items have an objective format that allows
students’ responses to be scored quickly.
• A scoring key for correct responses is created and can be
applied by an examiner or by a computer.
• Multiple-choice, true/false, and matching items are the most
widely used items of this type.
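Applying a scoring key by computer can be as simple as comparing each response with the keyed answer. The sketch below assumes one point per correct item and no penalty for wrong or missing answers; the item labels, answers, and the function name `score_responses` are made up for illustration.

```python
def score_responses(answer_key, responses):
    """Return the number of items whose response matches the scoring key.

    Unanswered items (missing from `responses`) simply earn no point.
    """
    return sum(
        1
        for item, correct_answer in answer_key.items()
        if responses.get(item) == correct_answer
    )

# Hypothetical key covering a multiple-choice, a true/false, and a matching item
answer_key = {"Q1": "B", "Q2": "True", "Q3": "D"}
student_responses = {"Q1": "B", "Q2": "False", "Q3": "D"}

score = score_responses(answer_key, student_responses)
print(f"{score} out of {len(answer_key)}")
```

Because the key leaves no room for judgment, any examiner or program applying it to the same responses arrives at the same score.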
Strengths:
• Both simple and complex learning outcomes can be
measured.
• The task is highly structured and clear.
• A broad sample of achievement can be measured.
• Scoring is easy, objective, and reliable.
Limitations
• Constructing good items is time consuming.
• It is frequently difficult to find plausible distractors.
• This format is ineffective for measuring some types of problem
solving and the ability to organize and express ideas.
• Scores are more influenced by guessing (MTQ)
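The effect of guessing can be estimated with simple arithmetic: a student who guesses blindly on every item is expected to get the number of items divided by the number of options per item correct. The sketch below assumes every item has the same number of equally attractive options, which real tests only approximate; the function name is hypothetical.

```python
def expected_chance_score(num_items, options_per_item):
    """Expected number of correct answers from blind guessing on every item."""
    if num_items < 0 or options_per_item < 1:
        raise ValueError("need a non-negative item count and at least one option")
    return num_items / options_per_item

# A 20-item test answered entirely by guessing
four_option_mc = expected_chance_score(20, 4)  # four-option multiple choice
true_false = expected_chance_score(20, 2)      # true/false items
print(four_option_mc, true_false)
```

Under these assumptions, a guesser expects a quarter of the marks on four-option multiple-choice items and half the marks on true/false items, which is why low scores on such tests carry little information.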
II- CONSTRUCTED-RESPONSE ITEMS
• Constructed-response items require students to write out information
rather than select a response.
• In scoring, many constructed-response items require judgment on the
part of the examiner.
• Examples: short-answer items and essays.
Strengths:
• The highest level of learning outcomes (analysis, synthesis,
evaluation) can be measured.
• The integration and application of ideas can be emphasized.
Limitations:
• Achievement may not be adequately sampled due to the time
needed to answer each question.
• Scores are raised by writing skill and lowered by poor
handwriting, misspelling, and grammatical errors.
• Scoring is time consuming, subjective, and possibly unreliable.
Alternative assessment
• Alternative assessment refers to various types of assessment
procedures that are seen as alternatives or complements to
traditional testing.
• Procedures used in alternative assessment include self-
assessment, peer assessment, portfolios, learner diaries or
journals, student–teacher conferences, interviews, and
observation.
• It is performance based, so it is sometimes called “performance”
assessment.
• Uses activities that reveal what students can do, emphasizing their
strengths instead of their weaknesses.
• Alternative assessment instruments are by necessity designed and
structured differently and are graded and scored differently.
• Moving from traditional assessment with objective tests to alternative
or performance assessment has been described as going from
“knowing” to “showing”.
• Works well in learner-centred classrooms since it is based on the
idea that students can evaluate their own learning and learn from the
evaluation process.
• A detailed scoring rubric is essential: evaluation criteria and standards
are known to the student.
• Involves interaction between the assessor (instructor/peers/self) and
the person being assessed.
• One current trend is to require students to solve some type of
authentic problem or to perform in terms of completing a project or
demonstrating other skills outside the context of a test or an essay.
• Another trend is to have students create a learning portfolio to
demonstrate what they have learned.
• Built around topics or issues of interest to the students.
• Replicates real world communication contexts and situations.
• Involves multi-stage tasks and real problems that require creative use
of language rather than simple repetition.
• Requires learners to produce a quality product or performance.
Characteristics of alternative/performance
assessment
• Performance assessments often include an emphasis on “doing”
open-ended activities for which there is no correct, objective answer
and that may assess higher-level thinking.
• Performance assessment tasks sometimes are realistic.
• Performance assessments are designed to evaluate what students
know and can do.
• They provide a context for evaluating students’ higher-level thinking
skills, such as the ability to think deeply about an issue or a topic.
• Performance assessments often take considerably more time to
construct, administer, and score than objective tests.
• Difficult to assign a score to a single student performance (relative to
marking a multiple-choice test, for example).
• Portfolios can be time consuming to evaluate for a teacher.
• Comprehensive set of scoring guidelines (rubrics) must be developed
in order to accurately judge completeness and quality (also time
consuming).
• Lack of use in traditional educational institutions (universities).
• Most instructors/teachers have no training to develop these types of
assessments.
• Difficult to develop and grade
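One way to make rubric-based scoring more consistent is to write the scoring guidelines down as explicit criteria with a fixed point range and to check every rating against them. The sketch below is only an illustration: the three criteria, the 0 to 4 scale, and the function name `rubric_total` are assumptions, not a rubric taken from these notes.

```python
# Hypothetical rubric: criterion name -> maximum points available
RUBRIC = {"content": 4, "organization": 4, "language use": 4}

def rubric_total(ratings, rubric):
    """Validate one assessor's ratings against the rubric and return the total."""
    for criterion, max_points in rubric.items():
        if criterion not in ratings:
            raise ValueError(f"missing rating for criterion: {criterion}")
        if not 0 <= ratings[criterion] <= max_points:
            raise ValueError(f"rating for {criterion} must be between 0 and {max_points}")
    return sum(ratings[criterion] for criterion in rubric)

# Hypothetical ratings for one student's project
ratings = {"content": 3, "organization": 4, "language use": 2}
total = rubric_total(ratings, RUBRIC)
max_total = sum(RUBRIC.values())
print(f"{total} out of {max_total}")
```

Sharing the same criteria and point ranges with students in advance also makes the evaluation standards known to them before they perform the task.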
In summary
• It is difficult to dismiss traditional assessment entirely.
• Ideally, a combination of both is used. That way, a teacher can address many
learning styles yet still prepare students for a university or exam
experience.
Washback
• Washback refers to the influence of language testing on teaching and
learning.
• The test determines the activities that occur in the classroom.
Who or what might be affected:
• Teaching
• Learning
• Content
• Rate of learning
• Sequence of teaching/learning
• Degree/depth of curriculum coverage
• Attitudes of teachers/learners
• Negative effects:
• Restriction of content – narrowing of curriculum
• Too much time practising for the test
• Positive effects:
• Transparent objectives and outcomes
• Increased motivation of learners
• Increased accountability of teachers