1. Language testing and assessment
Definition of test
A test is a sample of an individual’s behaviour/performance on the basis of which inferences
are made about the more general underlying competence of that individual.
Language tests involve any kind of measurement/examination technique which aims at
describing the test taker’s foreign language proficiency, e.g. oral interview, listening
comprehension task, or free composition writing.
Language tests may differ in test method and purpose.
Test types based on the testing method
Paper-and-pencil language tests
assessment of
o a separate component of the language (grammar, vocabulary)
o receptive understanding (reading, listening)
test item: fixed response format (a number of possible responses is presented and the
candidate is required to choose one, e.g. multiple choice)
o correct answer: key; incorrect answers: distractors
o distractors are chosen based on observations of typical errors of learners
not useful in testing the productive skills (except indirectly)
Performance-based tests
skills are assessed in an act of communication
assessment of speaking and writing
the samples are elicited through simulations of real-world tasks in realistic contexts
test taker is assessed by trained raters using an agreed rating process
Test types based on the purpose
Achievement tests
associated with the process of instruction
during or at the end of a course of study
whether and where progress has been made in terms of the goals of learning
should support the teaching to which they relate
possible negative effect on teaching: teaching to the test
may be self-enclosed: they may bear no direct relationship to language use
o successful performance does not necessarily indicate successful achievement
relate to the past: they measure what students have learned as a result of teaching
alternative assessment: rather than teaching and studying for the test, involve students in
assessment and enable them to self-assess their progress
Proficiency tests
relate to the future situation of language use
o without necessary reference to any previous process of teaching
based on a specification of what candidates have to be able to do in a language
criterion: the students’ real-life language use
include performance features, where characteristics of the criterion setting are represented
o e.g.: test of communicative abilities of a health professional: communicating with
patients
e.g. admission to a foreign university or an occupation requiring L2 skills
The criterion:
- criterion: the relevant communicative behaviour in the target situation; the series of
performances subsequent to the test (the target)
- test: a performance representing samples from the criterion
- some teachers question the value of direct testing: how can you test behaviour?
Other limits to testing:
- authenticity: there is an inevitable gap between the test and the criterion
- validity: generalizability; does the test actually measure what it is supposed to measure?
- observer’s paradox: the very act of observing a performance can change that performance
Reliability
Reliability shows how precise the measurement is. The scores obtained should be very similar to
those which would have been obtained by the same students, with the same ability, at a different time.
The reliability coefficient:
quantifies the reliability of a test (a value between 0 and 1)
allows the reliability of different tests to be compared
ideal = 1 → the test would give the same results for a particular set of candidates regardless
of when it was administered
what counts as good can differ for different types of language tests
o a good vocabulary or reading test is between .90 and .99
o auditory comprehension is often .80-.89
o and oral production may be .70-.79
it also depends on the importance of the decisions that are to be taken on the basis of
the test
determining reliability → two sets of scores are needed for comparison
o Test-retest method: get a group of subjects to take the same test twice (problematic:
they are likely to recall items, learning or forgetting might take place between the two
tests, and motivation to take the same test twice is low)
o Alternate forms method: use two different forms of the same test; such forms are often
not available
o Split half method: each subject is given two scores, one for each half of the test → the
scores are used as if the same test had been taken twice → provides a coefficient of
internal consistency (see the sketch below)
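A minimal sketch of the split half method, assuming a made-up score matrix and applying the
standard Spearman-Brown correction; all data and numbers are illustrative, not from any real test.

```python
# Split-half reliability sketch (made-up data).
# Requires Python 3.10+ for statistics.correlation.
from statistics import correlation

# Rows = candidates, columns = items, 1 = correct answer.
items = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 1, 0, 1],
]

# Give each subject two scores: one per half (odd vs even item positions).
half1 = [sum(row[0::2]) for row in items]
half2 = [sum(row[1::2]) for row in items]

# Treat the two halves as if the same test had been taken twice.
r_half = correlation(half1, half2)

# Spearman-Brown correction: estimate the reliability of the full-length
# test from the correlation between its two halves.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.2f}, internal consistency = {r_full:.2f}")
```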
The standard error of measurement and the true score
Classical test theory assumes that each person has a true score: the score that would be obtained
if there were no errors in measurement (if the same test could be taken over and over again
without being affected by circumstances, the scores would still vary, and their average would be
the true score).
Standard error of measurement (SEoM)
based on the reliability coefficient and a measure of the spread of all the scores on the test
o e.g. SEoM = 5 and a candidate scores 56 → his/her true score most probably lies between
51 and 61 (the observed score ± one SEoM)
statements based on what is known about the pattern of scores that would occur if it were
possible to take the test over and over again
Item Response Theory (IRT): estimates how far an individual test taker’s actual score is likely
to diverge from their true score; the estimate is made for each individual, based on their
performance on each of the items on the test
standard error of measurement serves to remind us that in the case of some individuals
there is quite possibly a large discrepancy between actual score and true score
Reliability cannot be estimated directly since that would require one to know the true scores, which
according to classical test theory is impossible.
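A minimal sketch of where a figure like SEoM = 5 can come from. The classical formula
SEoM = SD × √(1 − reliability) is standard; the spread and reliability values below are invented
to match the example above.

```python
# Standard error of measurement sketch (illustrative numbers).
import math

sd = 12.5           # spread (standard deviation) of all scores, assumed
reliability = 0.84  # reliability coefficient, assumed

# Classical test theory: SEoM = SD * sqrt(1 - reliability)
seom = sd * math.sqrt(1 - reliability)   # 12.5 * 0.4 = 5.0

observed = 56
# With normally distributed errors, the true score falls within one SEoM
# of the observed score roughly 68% of the time.
print(f"SEoM = {seom:.0f}")
print(f"true score likely between {observed - seom:.0f} and {observed + seom:.0f}")
```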
Scorer reliability
Ideally, the same scorer should give the same scores regardless of the circumstances, and this
would be the same score as any other scorer would give on any occasion.
Scorer reliability coefficient → quantifies the level of agreement given by the same or
different scorers on different occasions
o scorer reliability coefficient of a multiple choice test: 1 (requires no judgement)
o Interview → a degree of judgement is called for on the part of the scorer, so perfect
consistency is not to be expected
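A minimal sketch of quantifying scorer reliability, assuming made-up band scores that two trained
raters gave the same eight interview candidates; a plain correlation stands in here for the more
elaborate agreement statistics used in operational exams.

```python
# Inter-rater (scorer) reliability sketch (made-up ratings).
# Requires Python 3.10+ for statistics.correlation.
from statistics import correlation

rater_a = [6, 7, 5, 8, 4, 6, 7, 5]  # band scores, 0-9 scale
rater_b = [6, 8, 5, 7, 4, 5, 7, 6]  # same candidates, second rater

# 1.0 would mean perfect agreement (as with machine-scored multiple
# choice); where judgement is involved, somewhat lower values are normal.
r = correlation(rater_a, rater_b)
print(f"scorer reliability coefficient = {r:.2f}")
```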
How to make tests more reliable
Take enough samples of behaviour (the more items a test has, the more reliable the
test will be)
o each additional item should represent a fresh start → it yields additional information
Exclude items which do not discriminate well between weaker and stronger students, as they
contribute little to the reliability of a test (they are either too easy or too difficult; a
sketch of this screening follows the list)
Do not allow too much freedom in answering → depressing effect on reliability
Write unambiguous items
Provide clear and explicit instructions, both in written and oral tasks
Ensure that tests are well laid out and perfectly legible
Make candidates familiar with format and design techniques
Provide uniform and non-distracting conditions of administration
Use items that permit scoring which is as objective as possible (multiple choice, or open-ended
gap-filling questions with one-word answers)
Make comparisons between candidates as direct as possible (e.g. provide 2 compulsory
composition items rather than a choice of 6)
Provide a detailed scoring key
Train scorers
Agree on acceptable responses and appropriate scores at the outset of scoring
Identify candidates by number, not by name
Have multiple, independent scorers
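As noted in the list above, a rough way to screen out poorly discriminating items is to correlate
each item with the score on the rest of the test. A minimal sketch, with a made-up score matrix;
the 0.2 cut-off is only a common rule of thumb.

```python
# Item discrimination screen (made-up 0/1 score matrix).
# Requires Python 3.10+ for statistics.correlation.
from statistics import correlation, StatisticsError

scores = [          # rows = candidates, columns = items
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
]

for i in range(len(scores[0])):
    item = [row[i] for row in scores]             # 0/1 on this item
    rest = [sum(row) - row[i] for row in scores]  # total minus this item
    try:
        d = correlation(item, rest)
        flag = "" if d >= 0.2 else "  <- weak, review or exclude"
        print(f"item {i + 1}: discrimination = {d:.2f}{flag}")
    except StatisticsError:
        # No variance: everyone got it right (too easy) or wrong (too
        # difficult), so the item cannot discriminate at all.
        print(f"item {i + 1}: no variance, discriminates nobody")
```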
Reliability and validity
Reliability corresponds to precision (how consistent the measurement is), while validity
corresponds to accuracy (whether the test measures what it is intended to measure).
An example often used to illustrate the difference between reliability and validity in the experimental
sciences involves a common bathroom scale. If someone who weighs 200 pounds steps on a scale 10 times
and gets readings of 15, 250, 95, 140, etc., the scale is not reliable. If the scale consistently reads
"150", then it is reliable but not valid. If it reads "200" each time, the measurement is both
reliable and valid.
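A minimal sketch of the scale analogy in numbers, with made-up readings: the spread of repeated
readings corresponds to reliability (precision), and the distance of their average from the true
weight corresponds to validity (accuracy).

```python
# Reliability vs validity via the bathroom-scale analogy (made-up data).
from statistics import mean, stdev

true_weight = 200
scales = {
    "unreliable":          [15, 250, 95, 140, 185, 210, 60, 230, 175, 120],
    "reliable, not valid": [150, 151, 149, 150, 150, 151, 149, 150, 150, 150],
    "reliable and valid":  [200, 200, 199, 201, 200, 200, 200, 199, 201, 200],
}

for name, readings in scales.items():
    # Low spread = reliable; mean close to the true weight = valid.
    print(f"{name}: mean = {mean(readings):.0f}, "
          f"spread = {stdev(readings):.1f}, true = {true_weight}")
```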