Developing a Technology-Based Classroom Assessment of
Academic Reading Skills for English Language Learners and
Teachers: Validity Evidence for Formative Use
Mikyung Kim Wolf * and Alexis A. Lopez
Abstract: In U.S. K-12 schools, adequate education of English language learner (EL) students, particularly to support their attainment of English language and literacy skills, has attracted heightened attention. The increased academic rigor as well as sophisticated disciplinary language demands embodied in current academic content standards have posed considerable challenges to EL students. To address students' needs, the present study utilized formative assessment as a means to support the teaching and learning of academic reading skills for EL students. We also endeavored to test our underlying assumption that sound assessment tools would facilitate effective formative assessment processes. In this study, we devised a technology-based assessment tool considering the increasing use of technology in K-12 schools. As a small-scale, exploratory study, we examined the usability and validity of the tool for formative purposes with three ESL teachers and their students (62 EL students) from secondary schools. The results indicated that the tool had the potential to extend teachers' and students' formative assessment practices in principled ways. However, we also found some teachers' misconceptions about the tool's purpose and their limited implementation skills to utilize the tool for formative assessment purposes. Implications for practice and further research are discussed.

Keywords: academic reading skills; classroom assessment; English language learners; formative assessment; technology-based language assessment; validity
1. Introduction

Helping English language learner (EL) students develop appropriate English language proficiency for their academic success has been a pressing issue in U.S. K-12 education.
As a powerful instructional strategy, formative assessment has been gaining renewed interest among researchers and practitioners for its contribution to learning outcomes (e.g., Black and Wiliam 2010; Gan and Leung 2019; Heritage 2013; Ruiz-Primo et al. 2014).
By definition, formative assessment involves an ongoing assessment process to identify
students’ status in relation to learning goals and to provide targeted instruction based
on individual students' needs. Considering EL students' diverse backgrounds in terms of language proficiency levels, formal schooling experience, length of U.S. residence, and cultures, formative assessment can help teachers provide tailored instruction that addresses individual EL students' needs.
For effective formative assessment, teachers’ assessment knowledge and implementa-
tion skills are essential. Previous research has underscored the importance of professional
development and support to realize the intent of formative assessment in practice (e.g., Leung 2004; Tsagari and Vogt 2017). However, teachers' busy schedules add to the challenges
of devising or selecting appropriate methods and planning for the execution of effective
formative assessment. Furthermore, not all teachers are equipped with the linguistic and
language instruction knowledge necessary to implement formative assessment for EL
students (Callahan 2013).
Against this backdrop, in this study, we attempted to devise a classroom-based as-
sessment tool to facilitate effective formative assessment processes for EL students and
their teachers. With the development of a prototype assessment tool and its usability study,
we aimed to examine the characteristics associated with effective formative assessment
in the context of EL education. We anticipated that study findings would provide useful
information to enhance such assessment tools and formative assessment processes, thereby
contributing to teaching and learning for EL students.
Figure 1. A formative assessment framework to develop and validate the study assessment tool for formative use.
Considering the increased use of technology and computer-based materials in current U.S. K-12 education, we built all of these elements into a computer-based learning management system. Despite the widespread use of technology in schools, technology-based classroom assessment tools for EL students for formative purposes are scant (Hamill et al. 2019). By developing an assessment tool using technology features (e.g., computer delivery, multimedia integration, various item/task formats, immediate scoring/feedback), we intended the tool to be easily integrated into current instructional settings. In addition, we aimed to examine any technology-specific advantages or disadvantages that teachers and students might encounter in using such a tool for formative assessment purposes.
3. Academic Reading Skills and Reading Standards in U.S. K-12 School Settings
Since formative assessment should be an integral part of instruction, it is important
to align a target construct of assessment with a school’s standards and/or curricula in the
context of K-12 education. The target construct of academic reading skills in this study was
driven by the reading standards adopted widely across U.S. K-12 schools (i.e., Common
Core State Standards). In this section, we briefly describe the reading standards on which
this study was based and the academic reading skills expected of students, including EL
students, in U.S. K-12 school contexts.
As part of standards-based education reform, U.S. K-12 schools must adopt academic
content and English language proficiency standards. These standards then guide what is
taught and assessed. In efforts to prepare all students for college and careers, a nationwide
initiative took place in U.S. K-12 education a decade ago, resulting in a new set of standards
named the Common Core State Standards. Subsequently, many states and schools adopted
the Common Core State Standards or revised their existing standards to be similar to the
Common Core State Standards. As the name indicates, these standards feature a set of
core knowledge and skills for students to achieve in order to be college- and career-ready.
In addition to increased academic rigor, the standards expect students to demonstrate
sophisticated language use (Bailey and Wolf 2020; Bunch 2013). The ten reading standards
of the Common Core State Standards feature analyzing both complex informational and
literary texts. They also expect students to be able to integrate and evaluate information and
arguments from multiple texts. The ten reading standards are consistent from kindergarten
to Grade 12 with a different degree of complexity for each grade level. The reading skills
manifested in current standards in U.S. K-12 schools tend to focus on higher-level skills in
academic contexts. To support the instruction of EL students, schools’ English language
proficiency standards were also revised in accordance with the expectations delineated in
the Common Core State Standards, emphasizing academic language proficiency to handle
materials and tasks in school settings.
Ample research on reading skills has suggested that reading is a multicomponent
construct (e.g., Alderson 2000; Koda 2004; Sabatini et al. 2012). Broadly speaking, reading
skills involve both lower-level and higher-level skills (Grabe 2009; Saito and Inoi 2017).
Lower-level skills include foundational reading skills such as decoding and processing
sentence structures (Bernhardt 2011; O’Reilly and Sheehan 2009). Higher-level skills involve
building a mental model of comprehended texts, including such skills as summarizing, making inferences, and integrating information from multiple sources beyond a literal understanding of texts (O'Reilly and Sheehan 2009; Saito and Inoi 2017). While academic reading skills commonly involve higher-level skills, lower-level skills are essential for performing higher-level skills. Although the current reading standards in U.S. K-12 schools focus on
higher-level reading skills, it is crucial to instruct and assess both lower- and higher-level
reading skills in order to support the needs of EL students with a wide range of English
language proficiency.
Figure 2. A screenshot of the assessment tool (learning management system) home page and key features.
As described in the previous section, for the assessment tool to be easily integrated with the school curriculum as part of their daily instruction, we selected one specific reading standard as a learning goal. This standard from the Common Core State Standards for English Language Arts-Reading in Grades 6–12 reads, "Delineate and evaluate the argument and specific claims in a text, including the validity of the reasoning as well as the relevance and sufficiency of the evidence" (Common Core State Standards Initiative 2010, p. 35). The content of this standard was also reflected in some U.S. K-12 English language proficiency standards. The ELPA21 English language proficiency standards that are currently being used in nine states contain the following standard, "Analyze and critique the argument of others orally and in writing" for Grades 6–8 (Council of Chief State School Officers 2014, p. 4). Based on these standards, we defined the learning goal and target construct as being able to comprehend argumentative texts and evaluate authors' claims along with evidence. We further specified three subconstructs of foundational, literal comprehension, and higher-order reading comprehension skills as a progression model. Then, specific tasks were devised to assess each subconstruct skill. Table 1 summarizes the subconstructs and task types of this assessment tool.
Table 1. The subconstructs and task types of the academic reading assessment tool.

The task name was shown for each item on the computer screen so that both students and teachers were aware of the types of skills in which they were engaged. This design was also intended to display a progression model so that teachers could easily identify students' current status and consider the next steps in relation to the overall learning goal and targeted construct.
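To make the progression model more concrete, the sketch below shows one plausible way to encode the three subconstructs and their task types so that each item can display its skill label on screen. The subconstruct names and task types are taken from this article (see Table 1 and the sample tasks in Appendix A), but the code itself is a hypothetical illustration, not the tool's actual implementation.

```python
# A hypothetical sketch of the tool's progression model: subconstructs
# ordered from foundational to higher-order skills, each paired with the
# task types named in Appendix A. Not the tool's actual data model.
PROGRESSION_MODEL = [
    ("Foundational skills", ["Working with words"]),
    ("Literal comprehension skills",
     ["Understanding main ideas", "Distinguishing facts from opinions"]),
    ("Higher-order comprehension skills",
     ["Working with argument structure", "Making connections"]),
]

def task_label(subconstruct: str, task_type: str) -> str:
    """Build the on-screen label that tells students and teachers
    which skill a given item targets."""
    return f"{subconstruct}: {task_type}"

for subconstruct, task_types in PROGRESSION_MODEL:
    for task_type in task_types:
        print(task_label(subconstruct, task_type))
```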
As a way to facilitate students’ understanding of learning goals, students’ self-assessment
was constructed in alignment with the subconstruct skills. Self-assessment questions were
then embedded in the Learning Goals component shown in Figure 2. Some examples of
self-assessment questions include Can you understand difficult words using clues in a text? Can
you understand the author’s main opinion or argument? Can you recognize and understand the
way that an author organized the text?
In developing assessment tasks, a few design features were applied to make this tool
specific to EL students’ needs. Those features included: (a) providing warm-up tasks in
order to activate EL students’ background knowledge on the topic of reading; (b) integrating
multiple language skills for students to unpack the passage to build comprehension (e.g.,
a text-to-speech feature, tasks requiring students to discuss and write about the reading
topic); and (c) providing scaffolded tasks while modeling a reading comprehension process
(e.g., sequencing the tasks to help students do a close reading of the text). Hence, several
warm-up tasks were designed for teachers to select at their discretion. For example, a warm-
up task asked students to talk in pairs on a topic relevant to the reading passage prior to
reading the passage. The main tasks in the assessment, then, began with introducing two
authors and telling students the purpose for reading. Figure 3 displays screenshots of a few sample tasks at the beginning of the Activities component to illustrate these features. Although students in Grades 6–8 are expected to have foundational reading skills, tasks
covering foundational reading skills were embedded in the tool as one way of scaffolding
for EL students, considering their developing English proficiency. The assessment activities
consisted of two parts. Part 1 was designed to be completed collaboratively with a peer and
provided opportunities for a teacher to observe and interact with students while eliciting
evidence about EL students’ reading skills. Part 2 was an individual assessment based on
the same reading passages as in Part 1. Appendix A provides some examples of Part 1 and Part 2 tasks to illustrate the reading tasks devised to assess each subconstruct skill.
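The two-part design described above can be summarized as a simple session configuration. The sketch below is a hypothetical encoding of that design for illustration only; the field names and values are ours, not the tool's.

```python
# Hypothetical summary of the assessment's structure as described in the text:
# teacher-selectable warm-up tasks, a collaborative Part 1, and an individual
# Part 2 based on the same reading passages.
ASSESSMENT_SESSION = {
    "warm_up": {
        "mode": "pair",
        "purpose": "activate background knowledge on the reading topic",
        "teacher_selectable": True,
    },
    "part_1": {
        "mode": "pair",  # teacher observes and interacts while eliciting evidence
        "scaffolds": ["text-to-speech", "task sequencing for close reading"],
    },
    "part_2": {
        "mode": "individual",
        "passages": "same as Part 1",
    },
}
```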
Figure 3. Screenshots of the sample tasks in the warm-up and introductory section.
5. Research Questions
Upon the development of the assessment tool, we undertook a small-scale usability
study to examine the extent to which the tool was utilized for intended formative purposes.
This usability study was intended not only to collect validity evidence but also to inform
the areas of further modification of the tool. Based on the formative assessment framework
seen in Figure 1, we formulated research questions regarding the quality of assessment
tasks and the roles of teachers and students in using the tool. Specifically, we posited the
following research questions:
1. Are there differences in EL students’ performance by subconstruct?
2. How do teachers implement the computer-based reading assessment tool for for-
mative purposes? How do teachers interpret assessment results and use them for
instructional adjustments for EL students’ reading skills?
3. How do EL students perceive learning goals and feedback in a formative assess-
ment process?
4. How do teachers perceive the usefulness of the computer-based formative assessment
program? How do teachers perceive the quality of the formative assessment tasks in
the tool for the targeted reading comprehension construct?
The first research question was concerned with the quality of assessment tasks. We
were particularly interested in examining our underlying assumption that there would be a
linear pattern of students’ performance by subconstruct (i.e., foundational skill tasks would
be easier than those assessing higher-order comprehension skills). The other research ques-
tions were also anticipated to provide insights into factors associated with the usefulness
of the assessment tool for formative purposes.
6. Method
We employed a multiple case study approach (Baxter and Jack 2008) aimed at ex-
ploring differences and similarities in implementing formative assessment across different
contexts. As a small-scale, exploratory study, this design enabled us to closely examine
how individual teachers and students utilized our tool for formative assessment purposes.
Below we describe the participants, study instruments, procedure, and data analysis.
6.1. Participants
In order to try out the present prototype materials in various settings, we recruited
three ESL teachers from three secondary level schools. The teachers were teaching their EL
students in different settings: (1) self-contained class (i.e., an ESL teacher teaching English
language arts and English language development to a class of EL students), (2) push-in/co-
teaching (i.e., an ESL teacher with a content teacher in an English class comprised of both
EL and non-EL students), and (3) pull-out (i.e., an ESL teacher with a small group of EL
students pulled out from mainstream English language arts classes).
In regard to students, a total of 62 EL students participated in this study, with 55% of the students being female. Of the EL students, 89% spoke Spanish as their home language,
while six different home languages were spoken among the remainder. Their English
language proficiency ranged from low-intermediate to intermediate levels based on results
on state English language proficiency assessments. Table 2 presents the class characteristics
and the background of teachers included in this study.
observation, each teacher participated in two interviews, one prior to the lesson and one
immediately afterward. All interviews were audio-recorded and transcribed. Regarding
students’ performance, the students’ responses on the selected-response questions were
machine-scored instantly and constructed-response questions were scored by teachers.
Based on the scores, a report was generated and accessible to teachers and students within
the tool’s ‘performance result’ section. During the lesson, the teachers helped students
review the results and complete all of the activities in the ‘next steps and resources’ sections
(plan for instruction, final self-assessment, and student survey). The students’ performance
data were also collected for analysis.
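As a rough sketch of this scoring flow, the code below shows how selected-response items could be machine-scored instantly while constructed-response items remain pending until the teacher enters a score, after which a report like the one in the 'performance result' section can be assembled. All names and structures here are our hypothetical illustration; the paper does not describe the tool's internals.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Response:
    item_id: str
    item_type: str             # "selected" or "constructed"
    answer: str
    key: Optional[str] = None  # answer key for selected-response items
    score: Optional[int] = None

def machine_score(resp: Response) -> Response:
    """Score selected-response items instantly; constructed-response
    items stay unscored until a teacher enters a score."""
    if resp.item_type == "selected":
        resp.score = int(resp.answer == resp.key)
    return resp

def build_report(responses: list[Response]) -> dict:
    """Assemble a simple 'performance result' view from scored items."""
    scored = [r for r in responses if r.score is not None]
    pending = [r.item_id for r in responses if r.score is None]
    return {
        "total_score": sum(r.score for r in scored),
        "max_score": len(scored),
        "awaiting_teacher_score": pending,
    }

responses = [
    machine_score(Response("q1", "selected", answer="B", key="B")),
    Response("q2", "constructed", answer="The author argues that..."),
]
print(build_report(responses))  # q2 waits for the teacher's score
```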
All qualitative data, including classroom observation notes, interview transcripts,
teachers’ reflection forms, and open-ended survey responses, were coded by a pair of
researchers, using a coding scheme we developed. The coding categories attempted
to capture evidence about teachers’ communication of learning goals, interpretation of
learning evidence, feedback activities, lesson adjustment based on assessment evidence,
perceptions about the assessment tool, and technology use. After independent coding, the
two researchers met to compare their coding and reached agreement through discussion.
For student survey and student performance data, descriptive statistics were computed.
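As one way to see where such double-coding stands before the reconciliation discussion, the short sketch below computes simple percent agreement between two coders over the same excerpts. The category labels echo the coding scheme above, but the calculation is our hypothetical illustration, not a procedure reported in the study.

```python
# Hypothetical example: two researchers' independent codes for four excerpts,
# using categories similar to the study's coding scheme.
coder_a = ["learning_goals", "feedback", "lesson_adjustment", "technology_use"]
coder_b = ["learning_goals", "feedback", "interpretation", "technology_use"]

def percent_agreement(a: list[str], b: list[str]) -> float:
    """Share of excerpts on which the two coders assigned the same category."""
    assert len(a) == len(b), "coders must code the same excerpts"
    return sum(x == y for x, y in zip(a, b)) / len(a)

print(f"Agreement before discussion: {percent_agreement(coder_a, coder_b):.0%}")
# The disagreement (excerpt 3) would then be resolved through discussion.
```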
7. Results
7.1. The Quality of Assessment Tasks
Although the validity and effectiveness of formative assessment lies in the assessment
process, the quality of assessment tasks should not be neglected. In addition to the reliability
coefficient (Cronbach’s α = 0.856), we examined the extent to which three subconstructs’
items discriminated students’ reading subskills. Figure 4 exhibits students’ average percent
correct on the items of each subconstruct reading skill. An expected performance pattern
was observed in which the difficulty of items increased by subconstruct, in the order of
foundational, literal comprehension, and higher-order comprehension skills. Interestingly, a wide range of performance variation was also observed in the higher-order reading skills.
Figure 4. The average percent correct of students' performance on each subconstruct reading skill.
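For readers who wish to compute the same two statistics on their own data, the sketch below derives Cronbach's alpha and the average percent correct per subconstruct from a students-by-items score matrix. The scores and item groupings are invented stand-ins, since the article does not publish item-level data; with these invented scores the percent correct happens to decline across subconstructs, mirroring the pattern reported in Figure 4.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_students, n_items) matrix of item scores."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical 0/1 item scores: 6 students x 6 items, two items per subconstruct.
scores = np.array([
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
], dtype=float)
subconstruct_items = {
    "foundational": [0, 1],
    "literal comprehension": [2, 3],
    "higher-order comprehension": [4, 5],
}

print(f"alpha = {cronbach_alpha(scores):.3f}")
for name, cols in subconstruct_items.items():
    pct = scores[:, cols].mean() * 100  # average percent correct
    print(f"{name}: {pct:.1f}% correct")
```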
While the student score data provided some evidence about the quality of the reading tasks to distinguish student levels in terms of three subskill levels, teachers' comments also testified to the quality of the assessment tasks. All three teachers remarked that the reading tasks were well-designed and covered important reading skills necessary for comprehending argumentative texts. For example, Sue commented, "I think every one of those skills is important. Everything that you had in there [in the tool] is things that we talk about, are things that the kids need". More importantly, teachers provided specific examples regarding their interpretations of students' reading skills based on individual tasks, indicating the quality of the given reading tasks for identifying students' current status related to the target construct. This point is further described in the next section as part of teachers' implementation skills.

7.2. Teachers' Implementation of Formative Assessment

In this section, we summarize how three teachers (Sue, Paula, and Kris) utilized the tool for formative assessment purposes, based on our coding of the qualitative data. We organized the summaries with respect to three main elements of our formative assessment framework.
Figure 5. Clarity and usefulness of learning goals and initial self-assessment for understanding learning expectations for students.
In terms of the immediate feedback that was provided to the students while they completed the activities in Part 1, half of them indicated in the survey that the feedback was clear and helpful in completing the activities (see Figure 6). However, there were a few students who indicated that the feedback was not clear (5.3%) or helpful (10.5%).
Figure 6. Clarity and helpfulness of immediate feedback for completing activities for students.

Students also indicated that the information in the performance results section (student responses, correct responses, item scores, skills scores, total scores, feedback) was useful in helping them understand what reading skills they had developed well (36.8%), what reading skills they still needed to work on (31.6%), and what plans to set for improving specific reading skills (36.8%). The survey findings also suggest that many students still needed support in interpreting and using their assessment results (see Figure 7).
7.4. Teachers' Perceptions about the Usefulness of the Tool and the Quality of Assessment Tasks

In this section, we summarize the three teachers' comments on the usefulness of the assessment tool, particularly regarding assessment tasks, specific features they found useful, and potential uses of the tool.
In general, teachers provided positive feedback on this prototype tool largely for three reasons. The first reason was that the assessment tasks encompassed key skills that students would need for reading argumentative texts. This feature appeared to help teachers plan their lessons addressing individual EL students' needs. The tasks were deliberately designed for EL students to engage in, from foundational to higher-order reading skills, with some scaffolding embedded. Sue summarized the overall feature of the tasks as follows, "I felt, as a group, they [students] had a good understanding of what was being argued in each text. I believe this tool almost forces the students to think about what they read by asking various questions that require them to go back to the
text to look for details and examples. By the time they finished these activities [reading
tasks], each student had read each paragraph more than once, so I’m sure that helped with
their comprehension skills”. Paula also explained how this tool allowed her to assess her
students' reading skills, " . . . It tells you like what they understood and what they did not
understand. What I probably need to go back and reteach. What I need to focus on, what
they had mastered also, so what I don’t need to spend time on. So, it does help gauge
instruction”. In addition, the teachers commented that the content of the activities in the
tool was generally appropriate, even though there were students at different language
proficiency levels in their classrooms.
Secondly, teachers valued EL-specific design features. In particular, they pointed out
the immediate feedback feature (e.g., “try again”), the collaborative tasks, and the special
supports and tasks for EL students (e.g., Buzzy the Bee reading aloud the directions to
the students, modeling what students need to do to complete an activity, highlighting
the words students did not know in the text, tasks attending to word formation and
sentence structures). Regarding collaborative tasks, Sue explained that her students were
accustomed to working in pairs and that the tasks in this assessment tool allowed for peer
collaboration. She commented that in classes with mixed English language proficiency
levels, ESL teachers were constantly strategizing grouping; for example, she typically paired
a lower-English proficiency level student with a higher-English proficiency level student so
they could support each other in their home language. She mentioned that low-level students
would benefit the most since they could ask their partners questions whenever they had
difficulties or were unsure of what they needed to do. She commented that her students
were more willing to try to complete the tasks in small group environments than in a whole
class setting. Kris also valued the pair work by stating, “I liked the pair work because it
kept them [the students] on task. You know they had to have discussions in order to answer
it. And it kind of allowed them to monitor each other instead of just clicking through it. It
held them accountable”.
The third reason was attributed to technology features. Teachers asserted that their stu-
dents were highly engaged in the tool due to its presentation (e.g., visuals, read-aloud/text-
to-speech, different task formats) and students’ interactions with the tool (e.g., clicking,
highlighting, navigating across different sections, receiving immediate feedback, and view-
ing score reports at the end of the lesson). Sue said, “ . . . The activities are fun and engaging.
My students were into it [the assessment tool]. They did not feel they were completing
a test”.
Interestingly, observation notes indicated that teachers’ technology skills were varied
and had an impact on the ways that teachers adapted the lessons in real-time. For example,
Paula, who was skillful at navigating the tool and the computer projector, was able to switch between whole-group discussions and students' pair work more seamlessly than the other teachers
during each lesson. On the other hand, Sue (who acknowledged her limited computer
skills) had difficulties in modeling examples based on the given tasks to the class using the
program and computer projector, even though she noticed that many students struggled
with a similar task. Teachers also pointed out potential technical issues such as the limited
laptop availability and unstable internet connectivity in school.
In response to the questions about the potential usage of the tool, teachers indicated
that this tool could be used in multiple ways in addition to formative assessment purposes.
Their responses included the use of the tool as a main instructional material (i.e., curriculum
unit), a mini-lesson material, a supplemental material for small-group work, and an end-of-
unit test.
8. Discussion
In this study, we developed a technology-based assessment tool to help ESL teachers
carry out formative assessment practices in an efficient and systematic manner. Through a
usability study of the prototype tool, we examined the ways in which the tool was used
by teachers and students for the intended formative assessment use. The results of the
usability study provided preliminary evidence to validate the tool for its intended purposes.
Although the design of the tool contained four elements to reinforce the formative
assessment process, teachers’ implementation skills were inconsistent. In general, the
teachers in this study exhibited proficiency at analyzing and interpreting students’ per-
formance and assessment results, as exemplified in the results section. Considering the
prominent use of standardized assessments and accountability data reporting for the past
two decades in U.S. K-12 education, the teachers were familiar with analyzing the student
data. However, some limited implementation skills were noted in communicating learning
goals to students. The lack of communication about the learning goals seemingly narrowed
teachers’ immediate and near-term lesson planning only to the types of activities and
tasks the students would complete. Teachers’ reflection on students’ performance centered
on specific tasks and their associated reading skills (e.g., vocabulary, inferencing). That
is, teachers’ interpretation of learning evidence focused primarily on the current status,
rather than the gap between the current status and the overall learning goal. This tendency
led teachers to mainly articulate the types of tasks to work on as next steps, neglecting
plans based on individual EL students’ performance relative to the learning goal. This
finding suggests that teachers would benefit from further professional development on the
concept of formative assessment. In distinguishing formative assessment from summative
assessment, Heritage (2013) points out that teachers need to develop “a present-to-future
perspective, in which the concern is not solely with the actual level of performance but with
anticipating future possibilities” for realizing the intent of formative assessment (Heritage
2013, p. 180).
In addition, we observed limited teacher feedback. As presented in the results section,
one teacher was more skillful at providing real-time feedback and making immediate lesson
adaptation than the other two teachers. Overall, there were few classroom conversations
where the teacher asked further probing questions contingent upon individual students’
responses on the given task in real time. The dialogues were primarily limited to completing
the tasks and classroom management. The literature on formative assessment stresses a
variety of methods of formative assessment including teacher questioning. Ruiz-Primo and
Furtak (2006) called them assessment conversations and provided empirical evidence of
the positive relationship between the quality of assessment conversations as a formative
assessment method and student learning outcomes. While we intended our tool for teachers
to practice formative assessment processes, teachers appeared to use it more as a summative
assessment tool, focusing on students' completion of the tasks and reviewing the complete assessment results
at once. Teachers’ comments about using the tool as a unit-test or supplementary material
to assign to small group work align with the finding about teachers’ limited assessment
conversations in this study. This finding about teachers' misconception of summative assessment as formative assessment is also found in previous research
(e.g., De Lisle 2015; Heritage et al. 2009). In De Lisle’s (2015) study, for example, teachers
were skillful at recording and storing data but neglected the use of data, which was the
essential aspect of formative assessment. As Saito and Inoi (2017) note based on their
study, teachers’ clear understanding about the purposes of formative assessment is crucial
for the effective implementation of formative assessment. For further development of an
assessment tool, it may be worth including some examples of probing questions and a
vignette to describe the use of the tool for formative purposes. This addition may help raise
teachers’ metacognitive awareness about the intended use of formative assessment.
We deliberately designed the tool to involve students in the formative assessment
process (e.g., initial self-assessment, post-activity self-assessment upon reviewing the
performance results, feedback during and after the activities). It was interesting to find that
students found the learning goals useful to know, even though the clarity of the learning
goals remains an area for improvement. Students’ English language proficiency levels
and data interpretation skills seemed to hamper them from understanding the feedback
in the performance result section and planning next steps. While the tool has room for improvement, the finding was promising in that students began to take an agentive role in understanding learning goals, reflecting on their performance, and using feedback.
Although the study is limited by its small number of teachers and short period of lesson observation, the findings offer useful implications for practice and further
research. We acknowledge that teachers’ limited use of the tool for formative assessment
purposes might have to do with their lack of understanding of the intent of the tool. Thus,
the findings should be interpreted with caution. Yet, our three teachers’ varied and limited
implementation skills to use the tool for formative assessment corroborate previous re-
search findings about the importance of providing professional support to increase teachers’
language assessment literacy skills, particularly for the practice of formative assessment
(Fulcher 2012; Heritage 2007; Leung and Scott 2009; Tsagari 2016). In current U.S. K-12
education settings, a heavy reliance on large-scale standardized assessments for account-
ability appears to result in a focus of professional development on summative assessment.
While formative assessment has gained traction to balance the assessment system, profes-
sional support to increase teachers’ capacity to realize the benefit of formative assessment
warrants an increased emphasis.
It is also important to recognize teachers’ heavy workload and to have realistic ex-
pectations of teachers. Teachers constantly juggle tasks with limited time and resources
while ensuring coverage of the curriculum for all students. For ESL teachers who have to
deal with both English language proficiency and academic content standards to support EL
students’ academic success in mainstream classes, we argue that it is critical to provide a
sound tool to alleviate teachers’ burden to devise methods to implement good formative
assessment practice in a sustained way. The tool developed in this study, grounded in a theoretical framework, provides one example. Although the tool's use for formative assessment was limited in this study, the findings revealed sources of misuse and
areas of further investigation. We call for further research on the development of a tool for
formative assessment and usability studies to inform best practices of utilizing such a tool
for effective formative assessment. Future research should also investigate student learning
outcomes as well as agents’ (teachers and students) behavioral changes that contribute to
effective formative assessment practice.
Author Contributions: Conceptualization, M.K.W.; Formal analysis, M.K.W. and A.A.L.; Investi-
gation, M.K.W. and A.A.L.; Supervision, M.K.W.; Writing—original draft, M.K.W. and A.A.L. All
authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: The study was conducted according to the guidelines of the
Committee for Prior Review of Research (CPRR) and approved by the CPRR. The CPRR is Educational Testing Service's Institutional Review Board.
Informed Consent Statement: Informed consent was obtained from all teachers involved in the
study. For students involved in the study, their parent/guardian consent was obtained accompanied
by student assent.
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Acknowledgments: The authors would like to thank Michael Ecker, Chris Hamill, Keith Kiser, Maria Konkel, Nathan Lederer, Jeremy Lee, Janet Stumper, and Jennifer Wain for their helpful research assistance in developing the tool and collecting the data for this study.
Appendix A

Appendix A.1. Part 1 Sample Tasks
Figure A1. Sample task 1 for the subconstruct of foundational skills (Task type: Working with words).
Figure A3. Sample task 3 for the subconstruct of literal comprehension skills (Task type: Distinguishing facts from opinions).
Figure A5. Sample task 5 for the subconstruct of higher-order comprehension skills (Task type: Working with argument structure).
Figure A6. Sample task 6 for the subconstruct of higher-order comprehension skills (Task type: Making connections).
Figure A7. Sample item for the subconstruct of literal comprehension skills (Task type: Understanding main ideas).
References

Alderson, J. Charles. 2000. Assessing Reading. Cambridge: Cambridge University Press.
Bailey, Alison L., and Mikyung Kim Wolf. 2020. The construct of English language proficiency in consideration of college and career readiness standards. In Assessing English Language Proficiency in U.S. K-12 Schools. Edited by Mikyung Kim Wolf. New York: Routledge, pp. 36–54. [CrossRef]
Baxter, Pamela, and Susan Jack. 2008. Qualitative case study methodology: Study design and implementation for novice researchers. The Qualitative Report 13: 544–56. [CrossRef]
Bennett, Randy E. 2011. Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice 18: 5–25. [CrossRef]
Bernhardt, Elizabeth B. 2011. Understanding Advanced Second-Language Reading. New York: Routledge.
Black, Paul, and Dylan Wiliam. 1998. Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice 5: 7–74. [CrossRef]
Black, Paul, and Dylan Wiliam. 2010. Inside the black box: Raising standards through classroom assessment. Kappan 92: 81–90.
[CrossRef]
Bunch, George C. 2013. Pedagogical language knowledge: Preparing mainstream teachers for English learners in the New Standards
Era. Review of Research in Education 37: 298–341. [CrossRef]
Callahan, Rebecca M. 2013. The English Learner Dropout Dilemma: Multiple Risks and Multiple Resources. California Dropout Research
Project Report #19. Santa Barbara: University of California.
Common Core State Standards Initiative. 2010. Common Core State Standards for English Language Arts & Literacy in History/Social Studies,
Science, and Technical Subjects. Available online: http://www.corestandards.org/wp-content/uploads/ELA_Standards1.pdf
(accessed on 2 May 2021).
Council of Chief State School Officers. 2014. English Language Proficiency (ELP) Standards with Correspondences to K-12 English Language
Arts (ELA), Mathematics, and Science Practices, K-12 ELA Standards, and 6–12 Literacy Standards. Washington, DC: Council of Chief
State School Officers.
De Lisle, Jerome. 2015. The promise and reality of formative assessment practice in a continuous assessment scheme: The case of
Trinidad and Tobago. Assessment in Education: Principles, Policy & Practice 22: 79–103. [CrossRef]
Every Student Succeeds Act. 2015. 20 U.S.C. § 6301. Available online: https://www.congress.gov/114/plaws/publ95/PLAW-114publ95.pdf (accessed on 3 September 2021).
Fulcher, Glenn. 2012. Assessment literacy for the language classroom. Language Assessment Quarterly 9: 113–32. [CrossRef]
Gan, Zhengdong, and Constant Leung. 2019. Illustrating formative assessment in task-based language teaching. ELT Journal 74: 10–19.
[CrossRef]
Grabe, William. 2009. Reading a Second Language: Moving from Theory to Practice. Cambridge: Cambridge University Press.
Hamill, Christopher, Mikyung Kim Wolf, Yuan Wang, and Heidi Liu Banerjee. 2019. A Review of Digital Products for Formative Assessment
Uses: Considering the English learner Perspective. Research Memorandum No. RM-19-04. Princeton: Educational Testing Service.
Heritage, Margaret, Jinok Kim, Terry Vendlinski, and Joan Herman. 2009. From evidence to action: A seamless process in formative
assessment? Educational Measurement: Issues and Practices 28: 24–31. [CrossRef]
Heritage, Margaret. 2007. Formative assessment: What do teachers need to know and do? Phi Delta Kappan 89: 140–45. [CrossRef]
Heritage, Margaret. 2010. Formative Assessment and Next-Generation Assessment Systems: Are We Losing an Opportunity? Washington, DC:
Council of Chief State School Officers.
Heritage, Margaret. 2013. Gathering evidence of student understanding. In SAGE Handbook of Research on Classroom Assessment. Edited
by James H. McMillan. Thousand Oaks: SAGE Publications, Inc., pp. 179–95.
Koda, Keiko. 2004. Insights into Second Language Reading: A Cross-Linguistic Approach. Cambridge: Cambridge University Press.
Leung, Constant, and Bernard Mohan. 2004. Teacher formative assessment and talk in classroom contexts: Assessment as discourse
and assessment of discourse. Language Testing 21: 335–59. [CrossRef]
Leung, Constant, and Catriona Scott. 2009. Formative assessment in language education policies: Emerging lessons from Wales and
Scotland. Annual Review of Applied Linguistics 29: 64–79. [CrossRef]
Leung, Constant. 2004. Developing formative teacher assessment: Knowledge, practice, and change. Language Assessment Quarterly 1:
19–41. [CrossRef]
O’Reilly, Tenaha, and Kathleen M. Sheehan. 2009. Cognitively Based Assessment of, for, and as Learning: A 21st Century Approach for
Assessing Reading Competency. Research Memorandum No. RM-09-04. Princeton: Educational Testing Service.
Popham, W. James. 2008. Transformative Assessment. Alexandria: Association for Supervision and Curriculum Development (ASCD).
Rea-Dickins, Pauline. 2001. Mirror, mirror on the wall: Identifying processes of classroom assessment. Language Testing 18: 429–62.
[CrossRef]
Ruiz-Primo, Maria Araceli, and Erin Marie Furtak. 2006. Informal formative assessment and scientific inquiry: Exploring teachers’
practices and student learning. Educational Assessment 11: 237–63. [CrossRef]
Ruiz-Primo, Maria Araceli, and Min Li. 2013. Analyzing teachers’ feedback practices in response to students’ work in science
classrooms. Applied Measurement in Education 26: 163–75. [CrossRef]
Ruiz-Primo, Maria Araceli, Guillermo Solano-Flores, and Min Li. 2014. Formative assessment as a process of interaction through
language: A framework for the inclusion of English language learners. In Designing Assessment for Quality Learning, The Enabling
Power of Assessment. Edited by Claire Wyatt-Smith, Valentina Klenowski and Peta Colbert. Heidelberg: Springer, vol. 1, pp. 265–82.
Sabatini, John, Elizabeth Albro, and Tenaha O’Reilly. 2012. Measuring Up: Advances in How We Assess Reading Ability. Lanham: R&L
Education.
Saito, Hidetoshi, and Shin’ichi Inoi. 2017. Junior and senior high school EFL teachers’ use of formative assessment: A mixed-methods
study. Language Assessment Quarterly 14: 213–33. [CrossRef]
Schildkamp, Kim, Fabienne M. van der Kleij, Maaike C. Heitink, Wilma B. Kippers, and Bernard P. Veldkamp. 2020. Formative
assessment: A systematic review of critical teacher prerequisites for classroom practice. International Journal of Educational Research
103: 101602. [CrossRef]
Shepard, Lorrie A. 2009. Commentary: Evaluating the validity of formative and interim assessment. Educational Measurement: Issues
and Practice 28: 32–37. [CrossRef]
Sugarman, Julie, and Kevin Lee. 2017. Facts about English Learners and the NCLA/ESSA Transition in California. Washington, DC:
Migration Policy Institute.
Tsagari, Dina, and Karin Vogt. 2017. Assessment literacy of foreign language teachers around Europe: Research, challenges and future
prospects. Papers in Language Testing and Assessment 6: 41–63.
Tsagari, Dina. 2016. Assessment orientations of primary state school EFL teachers in two Mediterranean countries. Center for Educational
Policy Studies Journal 6: 9–30. [CrossRef]
U.S. Department of Education, Office of English Language Acquisition [OELA]. 2021. English Learner Population by Local Education Agency Fact Sheet; Washington, DC: OELA. Available online: https://ncela.ed.gov/files/fast_facts/20210315-FactSheet-ELPopulationbyLEA-508.pdf (accessed on 3 August 2021).