2 Theory Visualizations for Bilingual Models of Lexical Ambiguity Resolution

Ben Falandays and Michael J. Spivey

Introduction
We often think of bilingualism as “adding” a new and different language
system to one’s existing language system. There might be a tiny bit of
truth to this metaphor when learning a second language (L2) late in life.
However, it surely must be a poor metaphor for bilinguals who learn their
two languages relatively early in life. Rather than modeling bilingualism as
involving the “adding” of a second processor for the L2, perhaps bilingu-
alism itself can be treated as just another set of dimensions in the massive
state space in which a speaker’s linguistic representations are organized,
along with dimensions of situational context, text genre, grammatical
gender, linguistic register, syntax, phonology, semantics, and so on
(e.g., Onnis & Spivey, 2012). When these various aspects of language
are treated not as submodules within the language module but instead as
dimensions in a single state space, then suddenly new insights can be
gained in understanding how language is processed in general and how
bilinguals process lexical ambiguity in particular. In fact, the very concept
of a lexical representation changes dramatically when one switches from
a computer (or dictionary) metaphor of the lexicon to a dynamical system
account of word knowledge (Elman, 2004).
In this chapter, we review some connectionist models of bilingualism
and discuss how they might deal with lexical ambiguity; but, first, we
examine what lexical ambiguity itself “looks like” in the state space of
a language processing system. By treating the representational parameters
of a model as dimensions in a state space, the range of behaviors (and
regions visited) in that volumetric space can be identified more system-
atically. In a state space that combines a variety of linguistic aspects,
a word representation can be seen as extending not only across
a semantic field (e.g., Lehrer, 1974) but indeed across a lexical field that
combines the semantics, phonology, and situational context of how the
word is typically used (e.g., Elman, 2009; see also Lyons, 1963). By
studying the real-time temporal dynamics of lexical ambiguity resolution
in bilinguals (e.g., Altarriba & Gianico, 2003), it may be possible to better
see inside the structure of this state space in which language is
represented.

Downloaded from https://www.cambridge.org/core. University of Toronto, on 02 Jan 2020 at 12:05:12, subject to the Cambridge Core terms of
use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/9781316535967.003

Traditional approaches to understanding lexical ambiguity resolution
relied heavily on the computer metaphor of the mind, positing a modular
processor for lexical access followed by a subsequent processor for context
effects (Swinney, 1979; Tanenhaus, Leiman, & Seidenberg, 1979).
Experiencing the uncertainty of reading or hearing a word like bug –
which could mean insect or spy device – was likened to activating two
separate dictionary entries, one of which would soon have to be deacti-
vated by the context processor for comprehension to be successful.
Rather than relying on this box-and-arrow computer metaphor, one can
instead treat the word bug as having one simple circumscribed region in
the phonological dimensions (since it is a homophone) but projecting
onto two disparate regions in the semantic dimensions of the massive
state space of language (since it has two rather different meanings). When
those phonological and semantic dimensions are combined to form one
phono-semantic state space, the region dedicated to the word bug is seen
as a single bounded, but very nonconvex, shape. In fact, when only certain
dimensions are shown and compressed just right, the lexical field for bug
would look roughly shaped like a letter V, as in Figure 2.1(a). It is exactly
this nonconvexity of the shape that allows us an insight into what lexical
representations might “look like” in bilinguals. In this dynamical system
account, when hearing or reading the word bug, the human mind visits
portions of this bounded shape, and the other contextual dimensions
(some semantic, discourse, and situational dimensions not depicted
here) help push the state of the system toward one or the other arm of
that V-shape, to gradually achieve a contextually appropriate understand-
ing of the word.
However, not all ambiguous words have meanings that are unrelated to
one another, like bug. Take for example the verb dusted, in Sentence (2.1)
below.
(2.1) The chef dusted the cake with powdered sugar, but then the maid
dusted it clean.
The verb dusted is typically referred to as polysemous, rather than ambig-
uous, because its different meanings/usages are at least somewhat seman-
tically related to one another (Gibbs & Matlock, 2001). Yet instead of
treating ambiguous words and polysemous words as if they were cate-
gorically different phenomena, a dynamical-systems state space descrip-
tion allows one to visualize the graded similarity between the two
phenomena. Figure 2.1(b) shows how the lexical field for dusted would
be somewhat similar to that for bug but notably different in that the
semantic regions used for its different meanings are spatially contiguous
with one another, allowing for blends across that semantic spectrum.

[Figure 2.1: three panels, each plotting phonological dimension(s) against
semantic dimension(s). Panel (a): “bug”, with regions labeled (device) and
(insect). Panel (b): “dusted”, with regions labeled (cleaning) and (baking).
Panel (c): “stup...”, with regions labeled (dumb) and (amazing).]

Figure 2.1 Theory visualizations of lexical fields in linguistic state space:
(a) lexical ambiguity involves a highly nonconvex shape that covers
unrelated regions of semantic space; (b) polysemy involves a relatively
more convex shape that includes interstitial regions of semantic space;
and (c) temporary phonological ambiguity, as with cohorts, often
involves a highly nonconvex shape again, one that heavily depends on
temporal dynamics.

In addition to ambiguous words and polysemous words, another form
of lexical ambiguity arises temporarily during the first couple of hundred
milliseconds of hearing a spoken word. For example, halfway through
hearing the word candle, a listener will briefly exhibit partial activation of
a similar-sounding cohort word like candy (e.g., Marslen-Wilson, 1987;
McClelland & Elman, 1986) and will even look at a picture of a candy
before finally looking at the target object, a candle (Allopenna,
Magnuson, & Tanenhaus, 1998; Spivey-Knowlton, 1996). Similarly,
if one reads Sentence (2.2) out loud, a listener may find that the context
leading up to the first syllable in the final word steers one somewhat in the
direction of expecting it to turn out to be the word stupid instead of
stupendous; and Figure 2.1(c) provides a rough sketch of what that tem-
porally dynamic lexical field might look like in linguistic state space.
(2.2) In his ridiculous costumes, Sacha Baron Cohen looks just totally
stupendous.
We will revisit those temporally dynamic lexical fields later in our dis-
cussion, after we have reviewed some of the literature on how connec-
tionist models of bilingualism might address lexical ambiguity and the
literature on how bilinguals actually process spoken words. For now, we
return to temporally static treatments of linguistic state space.
When one considers the wide range of idiosyncratic linguistic experi-
ences that each language user undergoes, it seems clear that the topology
of any one person’s linguistic state space will be at least subtly different
from everyone else’s. Individual differences account for a substantial
amount of the variance in language learning and processing in both
monolinguals and bilinguals (e.g., Dörnyei, 2005; Grosjean, 1994). For
example, lexical ambiguity resolution has been shown to function rather
differently for people with high vs. low memory spans (Miyake, Just, &
Carpenter, 1994). (However, some of the variance attributed to memory
span might instead be explained by degree of language experience;
MacDonald & Christiansen, 2002.) People with high memory spans are
able to understand the correct meaning of boxer in Sentence (2.3) more
readily than people with low memory spans.
(2.3) Since Ken really liked the boxer, he took a bus to the pet store to
buy the animal.
Someone with a low memory span (or limited language experience with
English) might have a lexical field for the word boxer that, functionally
speaking, spans across only a narrow relatively convex range of semantic
space (Figure 2.2(a)). Thus, when reading the word boxer, that person
might automatically settle into the pugilist meaning of the word and then
encounter some difficulty understanding the rest of Sentence (2.3). By
contrast, someone with a high memory span (or extensive language
experience with English) might have a lexical field for boxer that stretches
out into a variety of regions of linguistic state space (Figure 2.2(b)).
Therefore, when reading the word boxer, that person might not settle
too deeply into any one tendril of that lexical field; and when the rest of
the sentence finally provides the disambiguating context, they are ready
and able to transition into the contextually appropriate region of semantic
space.

[Figure 2.2: two panels, (a) and (b), each showing the lexical field for
“boxer” plotted over semantic dimensions and more semantic dimensions,
with regions labeled packaging factory, pugilism, and dog breeds.]

Figure 2.2 Individual differences in lexical fields: (a) a person with low
memory span or limited English experience would have a functionally
narrow lexical field for the word boxer, whereas (b) a person with high
memory span or extensive English experience would have a more
tentacular lexical field for boxer, with tendrils that stretch into a variety
of semantic spaces.

Similar to memory span and language experience, contextual diversity will
also introduce substantial individual differences in the topology of this
linguistic state space. Contextual diversity measures the frequency with
which a word occurs in significantly different contexts. Take, for example,
the words posterior and piglet. They both have the same overall lexical
frequency: 240 occurrences each in a 560-million-word corpus.
Therefore, traditional approaches in psycholinguistics would predict that
these two words should exhibit equal latency in reading and reaction time
tasks (e.g., Forster & Chambers, 1973). However, about 60 percent of the
occurrences of posterior take place in academic texts, while only 10–15 percent
of its occurrences are in fiction, magazine, and newspaper contexts
each – and it almost never shows up in spoken contexts. Therefore, contextually
speaking, posterior is a relatively nondiverse word. In our linguistic
state space framework, this would mean that its lexical field is relatively
simple and convex (Figure 2.3(a)). As a result, if the language system started
out in a random or neutral location in state space, and was forced to
traverse its way to the region for posterior, it might have a long distance to
travel, thus producing a somewhat long response time. By contrast, the
word piglet has a much more evenly distributed pattern of occurrences
across these different contexts. Only 40 percent of its occurrences take
place in fiction, about 20 percent each in magazines and newspapers, and
10 percent each in academic and spoken contexts. Therefore, if the language
system started out in a random or neutral location in state space, and
was forced to travel to the piglet region, it would likely have a relatively short
distance to travel, and thus produce a short response time – even though it
has the same overall lexical frequency as posterior.

[Figure 2.3: two panels over contexts/genres dimensions, with regions
labeled Academic, Fiction, Newspapers, and Magazines. Panel (a):
“posterior”. Panel (b): “piglet”.]

Figure 2.3 Contextual diversity of lexical fields: (a) one region of
linguistic-genre space in which the lexical field for posterior shows itself
to be nondiverse and rather convex; (b) another region of the same space
in which the lexical field for piglet stretches itself nonconvexly into
diverse contexts.

That data pattern is exactly what Adelman, Brown, and Quesada
(2006) found when they reanalyzed the data from six word identification
experiments. Contextual diversity predicted fast and slow response times
more robustly than did lexical frequency. Results like this have been
replicated and extended to word learning (Hills et al., 2010; Johns,
Dye, & Jones, 2016) and to eye movement measures of whole sentence
reading (Chen et al., 2017; Plummer, Perea, & Rayner, 2014). Evidently,
after decades of assuming that word frequency was a bedrock foundation
for psycholinguistics, it appears that the language system does not actu-
ally care how many times a lexical representation has been instantiated; it
cares how far in state space it has to travel right now in order to reach that
lexical representation.
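
As a rough illustration of the contrast between posterior and piglet, the following sketch scores how evenly each word's occurrences are spread across genres, using Shannon entropy. Entropy is only one convenient evenness index, adopted here as an assumption; it is not the document-count measure that Adelman et al. (2006) analyzed. The piglet shares follow the percentages given in the text, whereas the exact posterior split (13 percent each for fiction, magazines, and newspapers, and 1 percent spoken) is a hypothetical filling-in of the reported "10–15 percent" and "almost never" figures.

```python
import math

# Share of occurrences per genre. PIGLET follows the percentages in the text;
# POSTERIOR uses assumed values consistent with "about 60 percent academic,
# 10-15 percent each elsewhere, almost never spoken".
POSTERIOR = {"academic": 0.60, "fiction": 0.13, "magazine": 0.13,
             "newspaper": 0.13, "spoken": 0.01}
PIGLET = {"academic": 0.10, "fiction": 0.40, "magazine": 0.20,
          "newspaper": 0.20, "spoken": 0.10}


def genre_entropy(shares):
    """Shannon entropy (in bits) of a genre distribution; higher = more even."""
    return -sum(p * math.log2(p) for p in shares.values() if p > 0)


if __name__ == "__main__":
    for word, shares in (("posterior", POSTERIOR), ("piglet", PIGLET)):
        print(f"{word}: {genre_entropy(shares):.2f} bits")
```

Under these assumed shares, piglet receives the higher evenness score even though both words have the same overall frequency, which is the qualitative pattern the state space account relies on.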
Given these complex transformations of state space that are generated
by individual differences in working memory, or language experience, or
contextual diversity, just imagine the transformations that must take
place as a result of being bilingual. Rather than assuming that bilinguals
process lexical ambiguity in some categorically different way than mono-
linguals do, perhaps this graded range of idiosyncratic state space topol-
ogies (in Figures 2.1–2.3) allows one to consider a bilingual’s linguistic
state space as just another variety of these kinds of individual differences –
but an especially interesting one, to be sure. In this framework, almost
every word that a bilingual hears will have a few extra tendrils in its lexical
field, compared to a monolingual, that provide potential branchings-off
into different regions of linguistic state space. Now that we are equipped
with theory visualizations for the kinds of shapes that lexical ambiguity
can take in the state space of the language system, we turn to discussing
connectionist models of lexical ambiguity resolution and of bilingualism.

Parallel Distributed Processing Models of Word Recognition
The bilingual interactive activation (BIA) and BIA+ models (Dijkstra & van
Heuven, 2002; van Heuven, Dijkstra, & Grainger, 1998) are extensions
of an earlier parallel distributed processing (PDP) model – the interactive
activation model (IAM) (McClelland & Rumelhart, 1981). Therefore, to
better understand how the BIA models function, it is worthwhile to first
discuss the IAM and related PDP models more generally. To that end,
this section describes the structure and mechanisms of PDP models in the
simplest case of unambiguous word recognition by monolinguals. Then,
the following section describes how PDP models account for lexical
ambiguity resolution. We then go on to describe how the BIA models
account for bilingual-specific phenomena involving homographs, homo-
nyms, cognates, and interlingual cohorts.
The IAM (McClelland & Rumelhart, 1981; Rumelhart & McClelland,
1982) is a multilevel connectionist architecture originally designed to
simulate the word superiority effect, a classic perceptual phenomenon
whereby identification of visually presented letters is faster when the
letters are inside words rather than nonwords (McClelland & Johnston,
1977). The logic underlying the IAM is that recognition of letters begins
first with recognition of basic visual features, followed by activation of
letters containing those features, which in turn activates words containing
those letters. The IAM simulates the word superiority effect by allowing
feedback connections from words to letters, such that recognition of
letters is facilitated when words become active.
Structurally, the IAM includes three layers of interconnected nodes:
a feature layer, a letter layer, and a word layer (see Figure 2.4a). The solid
lines with arrows indicate excitatory connections, while the dashed lines with circles
represent inhibitory connections. Current activation of each node is
represented by the thickness of the border around the node. The feature
layer contains nodes that become active in the presence of specific, simple
visual features, analogous to orientation selective cells in the visual sys-
tem. The letter and word layers contain nodes corresponding to all of the
known letters and words, respectively. In addition, letter position within
a string may be encoded as well, such that there is a node for each letter at
each possible position. Individual feature detectors have excitatory con-
nections with every letter in which they are found. Individual letters, in
turn, have excitatory connections with every word in which they are
found. Meanwhile, each word has inhibitory connections with all other
words; and, crucially, there are both excitatory and inhibitory feedback
connections from the word layer to the letter layer, such that word nodes
send excitation to any letter they contain and inhibition to any they do
not. It is these feedback connections that allow the model to reproduce
the word superiority effect.
On presentation of a visual stimulus, the features contained in the
string of letters first become active. The feature nodes then pass their
activation to nodes representing letters at their specified location in
the string. This creates a set of candidate letters that the system is
“considering,” consisting of all letters containing features present in
the input. The letter nodes then pass activation to words that are
consistent with those letters. Competition among words, via their
mutual inhibitory connections, results in the most highly consistent
word (or words, in the case that there is ambiguity) becoming more
active, while all other words are suppressed. The active word nodes
then pass activation to the letters they contain. In this way, letters that
are presented in the context of a word receive additional activation,
relative to letters in nonwords, which is how the model accounts for
the word superiority effect.
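
The processing cycle just described can be sketched in a few lines. The code below is a heavily simplified toy, not the published IAM: it omits the feature layer and the inhibitory word-to-letter feedback, uses a four-word lexicon, and every parameter value (bottom_up, decay, excite, inhibit, feedback) is a hypothetical choice made only so that the dynamics are easy to follow. It keeps the three ingredients the text emphasizes: letter-to-word excitation and inhibition, competition among words, and excitatory feedback from words to the letters they contain.

```python
# Toy lexicon of four-letter words (an assumption for illustration only).
WORDS = ["ROOM", "BOOM", "ROOK", "BOOK"]


def clip(x):
    """Keep an activation inside the range [0, 1]."""
    return max(0.0, min(1.0, x))


def run_iam(stimulus, steps=20, bottom_up=0.1, decay=0.2,
            excite=0.1, inhibit=0.3, feedback=0.1):
    """Return (letter_activations, word_activations) after `steps` cycles.

    All parameter values are hypothetical, not those of the published model.
    """
    letters = {(i, ch): 0.0 for i, ch in enumerate(stimulus)}
    words = {w: 0.0 for w in WORDS}
    for _ in range(steps):
        total = sum(words.values())
        new_words = {}
        for w in WORDS:
            # Letters excite words that contain them at that position and
            # inhibit words that do not.
            support = sum(a if w[i] == ch else -a
                          for (i, ch), a in letters.items())
            rivals = total - words[w]  # lateral inhibition from other words
            new_words[w] = clip(words[w] + excite * support - inhibit * rivals)
        new_letters = {}
        for (i, ch), a in letters.items():
            # Word-to-letter feedback: active words boost their own letters.
            top_down = sum(act for w, act in words.items() if w[i] == ch)
            new_letters[(i, ch)] = clip(
                a + bottom_up + feedback * top_down - decay * a)
        letters, words = new_letters, new_words
    return letters, words


if __name__ == "__main__":
    in_word, word_acts = run_iam("ROOM")
    in_nonword, _ = run_iam("RXQZ")
    print("R in ROOM:", round(in_word[(0, "R")], 3))
    print("R in RXQZ:", round(in_nonword[(0, "R")], 3))
    print("word activations for ROOM:", word_acts)
```

In this sketch the letter R ends up more active when it appears in ROOM than in the nonword string RXQZ, because only the word context produces top-down feedback. That is the qualitative signature of the word superiority effect, under the simplifying assumptions stated above.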
While the original IAM dealt primarily with visual word recognition,
the TRACE model (McClelland & Elman, 1986; and its reimplementa-
tion, jTRACE: Strauss, Harris, & Magnuson, 2007) extended the same
principles to model spoken word recognition. In TRACE, the letter layer
of the original IAM is replaced with a phoneme layer, and the feature layer
now consists of nodes responding gradiently to various acoustic dimen-
sions rather than visual features. Since speech unfolds over time, the input
to TRACE is a sequence of acoustic features, which stands in contrast to
the way that the visual IAM is presented with all visual information
simultaneously. As a result of this sequential presentation, even unambig-
uous speech input is temporarily ambiguous at the word level: Onsets are
consistent with many possible words and, as more of the input is received,
the pool of consistent words is narrowed until finally the offset leaves only
a single candidate.
In this way, TRACE captures the predictions of the Cohort model of
speech processing (Marslen-Wilson, 1987), which held that lexical
access occurs as a sequential search by method of elimination.
Importantly, lexical access in Cohort is all-or-none in that words that
are inconsistent with an onset are eliminated from consideration. As
a result, the Cohort model cannot recover in the case of degraded
information or mispronunciations. In contrast, TRACE is a continuous
mapping model, meaning that activation flows continuously between
layers, such that a given word unit can still receive activation, even if
some part of the input is inconsistent with it. As a result, TRACE
provides a better fit to the behavioral data, which shows, for example,
that listeners partially activate rhyme-cohorts that have a different
onset (e.g., making eye fixations to a speaker when the spoken input
is beaker; Allopenna et al., 1998). It is worth noting, however, that
TRACE does not provide a perfect fit to behavioral data: There is also
evidence that listeners partially activate anadromes – words with the
same phonemes in a different order (e.g., making eye fixations to a sub
when the input is bus; Toscano, Anderson, & McMurray, 2013).
TRACE encodes information about temporal ordering by including
copies of each phoneme node corresponding to each of the possible
positions in an input stream (which can be similarly implemented with
letter position in the IAM), but the aforementioned results suggest
this might not be perfectly representative of human speech processing.
Still, TRACE has stood the test of time as one of the best general
models of speech processing and is able to capture a wide range of
phenomena related to lexical ambiguity resolution, as we will discuss
in more detail in the next section.
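
The difference between all-or-none elimination and continuous mapping can be illustrated with a small sketch. This is not an implementation of either Cohort or TRACE: spelling stands in for phoneme strings, the lexicon is invented, and the graded score comes from Python's generic difflib.SequenceMatcher, a string-similarity measure used here purely as a stand-in for graded activation.

```python
from difflib import SequenceMatcher

# Toy lexicon; spelling stands in for phoneme strings (an assumption).
LEXICON = ["beaker", "beetle", "speaker", "carriage"]


def cohort_candidates(heard_so_far, lexicon=LEXICON):
    """All-or-none filter: keep only words that match the input from onset."""
    return [w for w in lexicon if w.startswith(heard_so_far)]


def graded_support(spoken, lexicon=LEXICON):
    """Graded match: every word gets a similarity score between 0 and 1."""
    return {w: SequenceMatcher(None, spoken, w).ratio() for w in lexicon}


if __name__ == "__main__":
    print(cohort_candidates("b"))      # the rhyme "speaker" is already gone
    for word, score in graded_support("beaker").items():
        print(f"{word}: {score:.2f}")
```

With the onset filter, speaker is excluded as soon as the first segment is heard, whereas the graded score still gives it substantial support for the input beaker, which parallels the rhyme effect reported by Allopenna et al. (1998).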
These PDP models provide the foundation for the BIA models. The
earliest form of the BIA model (van Heuven et al., 1998) extended the
orthographic-only IAM (McClelland & Rumelhart, 1981) by the addition
of two lexicons and of language nodes, one for each language. The later BIA+
model (Dijkstra & van Heuven, 2002) made the conceptual addition of
phonological encoding, similar to that of TRACE (McClelland & Elman,
1986). The continuous mapping property of these interactive activation
models means that even unambiguous stimuli result in temporary uncer-
tainty in the network. As a result, these models extend gracefully to
ambiguous inputs, which are dealt with in the same fashion as unambig-
uous ones. In the next section, we examine the behavior of PDP models in
the specific case of lexical ambiguity resolution, and we show that they
again provide a robust fit to behavioral data.

Lexical Ambiguity Resolution in PDP Models

Early behavioral evidence in lexical ambiguity resolution showed that
both meanings of an ambiguous word appear to become active at least
briefly, irrespective of preceding context, suggesting that lexical access
occurs first in a context-free stage of processing, followed by a second
stage of processing that integrates context (Swinney, 1979; Tanenhaus
et al., 1979). Later findings challenged those results. For example,
Tabossi (1988) demonstrated that in sentential contexts that are suffi-
ciently biasing, the contextually appropriate meaning of a homograph is
selectively activated. In the parlance of Figure 2.1(a), one can imagine
both a weak context that places the system in a region of the bug lexical
field that is roughly equidistant from its two semantic endpoints and
a strong context that places the system already deep into one of those
semantic endpoints. Vu, Kellas, and Paul (1998) extended Tabossi’s
findings by showing that multiple sources of contextual bias can be seen
to influence lexical access independently, such that priming of a target
word is influenced by a convergence of biases from multiple cues. These
and other studies began to swing the balance of evidence in favor of a PDP
type of account, and there is by now a very large body of work demon-
strating continuous bidirectional interaction between subsystems of the
language system (for review, see Spevack et al., 2018).
PDP models are able to capture the general pattern of behavioral data
regarding lexical ambiguity resolution. To understand how, let us con-
sider Kawamoto’s (1993) influential PDP model of lexical ambiguity
resolution, shown in Figure 2.4(b). In contrast to the IAM and TRACE
models discussed in the previous section, Kawamoto’s model makes use
of distributed rather than localist representations. Localist models have
a one-to-one mapping between nodes and represented entities as well as
hard-coded connections between entities. For example, the IAM and
TRACE have a single node for each feature, letter, and word, and the
connections between them are specified by the modeler in advance. In
those models, access of a lexical entry corresponds to activation of the
corresponding lexical node, and hence these models make it simple to
compare the activity of multiple word nodes over time.
Distributed models, on the other hand, encode representations as
a pattern of activity across many neurons that represent various features
or microfeatures. In Kawamoto’s (1993) model, each lexical entry corre-
sponds to a vector of activation values for 216 nodes, which are meant to
capture all features of a word: The first 48 nodes encode visual features
that define the orthography of the word; the next 48 nodes encode
phonetic features in specified positions, corresponding to pronunciation;
the next 24 nodes encode part-of-speech; and the last 96 nodes encode
meaning. While the total pattern across all nodes is unique with respect to
each lexical entry, each individual feature (meaning each possible value
for any of the 216 nodes) is consistent with several lexical entries. As
a result, the representation of each lexical entry is partially overlapping
with several other entries.
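
The layout of such a distributed entry can be made concrete with a short sketch. The field sizes below are the ones given in the text; everything else is assumed for illustration, including the coding of features as +1 or -1 and the randomly generated feature values, which are not Kawamoto's actual encodings.

```python
import random

# Field sizes as described for Kawamoto's (1993) model.
FIELDS = {"spelling": 48, "phonology": 48, "part_of_speech": 24, "meaning": 96}


def field_slices(fields=FIELDS):
    """Map each field name to its (start, stop) position in the full vector."""
    slices, start = {}, 0
    for name, size in fields.items():
        slices[name] = (start, start + size)
        start += size
    return slices


def random_field(size, rng):
    """Hypothetical feature values: +1 (present) or -1 (absent)."""
    return [rng.choice((-1, 1)) for _ in range(size)]


def shared_features(entry_a, entry_b):
    """Count the nodes on which two lexical entries take the same value."""
    return sum(a == b for a, b in zip(entry_a, entry_b))


rng = random.Random(0)
form = (random_field(48, rng) + random_field(48, rng) + random_field(24, rng))
# Two senses of a homograph such as "bug": same spelling, pronunciation, and
# part of speech, but different (here randomly generated) meaning features.
bug_insect = form + random_field(96, rng)
bug_device = form + random_field(96, rng)

if __name__ == "__main__":
    print(field_slices())
    print(len(bug_insect), shared_features(bug_insect, bug_device))
```

The four fields sum to 216 nodes, and the two senses of the homograph necessarily agree on the 120 form and part-of-speech nodes, so their representations overlap heavily even though their meaning fields differ.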
Another important difference between distributed and localist models
is that, in the former, the strength of connections between nodes must be
learned by the network, rather than coded by the researcher. Kawamoto’s
model is fully connected, meaning there are bidirectional links between
each of the 216 nodes. While it would, of course, be infeasible to manually
code all connections in a network of this size, this property of distributed
networks is actually a feature and not a bug: These models are intended to
capture developmental phenomena by teaching a lexicon to the network.
The network begins with connection strengths of 0 between all nodes.
During training, lexical entries (vectors of 216 activation values) are
presented to the network, which spreads activation according to its
connection strengths, eventually settling into a stable activity pattern.
Initially, this stable pattern will not match the target pattern corresponding
to the lexical entry, so an error correction algorithm is used to modify
the connection strengths after each training trial, bringing the output
closer to the target pattern. After training, features that co-occur in
a word develop stronger connections, such that when some subset of
a word’s features are presented to the network (e.g., only orthography
or pronunciation), the full pattern of activity for that lexical entry may
emerge in the network. A lexical entry has been accessed by the network
when the full pattern of activity settles into a stable state that matches
some lexical entry.
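
The training and pattern-completion cycle can be sketched with a miniature network. The chapter says only that "an error correction algorithm" is used, so the delta-rule update below is an assumed stand-in, and the error is computed from a single linear pass rather than from a settled state. The network has 8 nodes instead of 216, the two patterns (WORD_A, WORD_B) are hypothetical, and the first four nodes are treated as "form" and the last four as "meaning".

```python
N = 8
# Two hypothetical lexical entries: nodes 0-3 are form, nodes 4-7 are meaning.
WORD_A = [1, -1, 1, -1, 1, -1, 1, -1]
WORD_B = [1, 1, -1, -1, 1, 1, -1, -1]


def matvec(w, a):
    """Net input to each node: weighted sum of all node activations."""
    return [sum(wij * aj for wij, aj in zip(row, a)) for row in w]


def train(patterns, epochs=5, lr=1.0 / N):
    """Start from zero weights and apply a delta-rule error correction."""
    w = [[0.0] * N for _ in range(N)]
    for _ in range(epochs):
        for p in patterns:
            out = matvec(w, p)
            err = [pi - oi for pi, oi in zip(p, out)]
            for i in range(N):
                for j in range(N):
                    w[i][j] += lr * err[i] * p[j]
    return w


def settle(w, start, steps=10):
    """Spread activation repeatedly, clipping each node to [-1, 1]."""
    a = list(start)
    for _ in range(steps):
        net = matvec(w, a)
        a = [max(-1.0, min(1.0, ai + ni)) for ai, ni in zip(a, net)]
    return a


if __name__ == "__main__":
    weights = train([WORD_A, WORD_B])
    form_only = WORD_A[:4] + [0, 0, 0, 0]   # meaning nodes left unspecified
    print(settle(weights, form_only))
```

Presenting only the form half of an entry lets the trained connections fill in the meaning half, which corresponds to the sense in which a lexical entry is "accessed" once the network settles into its full pattern.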
The behavior of the network can be best understood as operating in
a high-dimensional state space, where each node serves as a dimension,
and the activation of all nodes is a set of coordinates that describes the
location of the system in the state space at that point in time. When the
network is presented with an ambiguous word, its location in the state
space (i.e., its activation pattern) moves in a direction that is somewhat
toward both regions that belong to the two meanings of that word.
Gradually, as context and other factors bias the system’s interpretation
of this ambiguous word, the trajectory will curve toward the region in
state space that corresponds to the contextually appropriate meaning.
This nonlinear trajectory of the system, as it moves through state space,
can be mathematically described as following along the contours of an
energy landscape that is imposed on the volume of the state space by
external inputs, context, and its neural connectivity pattern of excitatory
and inhibitory synapses. This energy landscape describes how certain
regions of state space have a strong attracting force and other regions
may have a weak attracting force. Interspersed among these basins of
attraction in the state space are other regions that repel the system away
from them (peaks in the energy landscape). The simplified sketches of
basins of attraction in Figures 2.1–2.3 have associated with them energy
landscapes that make some portions of them more strongly attracting and
other portions less so. For instance, Figure 2.4(c) shows an example of an
energy landscape where the state space of the system would correspond to
the two-dimensional floor of that three-dimensional space, and the height
dimension corresponds to the potential energy of the system. Much like
a marble would roll with gravity and momentum, the state of the system
(indicated as a black circle on the manifold surface of Figure 2.4(c)) rolls
down the energy landscape’s nonlinear slopes and settles into an attractor
basin (which corresponds to a location in space that belongs to a word’s
meaning).

[Figure 2.4: four panels. Panel (a): Word Layer (ROOM, BOOM), Letter Layer
(R, B, K, F), Feature Layer, and Letter Input R. Panel (b): an Input Vector
of + and – values with fields labeled Spelling, Phonology, Pt of Spch, and
Meaning. Panel (c): an energy landscape with basins labeled Dominant
Meaning and Subordinate Meaning. Panel (d): Visual Word-Form Input and
Speech Input, Visual Feature Layer and Acoustic Feature Layer, Orthography
Layer and Phonology Layer, a Lexicon of L1 Words and L2 Words, L1 Node and
L2 Node, and a Semantic (or other, e.g. sensorimotor) layer.]

Figure 2.4 (a) McClelland and Rumelhart’s (1981) interactive activation
model processing the letter R; (b) Kawamoto’s (1993) PDP model of
lexical ambiguity resolution with a sample of all connections shown; (c)
an example energy landscape that determines the trajectory of a system
as it traverses its state space; and (d) Dijkstra and van Heuven’s (2002)
BIA+ model.

In Kawamoto’s (1993) simulations, unambiguous words were recognized
(and settled in their energy landscapes) more quickly than biased
ambiguous words (words having one sense that is more common or
dominant than the other). This makes sense because an unambiguous
word will have only one attractor basin, and a biased ambiguous word
(Figure 2.4(c)) will have two attractor basins, resulting in some competition
or vacillation between those two regions in state space. Equi-biased
ambiguous words, however, were recognized even more slowly than the
biased ambiguous words, because, while those biased ambiguous words
have two attractor basins, one of them is much steeper/stronger than the
other. By contrast, the equi-biased ambiguous words have two attractor
basins that are nearly equal in strength, so the system takes longer to
finally settle into one of them. Importantly, Kawamoto found that sentence
context has a differential effect on biased and equi-biased ambiguous
words. With equi-biased ambiguous words, context was highly
effective at tipping the balance and causing the system to settle into the
contextually appropriate attractor basin. However, with biased ambiguous
words, only a very strongly biasing context would be capable of
pushing the system toward the less common (or subordinate) meaning
of that word.
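
The settling behavior described above can be illustrated with a one-dimensional energy landscape. The sketch below is not Kawamoto's network: it assumes a simple double-well energy function, E(x) = x^4/4 - x^2/2 - tilt*x, in which positive x stands for the dominant meaning and negative x for the subordinate one. The tilt term, the numeric dominance and context values, and the settling threshold are all hypothetical.

```python
def settle(tilt, x=0.0, dt=0.1, threshold=0.9, max_steps=1000):
    """Roll downhill on E(x) = x**4/4 - x**2/2 - tilt*x from position x.

    Positive x stands for the dominant meaning, negative x for the
    subordinate one. Returns (final_x, steps_taken).
    """
    for step in range(max_steps):
        if abs(x) >= threshold:
            return x, step
        gradient = x ** 3 - x - tilt   # dE/dx
        x -= dt * gradient             # move against the gradient
    return x, max_steps


def net_tilt(dominance, context):
    """Combine meaning dominance with contextual bias (both hypothetical)."""
    return dominance + context


if __name__ == "__main__":
    print("biased word, no context:     ", settle(net_tilt(0.5, 0.0)))
    print("equi-biased, faint context:  ", settle(net_tilt(0.0, 0.05)))
    print("equi-biased, weak opposing:  ", settle(net_tilt(0.0, -0.1)))
    print("biased, weak opposing:       ", settle(net_tilt(0.5, -0.1)))
    print("biased, strong opposing:     ", settle(net_tilt(0.5, -0.8)))
```

In this toy landscape a strongly tilted (biased) word reaches a basin in fewer steps than a nearly symmetric (equi-biased) one. A weak opposing context is enough to send the equi-biased word to the other basin, but the biased word reaches its subordinate meaning only when the opposing context outweighs its dominance, which mirrors the differential context effect reported for Kawamoto's simulations.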
The work reviewed here illustrates the power of PDP models for
explaining lexical ambiguity resolution. Through the imagery of a high-
dimensional state space, with an energy landscape determining its
dynamics, it becomes clear how these models can capture both delayed
effects of context (Swinney, 1979; Tanenhaus et al., 1979) and early
effects of context (Tabossi, 1988; Vu et al., 1998).

Bilingual Interactive Activation

Although early theories of bilingual language processing proposed that
bilinguals could selectively activate one of their languages and deactivate
the other (Macnamara & Kushnir, 1971), the behavioral data now overwhelmingly
support a parallel interactive account, with either orthographic
or phonological input simultaneously activating representations in both
languages. For example, eye-tracking studies have shown that hearing
spoken words in one language can lead to eye fixations on a distractor
object whose name is phonologically similar in the task-irrelevant language
(Marian & Spivey, 2003a; Spivey & Marian, 1999). When
instructed to pick up the marker, Russian-English bilinguals frequently
look first at a stamp (called marka in Russian) before finally fixating the
marker (Spivey & Marian, 1999). Importantly, the magnitude of this
interlingual cohort effect is dependent on several factors, including language
experience (Weber & Cutler, 2004), phonetic featural similarity (Ju &
Luce, 2004), and recent use (Marian & Spivey, 2003b). Similar results
have been obtained for written input (De Groot, Delmaar, & Lupker,
2000; Dijkstra, Grainger, & van Heuven, 1999), with activation of
words in the irrelevant language being possible even when there is ortho-
graphic but no phonological overlap (Marian & Kaushanskaya, 2004) or
vice versa (Kaushanskaya & Marian, 2004). Furthermore, cross-
linguistic interference has been found to be dependent on the number
of orthographic neighbors of the target word in the nontarget language
(van Heuven, Dijkstra, & Grainger, 1998). Taken together, the
experimental evidence indicates that, for bilingual speakers, both orthography
and phonology can activate consistent words in both languages,
orthography activates phonology and vice versa, and there are important
roles for language history and stimulus characteristics (van Hell &
Tanner, 2012). As such, these results are consistent with a PDP account
of bilingual lexical processing, where multiple parameters are brought
together as dimensions in a high-dimensional state space (Onnis &
Spivey, 2012).
The BIA (van Heuven et al., 1998) and BIA+ (Dijkstra & van Heuven,
2002) were built on top of the original IAM (McClelland & Rumelhart,
1981) to deal with the case of bilingual language processing, in which
there are words from two or more languages that may overlap in features.
The BIA model (Figure 2.4d), like the IAM, includes layers with localist
nodes encoding features, letters, and words, respectively (although
a distributed-coding version of this model has been proposed; French,
1998; Jacquet & French, 2002). These layers work identically to those of
the IAM: feature nodes activate letters (in a specified position within the
word), nodes for letters in each position activate words with which they
are consistent, and word nodes have feedback connections with letter
nodes and inhibitory connections with all other word nodes. The BIA+
added layers for phonology and semantics (for simplicity,
hereafter our discussion will focus on this version of the model). The
lexicon, in the case of the BIA+, now includes words from two languages
instead of one. Importantly, this architecture models bilinguals as having
a unified lexicon: Letters activate words in both languages indiscriminately
and words across languages retain inhibitory connections.
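A minimal sketch of this language-nonselective access, assuming a toy six-word English and Dutch lexicon and made-up weights, is given below. It is not the published BIA or BIA+ implementation: the feature layer is omitted, letters are matched position by position, and the names `bottom_up` and `recognize` are hypothetical.

```python
# Assumed toy parameters (not taken from the BIA papers).
EXCITE, INHIBIT = 0.1, 0.1   # letter-to-word excitation / inhibition per position
LATERAL, DECAY = 0.2, 0.1    # word-to-word inhibition and passive decay
FLOOR, CEILING = -0.2, 1.0   # activation bounds

# One unified lexicon: the language tag plays no role in bottom-up activation.
LEXICON = {
    "work": "English", "word": "English", "fork": "English",
    "werk": "Dutch", "vork": "Dutch", "kerk": "Dutch",
}


def bottom_up(stimulus, word):
    """Net letter-level input: consistent positions excite, inconsistent inhibit."""
    matches = sum(a == b for a, b in zip(stimulus, word))
    return EXCITE * matches - INHIBIT * (len(word) - matches)


def recognize(stimulus, steps=20):
    """Return final and peak activations of every word node in the lexicon."""
    act = {w: 0.0 for w in LEXICON}
    peak = dict(act)
    for _ in range(steps):
        total = sum(max(a, 0.0) for a in act.values())
        new_act = {}
        for w, a in act.items():
            rivals = total - max(a, 0.0)  # all other active words, either language
            a_new = a + bottom_up(stimulus, w) - LATERAL * rivals - DECAY * a
            new_act[w] = min(CEILING, max(FLOOR, a_new))
        act = new_act
        for w, a in act.items():
            peak[w] = max(peak[w], a)
    return act, peak
```

Calling `recognize("work")` in this sketch lets the Dutch neighbors werk and vork rise transiently exactly as much as the English neighbors word and fork, before lateral inhibition from the target suppresses all of them.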
The most important difference between the IAM and BIA+ lies in the
addition of a top-most language layer. This layer includes two nodes – one
for each language – that have bidirectional excitatory connections with all
words in that language and inhibitory connections with all words in the
other language. This layer models the concept of a language mode, as
suggested by Grosjean (2001), whereby recent exposure to one language
will prime that language, resulting in processing costs when switching
languages (Altarriba et al., 1996; Meuter & Allport, 1999).
While lexical ambiguity in the monolingual case refers to intralingual
homonyms, homographs, and homophones, bilingual models need to
account for the addition of interlingual ambiguities as well. Cross-
language lexical ambiguities are functionally represented in the BIA
models by the inclusion of two separate word nodes, one in each lan-
guage, which differ in some of their connections to the orthography,
phonology, and semantic nodes, and exclusively activate their respective
language nodes. For the present purposes, interlingual ambiguities can be
divided into three classes. First, cognates are pairs of words that have the
same spelling and meaning in two languages. For example, actor has the
same meaning in English and Spanish but slightly different phonology. In
the model, the two word nodes corresponding to a pair of cognates will
have the same connections to the orthography and semantic nodes and
some of the same connections to the phonology layer (depending on the
degree of phonological similarity across the two languages). Orthographic
input to the model will activate both words equally, which will then
mutually inhibit each other via the inhibitory connections between all
words. Hence, this type of ambiguity cannot be resolved without help
from the language nodes. Prior unambiguous input to the model in one of
the two languages will selectively activate the corresponding language
node, which then acts to inhibit all word nodes in the opposing language.
This alters the starting activation levels of the word nodes, allowing the
node in the relevant language to more strongly inhibit its counterpart and
win the competition.
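The role of the language nodes in breaking this tie can be illustrated with a two-word, two-language-node sketch. Everything here is an assumption made for illustration (the weights, the leaky-integrator update, and the function name `compete`); the connectivity simply follows the description given in this chapter, with each language node exciting its own words and inhibiting the other language's words.

```python
# Assumed toy weights, not taken from the BIA+ papers.
INPUT = 0.3      # identical orthographic support for both cognate word nodes
W_INHIB = 1.5    # word-to-word lateral inhibition
L_EXCITE = 0.2   # language node -> word nodes of its own language
L_INHIB = 0.2    # language node -> word nodes of the other language
W_TO_L = 0.3     # word node -> its own language node
RATE = 0.2       # integration rate


def clamp(value):
    return min(1.0, max(0.0, value))


def compete(primed=None, prime_strength=0.5, steps=100):
    """Return final activations of the English and Spanish nodes for 'actor'."""
    word = {"EN": 0.0, "ES": 0.0}
    lang = {"EN": 0.0, "ES": 0.0}
    if primed is not None:
        lang[primed] = prime_strength  # residue of prior input in that language
    for _ in range(steps):
        new_word, new_lang = {}, {}
        for own, other in (("EN", "ES"), ("ES", "EN")):
            net = (INPUT + L_EXCITE * lang[own]
                   - L_INHIB * lang[other] - W_INHIB * word[other])
            new_word[own] = clamp(word[own] + RATE * (net - word[own]))
            new_lang[own] = clamp(lang[own] + RATE * (W_TO_L * word[own] - lang[own]))
        word, lang = new_word, new_lang
    return word
```

With no priming, `compete()` leaves the two nodes exactly tied in this noise-free sketch (with noise, the tie would break arbitrarily). With `compete(primed="EN")`, the English node ends up active and the Spanish node is driven to zero.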
Next, false cognates, or interlingual homographs, are pairs of words
with the same spelling but different meanings in each language. For
example, main is a synonym for primary in English but in French means
hand (with a fairly different pronunciation). These would be represented
in the BIA models as word nodes in each language that are identical in
their connections to the orthography layer, partly different in their connections
to the phonology layer, and completely different in their connections
to the semantic layer. Ambiguity resolution in this case could occur
again by priming of the language nodes or instead through contextual
bias. If, for example, sentential context activates semantic nodes corresponding
to one of the two resolutions of the ambiguity, this alters the
initial state of the system to be closer to one option. A sufficiently biasing
sentential context, even in the nontarget language, could override the
influence of the language nodes, leading the system to correctly recognize
a code-switched word that does not match the language of the sentential
context. This is consistent with experimental evidence showing that the
processing cost of switching languages is dependent on contextual bias
(Li, 1996; Moreno, Federmeier, & Kutas, 2002). Furthermore, results
have shown that code-switching is easier when the phonology of the code-
switched word is different from that of the context language (Grosjean,
1995; Li, 1996). In the BIA models, this would be accounted for by the
fact that code-switched words with minimal phonological overlap with
the context language will activate fewer competitors in the context lan-
guage, leading to faster resolution.
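The trade-off between language-node priming and semantic context for a homograph such as main can be sketched as a competition between two word nodes that share the same orthographic input. This is a simplified illustration under stated assumptions: the language node's influence is reduced to a constant inhibition on the non-primed language, semantic context is a constant boost to one reading, and `resolve` and all weights are hypothetical.

```python
# Assumed toy weights for illustration only.
ORTHO = 0.3        # orthographic support, identical for both readings
LANG_INHIB = 0.1   # inhibition of the non-primed language's word node
W_INHIB = 1.5      # lateral inhibition between the two word nodes
RATE = 0.2         # integration rate

READINGS = {"EN": "primary", "FR": "hand"}  # the two meanings of "main"


def resolve(primed_language, context=None, steps=200):
    """Return (language, meaning) of the word node that wins the competition.

    `context` maps a language code to the semantic support its reading
    receives from the sentence, e.g. {"FR": 0.3}.
    """
    context = context or {}
    drive = {}
    for lang in READINGS:
        drive[lang] = ORTHO + context.get(lang, 0.0)
        if lang != primed_language:
            drive[lang] -= LANG_INHIB
    act = {lang: 0.0 for lang in READINGS}
    for _ in range(steps):
        new_act = {}
        for own, other in (("EN", "FR"), ("FR", "EN")):
            net = drive[own] - W_INHIB * act[other]
            new_act[own] = min(1.0, max(0.0, act[own] + RATE * (net - act[own])))
        act = new_act
    winner = max(act, key=act.get)
    return winner, READINGS[winner]
```

In this sketch, an English language mode with weak French-consistent context (`resolve("EN", {"FR": 0.05})`) still settles on the English reading, whereas a strongly biasing context (`resolve("EN", {"FR": 0.3})`) overrides the language priming and the French reading wins.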
Finally, partial cognates, or interlingual cohorts, are pairs of words
across languages in which there is partial overlap in spelling or phonology.

[Figure 2.5, panels (a) and (b): lexical fields for the input "shark" plotted over phonological and semantic dimensions, with regions labeled sharp and shark in (a) and sharp, shark, and sharik in (b)]

Figure 2.5 (a) For a monolingual, the linguistic input shark has
orthographic and phonological similarity to both shark and sharp, and
a few other words; (b) For a bilingual, that same input has similarity
with even more lexical representations, thus producing an extremely
nonconvex lexical field, and an even more nonlinear trajectory

For example, the English word shark is an interlingual cohort of sharik
(the Russian word for balloon). Because the bottom-up connections in the
BIA models are not language-selective, any input will send activation to
orthographic or phonological neighbors in both languages, and the degree
of competition in the network will be dependent on the number of
neighbors. Figure 2.5 uses the lexical-fields framework from Figures
2.1–2.3 to depict the regions of state space that can be visited while the
word shark is being presented to a monolingual English speaker (Figure
2.5a) or to a Russian-English bilingual (Figure 2.5b). For a monolingual,
the lexical field (or energy landscape) for that word stretches into a few
different regions of semantic state space, and the trajectory (or activation
pattern over time) will be somewhat nonlinear as it curves slightly toward
the wrong word. By contrast, a bilingual’s lexical field stretches out into
several more regions of state space, resulting in an exceptionally curved
trajectory. While being presented with shark, the patterns of activation in
BIA+ would mimic this kind of state-space trajectory as it moves some-
what close to an interlingual cohort competitor before finally settling into
the correct pattern of activation.
However, feedback from the language nodes in the BIA+ model can
lead to asymmetric competition, whereby intralingual competitors in the
primed language will exert more influence than the interlingual competi-
tors from the other language. As an example from human data, when
Marian and Spivey (2003a) placed Russian-English bilinguals into
a relatively monolingual Russian language mode (with a consent form in
Russian, the experimenter speaking only native Russian, and Russian
music in the background), those participants exhibited substantial lexical
competition from intralingual competitors in Russian but not as much
from interlingual competitors in English. Ultimately, however, resolution
in the case of partial cognates will reliably be accomplished in the BIA+
model purely through bottom-up information (without the need for con-
text), since words in either language with partially inconsistent orthogra-
phy or phonology will receive less activation than the target word.
As in Kawamoto’s (1993) model, contextual priming in
the BIA+ model can also influence the initial state of the system and
thus bias it toward one meaning of an ambiguous word. For example,
Schwartz and Kroll (2006) found that, when sentence context was
weak, cognates were processed faster by bilingual participants than
words in only one language, indicating that lexical representations
from both languages affected processing. However, when sentence
context was strong, this effect disappeared, suggesting that compre-
hension was guided selectively to the meaning in only one of the
languages (see also Libben & Titone, 2009).
Because work using the BIA models has not specifically focused on
simulating lexical ambiguity resolution tasks, it is important to note that
the account we have given here is somewhat speculative in nature.
However, with a general understanding of PDP principles and the struc-
ture of the BIA models, we expect that this account will by now be
intuitively clear. Since the BIA models allow parallel, bottom-up activa-
tion of words in both languages, interlingual ambiguities are really not
that different from intralingual ambiguities with monolinguals. How
quickly the system can resolve these ambiguities, and which resolution
ultimately wins, is dependent on the starting state of the system – via
priming of language or semantic nodes – and the overall energy
landscape, which determines the degree of attraction toward various
outcomes.

Discussion
Obviously, bilingualism does not involve having a new and separate
lexicon module inserted into a person’s cortex. Learning an L2, early or
late in life, involves rewiring the existing connectivity of multiple language
areas of the brain. This network of networks has some portions of it that
are mostly specialized for one or the other language (Kim et al., 1997),
but it also has many portions that are used by both languages (Marian,
Spivey, & Hirsch, 2003). The BIA and BIA+ models of bilingual lan-
guage processing have pursued that general kind of architecture and
produced results that correspond well with human data (Dijkstra & van
Heuven, 2002; van Heuven et al., 1998). As a result of that type of cortical
connectivity in a bilingual, reading or hearing a word from one language
can inadvertently partially activate a lexical representation in the other
language. It turns out that this process in bilinguals is not that different
from related processes in monolinguals. When monolinguals read or hear
a word in their language, they also exhibit inadvertent partial activation of
other related lexical representations.
Rather than thinking of these lexical representations as line entries in
a mental dictionary, some of which get partially activated, we have chosen
a different framework here for understanding how ambiguity (temporary
or otherwise) causes the language system to vacillate between multiple
possible interpretations. We have chosen a state-space framework,
wherein lexical representations exist as attractor basins, some with
a strong or weak pull, some with partial overlap with one another, and
some with tendrils that stretch out to semantically disparate regions of
state space. Those tentacular lexical attractor basins, whose tendrils reach
out in many directions in state space, may be unusually prevalent in
bilinguals, compared to monolinguals.
While the dictionary framework is clearly a metaphor, intended to help
one imagine how words might be organized in the language system, the
state-space framework need not be conceived as a metaphor (Onnis &
Spivey, 2012). When one takes a neural network, such as a brain or
connectionist model, and treats each node’s activation as a coordinate
in a state space, this serves as a mathematical description of the state of the
actual system (Elman, 2004; Spivey, 2008) – not a metaphor. Scientific
metaphors always break down at some point and can provide misleading
insights (Hoffman, 1980). In the case of a simulated neural network
processing two languages, as its state changes from timestep to timestep,
one can access all the data necessary to provide an accurate state-space
description of this system – perhaps performing a dimensionality reduction
down to two or three dimensions for purposes of data visualization
(Elman, 1991). In the case of an actual brain-and-body processing two
languages, however, it is of course not possible to measure every node in
the network. Nonetheless, we can measure quite a bit; and, when the right
behavioral measures are chosen carefully and sampled as continuously as
possible (e.g., Louwerse et al., 2012), those behaviors can be seen as
performing something similar to the dimensionality reduction performed
on the simulated neural network, thus allowing us to witness a low-
dimensional record of the high-dimensional mental trajectory (Spivey &
Dale, 2006, p. 209). Importantly, even with quantitatively abstracted
data, from recorded behaviors that result from hidden neural processes,
we are still not using a metaphor when we plot those data into a state space
for data visualization. The neural dimensions have been reduced by the
motor system in poorly understood ways, but there is no figurative analogy
being used to liken linguistic processes to something else, such as
a book with lexical entries listed in alphabetical order.
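The idea of treating each node's activation as a coordinate, and then reducing the trajectory to a few dimensions for plotting, can be sketched as follows. This is an illustration under stated assumptions rather than an analysis of any published simulation: the six-node patterns for shark and sharik are invented, the trajectory comes from a simple leaky integrator, and a fixed linear projection onto two hand-picked axes stands in for a data-driven method such as principal components analysis.

```python
import math

# Hypothetical activation patterns over six nodes (assumed for illustration).
SHARK = [1, 1, 1, 0, 0, 0]
SHARIK = [1, 1, 0, 1, 0, 0]


def unit(vector):
    norm = math.sqrt(sum(c * c for c in vector))
    return [c / norm for c in vector]


# Axis 1: what the two words share. Axis 2: what distinguishes them.
SHARED_AXIS = unit([a + b for a, b in zip(SHARK, SHARIK)])
CONTRAST_AXIS = unit([a - b for a, b in zip(SHARK, SHARIK)])


def simulate(steps=25, ambiguous_steps=5, rate=0.3):
    """State-space trajectory: early input fits both words, later input only shark."""
    midpoint = [(a + b) / 2 for a, b in zip(SHARK, SHARIK)]
    state = [0.0] * len(SHARK)
    trajectory = [list(state)]
    for t in range(steps):
        target = midpoint if t < ambiguous_steps else SHARK
        state = [s + rate * (goal - s) for s, goal in zip(state, target)]
        trajectory.append(list(state))
    return trajectory


def project(trajectory, axes):
    """Reduce each high-dimensional state to one coordinate per axis."""
    return [tuple(sum(s * a for s, a in zip(state, axis)) for axis in axes)
            for state in trajectory]
```

Plotting `project(simulate(), [SHARED_AXIS, CONTRAST_AXIS])` would show the state first moving along the shared axis, with no displacement on the contrast axis while the input is still ambiguous, and only later bending toward the shark pattern.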
In this chapter, we have provided a series of theory visualizations as
proxies for those data visualizations. Armed with state-space trajec-
tories of connectionist networks addressing lexical ambiguity resolu-
tion in monolingual conditions and in bilingual conditions, one can
see that the attractor basins corresponding to word representations
come in a wide variety of shapes and sizes. Bilingualism may not
instigate a qualitatively different format of processing but instead
may just introduce a quantitative change in the distribution of those
different shapes and sizes. Compared to monolinguals, bilinguals may
experience a little more phonological (and in some cases orthographic)
overlap in their lexical fields, which may result in a little
more lexical competition on a regular basis. Perhaps it is this incessant
practice with increased lexical competition that trains a bilingual’s
brain to have greater cognitive control (e.g., Kroll & Bialystok,
2013; Spivey & Cardon, 2015). If one must have a metaphor,
then – far from being a dictionary – the mental lexicon is perhaps
more like a high-dimensional golf course with sandpits, greens, and
fairways all interlacing among one another; and a bilingual’s golf
course is especially tangled.

Keywords
Ambiguous words, Bilingual interactive activation (BIA) model,
Bilingual interactive activation Plus (BIA+) model, Bilingual lexical
processing, Code-switching, Cognates, Cohort model, Connectionist
models, Contextual diversity, Continuous mapping model, Cross-linguistic
interference, Distributed networks, Distributed representation,
Dynamical-systems state space, False cognates, Feature layer,
Homographs, Homonyms, Homophones, Individual differences,
Inhibitory connections, Interlingual ambiguities, Interlingual cohort
effect, Interactive activation model (IAM), jTRACE, Language experience,
Language mode, Language module, Letter layer, Lexical access,
Lexical ambiguity, Lexical entries, Lexical-fields framework, Localist
models, Microfeatures, Parallel distributed processing (PDP), Partial
cognates, Phoneme layer, Phonetic featural similarity, Phono-semantic
state space, Phonologically similar, Polysemous, Rhyme-cohorts,
Semantic field, Semantics, Sentence context, Sequential
search, Situation context, Theory visualizations of bilingual lexical ambiguity,
Theory Visualizations of Lexical Fields, TRACE, Word superiority
effect

Thought Questions
1. What are the pros and cons of localist versus distributed connec-
tionist models of bilingualism? Is one more appropriate than the
other?
2. Age of acquisition is not modeled in the BIA or BIA+ but is known to
have important effects. How might age of acquisition be integrated with
these models?
3. Views of embodied cognition (e.g., Barsalou, 2008) suggest that
action, planning, and sensorimotor representations may also play roles
in language processing. How might these or other cues influence bilingual
processing?

Internet Sites
Connectionism: www.ucs.louisiana.edu/~isb9112/dept/phil341/wisconn.html
Connectionism as an Approach: www.iep.utm.edu/connect/
Bilingual Interactive Activation Plus: www.wikivisually.com/wiki/Bilingual_interactive_activation_plus
Interactive Activation Models: www.psychology.nottingham.ac.uk/staff/wvh/jiam/
What is Connectionism: www.mind.ilstu.edu/curriculum/connectionism_intro/connectionism_1.php

Further Reading
Dale, R., Fusaroli, R., Duran, N. D., & Richardson, D. C. (2013). The
self-organization of human interaction. In Psychology of Learning and
Motivation, 59, 43–95.

References
Adelman, J. S., Brown, G. D., & Quesada, J. F. (2006). Contextual diversity, not
word frequency, determines word-naming and lexical decision times.
Psychological Science, 17(9), 814–823.
Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the
time course of spoken word recognition using eye movements: Evidence for
continuous mapping models. Journal of Memory and Language, 38(4), 419–439.
Altarriba, J., & Gianico, J. L. (2003). Lexical ambiguity resolution across
languages: A theoretical and empirical review. Experimental Psychology, 50(3),
159–170.
Altarriba, J., Kroll, J. F., Sholl, A., & Rayner, K. (1996). The influence of lexical
and conceptual constraints on reading mixed-language sentences: Evidence
from eye fixations and naming times. Memory and Cognition, 24(4), 477–492.
Barsalou, L. W. (2008). Grounded Cognition. Annual Review of Psychology, 59,
617–645.
Chen, Q., Huang, X., Bai, L., Xu, X., Yang, Y., & Tanenhaus, M. K. (2017). The
effect of contextual diversity on eye movements in Chinese sentence reading.
Psychonomic Bulletin and Review, 24(2), 510–518.
De Groot, A. M., Delmaar, P., & Lupker, S. J. (2000). The processing of
interlexical homographs in translation recognition and lexical decision:
Support for non-selective access to bilingual memory. The Quarterly Journal of
Experimental Psychology, 53A(2), 397–428.
Dijkstra, T., Grainger, J., & van Heuven, W. J. (1999). Recognition of cognates
and interlingual homographs: The neglected role of phonology. Journal of
Memory and Language, 41(4), 496–518.
Dijkstra, T., & van Heuven, W. J. (2002). The architecture of the bilingual word
recognition system: From identification to decision. Bilingualism: Language and
Cognition, 5(3), 175–197.
Dörnyei, Z. (2005). The psychology of the language learner: Individual differences
in second language acquisition. Routledge.
Elman, J. L. (1991). Distributed representations, simple recurrent networks, and
grammatical structure. Machine Learning, 7(2–3), 195–225.
Elman, J. L. (2004). An alternative view of the mental lexicon. Trends in Cognitive
Sciences, 8(7), 301–306.
Elman, J. L. (2009). On the meaning of words and dinosaur bones: Lexical
knowledge without a lexicon. Cognitive Science, 33(4), 547–582.
Forster, K. I., & Chambers, S. M. (1973). Lexical access and naming time.
Journal of Memory and Language, 12(6), 627–635.
French, R. M. (1998). A simple recurrent network model of bilingual memory. In
M. A. Gernsbacher & S. J. Derry (Eds.), Proceedings of the 20th Annual Cognitive
Science Society Conference (pp. 368–373). Hillsdale, NJ: Lawrence Erlbaum
Associates.
Gibbs, R., & Matlock, T. (2001). Psycholinguistic perspectives on polysemy. In
H. Cuyckens & B. Zawada (Eds.), Polysemy in cognitive linguistics. (pp.
213–239). Amsterdam: John Benjamins.
Grosjean, F. (1994). Individual bilingualism. In The encyclopedia of language and
linguistics (pp. 1656–1660). Oxford: Pergamon Press.
Grosjean, F. (1995). A psycholinguistic approach to code-switching: The
recognition of guest words by bilinguals. In L. Milroy & P. Muysken (Eds.),
One speaker, two languages (pp. 259–275). Cambridge: Cambridge University
Press.
Grosjean, F. (2001). The bilingual’s language modes. In J. Nicol (Ed.), One mind,
two languages: Bilingual language processing (pp. 1–22). Oxford: Blackwell.
Hills, T. T., Maouene, J., Riordan, B., & Smith, L. (2010). The associative
structure of language: Contextual diversity in early word learning. Journal of
Memory and Language, 63(3), 259–273.
Hoffman, R. R. (1980). Metaphor in science. In R. P. Honeck & R. R. Hoffman
(Eds.), The psycholinguistics of figurative language. Hillsdale, NJ: Lawrence
Erlbaum Associates.
Jacquet, M., & French, R. M. (2002). The BIA++: Extending the BIA+ to
a dynamical distributed connectionist framework. Bilingualism: Language and
Cognition, 5(3), 202–205.
Johns, B. T., Dye, M., & Jones, M. N. (2016). The influence of contextual
diversity on word learning. Psychonomic Bulletin and Review, 23(4),
1214–1220.
Ju, M., & Luce, P. A. (2004). Falling on sensitive ears: Constraints on bilingual
lexical activation. Psychological Science, 15(5), 314–318.
Kaushanskaya, M., & Marian, V. (2004). Activation of non-target language
phonology during bilingual visual word recognition: Evidence from eye-
tracking. In K. Forbus, D. Gentner, & T. Regier (Eds.), Proceedings of the 26th
Annual Meeting of the Cognitive Science Society (pp. 654–659). Hillsdale, NJ:
Lawrence Erlbaum Associates.
Kawamoto, A. H. (1993). Nonlinear dynamics in the resolution of lexical
ambiguity: a distributed processing account. Journal of Memory and Language,
32, 474–516.
Kim, K. H., Relkin, N. R., Lee, K. M., & Hirsch, J. (1997). Distinct cortical areas
associated with native and second languages. Nature, 388(6638), 171–174.
Kroll, J. F., & Bialystok, E. (2013). Understanding the consequences of
bilingualism for language processing and cognition. Journal of Cognitive
Psychology, 25(5), 497–514.
Lehrer, A. (1974). Semantic fields and lexical structure, Amsterdam: John
Benjamins.
Li, P. (1996). Spoken word recognition of code-switched words by Chinese–
English bilinguals. Journal of Memory and Language, 35(6), 757–774.
Libben, M. R., & Titone, D. A. (2009). Bilingual lexical access in context:
evidence from eye movements during reading. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 35(2), 381–390.

Louwerse, M. M., Dale, R., Bard, E. G., & Jeuniaux, P. (2012). Behavior
matching in multimodal communication is synchronized. Cognitive Science, 36
(8), 1404–1426.
Lyons, J. (1963). Structural semantics. Oxford: Blackwell.
MacDonald, M. C., & Christiansen, M. H. (2002). Reassessing working
memory: Comment on Just and Carpenter (1992) and Waters and Caplan
(1996). Psychological Review, 109(1), 35–54.
Macnamara, J., & Kushnir, S. L. (1971). Linguistic independence of bilinguals:
The input switch. Journal of Memory and Language, 10(5), 480.
Marian, V., & Kaushanskaya, M. (2004). Self-construal and emotion in bicultural
bilinguals. Journal of Memory and Language, 51(2), 190–201.
Marian, V., & Spivey, M. (2003a). Bilingual and monolingual processing of
competing lexical items. Applied Psycholinguistics, 24(2), 173–193.
Marian, V., & Spivey, M. (2003b). Competing activation in bilingual language
processing: Within-and between-language competition. Bilingualism: Language
and Cognition, 6(2), 97–115.
Marian, V., Spivey, M., & Hirsch, J. (2003). Shared and separate systems in
bilingual language processing: Converging evidence from eyetracking and brain
imaging. Brain and Language, 86(1), 70–82.
Marslen-Wilson, W. D. (1987). Functional parallelism in spoken
word-recognition. Cognition, 25(1–2), 71–102.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech
perception. Cognitive Psychology, 18(1), 1–86.
McClelland, J. L., & Johnston, J. C. (1977). The role of familiar units in
perception of words and nonwords. Perception and Psychophysics, 22(3),
249–261.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of
context effects in letter perception: I. An account of basic findings. Psychological
Review, 88(5), 375–407.
Meuter, R. F., & Allport, A. (1999). Bilingual language switching in naming:
Asymmetrical costs of language selection. Journal of Memory and Language, 40
(1), 25–40.
Miyake, A., Just, M. A., & Carpenter, P. A. (1994). Working memory constraints
on the resolution of lexical ambiguity: Maintaining multiple interpretations in
neutral contexts. Journal of Memory and Language, 33(2), 175–202.
Moreno, E. M., Federmeier, K. D., & Kutas, M. (2002). Switching languages,
switching palabras (words): An electrophysiological study of code switching.
Brain and Language, 80(2), 188–207.
Onnis, L., & Spivey, M. J. (2012). Toward a new scientific visualization for the
language sciences. Information, 3, 124–150.
Plummer, P., Perea, M., & Rayner, K. (2014). The influence of contextual
diversity on eye movements in reading. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 40(1), 275–283.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation
model of context effects in letter perception: II. The contextual
enhancement effect and some tests and extensions of the model.
Psychological Review, 89(1), 60–94.

Schwartz, A. I., & Kroll, J. F. (2006). Bilingual lexical activation in sentence
context. Journal of Memory and Language, 55(2), 197–212.
Spevack, S. C., Falandays, J. B., Batzloff, B., & Spivey, M. J. (2018). Interactivity
of language. Language and Linguistics Compass, 12(7), e12282.
Spivey, M. J. (2008). The continuity of mind. New York: Oxford University Press.
Spivey, M. J., & Cardon, C. D. (2015). Methods for studying adult bilingualism.
In J. Schwieter (Ed.), The Cambridge handbook of bilingual language processing.
(pp. 108–132). New York: Cambridge University Press.
Spivey, M. J., & Dale, R. (2006). Continuous dynamics in real-time cognition.
Current Directions in Psychological Science, 15(5), 207–211.
Spivey, M. J., & Marian, V. (1999). Cross talk between native and second
languages: Partial activation of an irrelevant lexicon. Psychological Science, 10
(3), 281–284.
Spivey-Knowlton, M. J. (1996). Integration of visual and linguistic information:
Human data and model simulations. Unpublished doctoral dissertation,
University of Rochester.
Strauss, J., Harris, H. D., & Magnuson, J. S. (2007). jTRACE:
A reimplementation and extension of the TRACE model of speech perception
and spoken word recognition. Behavior Research Methods, 39(1), 19–30.
Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)
consideration of context effects. Journal of Verbal Learning and Verbal Behavior,
18(6), 645–659.
Tabossi, P. (1988). Accessing lexical ambiguity in different types of sentential
contexts. Journal of Memory and Language, 27, 324–340.
Tanenhaus, M. K., Leiman, J. M., & Seidenberg, M. S. (1979). Evidence for
multiple stages in the processing of ambiguous words in syntactic contexts.
Journal of Verbal Learning and Verbal Behavior, 18(4), 427–440.
Toscano, J. C., Anderson, N. D., & McMurray, B. (2013). Reconsidering the role
of temporal order in spoken word recognition. Psychonomic Bulletin and Review,
20(5), 981–987.
van Hell, J. G., & Tanner, D. (2012). Second language proficiency and
cross-language lexical activation. Language Learning, 62, 148–171.
van Heuven, W. J., Dijkstra, T., & Grainger, J. (1998). Orthographic
neighborhood effects in bilingual word recognition. Journal of Memory and
Language, 39(3), 458–483.
Vu, H., Kellas, G., & Paul, S. T. (1998). Sources of sentence constraint on lexical
ambiguity resolution. Memory and Cognition, 26(5), 979–1001.
Weber, A., & Cutler, A. (2004). Lexical competition in non-native spoken-word
recognition. Journal of Memory and Language, 50(1), 1–25.

42 pages
Bottleneck Hypothesis
No ratings yet
Bottleneck Hypothesis
29 pages
Lexical Representation A Multidisciplinary Approach 1st Edition Gareth Gaskell PDF Download
No ratings yet
Lexical Representation A Multidisciplinary Approach 1st Edition Gareth Gaskell PDF Download
52 pages
Theory of Reading Comprehension (Pilar Nunez Delgado)
No ratings yet
Theory of Reading Comprehension (Pilar Nunez Delgado)
71 pages
Modeling Word Interpretation With Deep Language Models: The Interaction Between Expectations and Lexical Information
No ratings yet
Modeling Word Interpretation With Deep Language Models: The Interaction Between Expectations and Lexical Information
7 pages
El Lexicón Generativo James Pustejovsky
No ratings yet
El Lexicón Generativo James Pustejovsky
34 pages
Structural Ambiguity and Lexical Relations: Computational Linguistics May 2002
No ratings yet
Structural Ambiguity and Lexical Relations: Computational Linguistics May 2002
19 pages
A Theory of Lexical Access in Speech Production: Willem J. M. Levelt
No ratings yet
A Theory of Lexical Access in Speech Production: Willem J. M. Levelt
25 pages
English Lexicology I
No ratings yet
English Lexicology I
114 pages
EXAM 2022 Cognitive Science Linguistics
No ratings yet
EXAM 2022 Cognitive Science Linguistics
8 pages
Pav LK 2018 English Lexi Cology I I
No ratings yet
Pav LK 2018 English Lexi Cology I I
119 pages
Lemke - Analyzing Verbal Data Principles, Methods, and Problems - 2012
No ratings yet
Lemke - Analyzing Verbal Data Principles, Methods, and Problems - 2012
14 pages
Lexicon Core and Its Functioning: Sciencedirect
No ratings yet
Lexicon Core and Its Functioning: Sciencedirect
5 pages
Semantics: Ali Kaan Akgün, 27.10.2022, Semantics
No ratings yet
Semantics: Ali Kaan Akgün, 27.10.2022, Semantics
4 pages
Speech and Language Processing
No ratings yet
Speech and Language Processing
26 pages
5 - Prednáška
No ratings yet
5 - Prednáška
7 pages
Morfosintassi Inglese Piotti PDF
No ratings yet
Morfosintassi Inglese Piotti PDF
55 pages
Summary of Cruse
No ratings yet
Summary of Cruse
24 pages
000 Euralex 2010 04 Plenary BOGAARDS Dictionaries and Second Language Acquisition
No ratings yet
000 Euralex 2010 04 Plenary BOGAARDS Dictionaries and Second Language Acquisition
25 pages
Words and Phrases Corpus Studies of Lexical Semantics 1st Edition Michael Stubbs Updated 2025
No ratings yet
Words and Phrases Corpus Studies of Lexical Semantics 1st Edition Michael Stubbs Updated 2025
156 pages
Harley 2012-Semantics in DM
No ratings yet
Harley 2012-Semantics in DM
36 pages
Chapter 4
No ratings yet
Chapter 4
98 pages
Information Processing Stages
No ratings yet
Information Processing Stages
12 pages
Reisberg, CH 4
No ratings yet
Reisberg, CH 4
20 pages
CP - 9 - BOOK Michael W Eysenck - Mark T Keane-Cognitive Psychology - A Students Handbook (2010)
No ratings yet
CP - 9 - BOOK Michael W Eysenck - Mark T Keane-Cognitive Psychology - A Students Handbook (2010)
42 pages
Cognitive Psychology SAQ Essay Question
No ratings yet
Cognitive Psychology SAQ Essay Question
5 pages
Unit 1,2,3 CP Notes
No ratings yet
Unit 1,2,3 CP Notes
50 pages


