Syntax and Parsing
6405
2018/19—Sem I
[Figure: Phases of Text Generation (Words, Phrases, Sentences), with Morphological Analysis and Syntactic Analysis as the analysis stages]
ቀለበት (a ring)
የወርቅ ቀለበት (a gold ring)
ትልቅ የወርቅ ቀለበት (a big gold ring)
ያ ትልቅ የወርቅ ቀለበት (that big gold ring)
ጠጅ (mead)
የማር ጠጅ (honey mead)
ንፁህ የማር ጠጅ (pure honey mead)
ሁለት ሊትር ንፁህ የማር ጠጅ (two liters of pure honey mead)
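ደግ (kind)
በጣም ደግ (very kind)
ፈሪ (fearful)
ሰው ፈሪ (fearful of people)
እንደ ወንድሙ ሰው ፈሪ (fearful of people, like his brother)
በጣም እንደ ወንድሙ ሰው ፈሪ (very fearful of people, like his brother)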
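ሰጠችው (she gave him)
ለካሳ መፅሀፍ ሰጠችው (she gave Kasa a book)
ልኮላታል (he has sent it to her)
ለአስቴር ገንዘብ ልኮላታል (he has sent money to Aster)
በባንክ ለአስቴር ገንዘብ ልኮላታል (he has sent money to Aster through the bank)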
ወደ ቤት (to the house)
ከወንድሙ ጋር (with his brother)
ክፉኛ (badly)
እንደ ወንድሙ ክፉኛ (badly, like his brother)
በጣም እንደ ወንድሙ ክፉኛ (very badly, like his brother)
አስቴር ወደገበያ ሄደች (Aster went to the market)
ካሳ አስቴር ቤት እንደሰራች ሰምቷል (Kasa has heard that Aster built a house)
SS: The computer is on the table
    NP: The computer
        Det: The
        N: computer
    VP: is on the table
        V: is
        PP: on the table
            P: on
            NP: the table
                Det: the
                N: table
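A constituent structure like the one for "The computer is on the table" can be held in a program as a nested data structure. The following is a minimal sketch, assuming a tuple encoding of the form (label, children...); the names TREE, leaves and bracketed are illustrative and not part of the course material.

```python
# Constituent tree as nested tuples: (label, child, child, ...).
# A child is either another tuple (a phrase) or a plain string (a word).
TREE = (
    "SS",
    ("NP", ("Det", "The"), ("N", "computer")),
    ("VP",
        ("V", "is"),
        ("PP",
            ("P", "on"),
            ("NP", ("Det", "the"), ("N", "table")))),
)


def leaves(node):
    """Return the words under a node, left to right."""
    _label, *children = node
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


def bracketed(node):
    """Render a node in labeled-bracket notation."""
    label, *children = node
    parts = [c if isinstance(c, str) else bracketed(c) for c in children]
    return "[" + label + " " + " ".join(parts) + "]"


print(" ".join(leaves(TREE)))
print(bracketed(TREE))
```

Reading the leaves from left to right gives back the sentence, and each inner tuple corresponds to one phrase (NP, VP, PP) in the analysis.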
Phrase structures in labeled-bracket notation:
[NP Det A(Modifier) [NP N(Comp.) N(Head)]]
[NP [NP Det N] A [NP N(Comp.) N(Head)]]
[AP Det A(Head)]
[AP Det [PP P N] [AP N A(Head)]]
[VP [PP P N] N(Comp.) V(Head)]
[VP [PP(Modifier) P N] [VP [PP P N] N(Comp.) V(Head)]]
[PP P(Head) N]
[PP P [PP N P(Head)]]
[AdvP Det [PP P N] Adv(Head)]
[SS N [VP [PP P N] V]]
[CS N [VP [SS N [VP N V]] V]]
Parsing
- is a derivation process that identifies the structure of sentences using a given grammar.
- can be considered a special case of a search problem.
- uses two basic search strategies:
    top-down strategy
    bottom-up strategy
- can be made more efficient by:
    storing lexical rules separately
    chunking
Given the following English grammar, the sentence "Abebe killed the lion" can be parsed using the top-down strategy as follows.

Grammar:
S → NP VP
VP → V NP
NP → NAME
NP → DET N
NAME → Abebe
V → killed
DET → the
N → lion

Derivation:
S ⇒ NP VP [rewriting S]
⇒ NAME VP [rewriting NP]
⇒ Abebe VP [rewriting NAME]
⇒ Abebe V NP [rewriting VP]
⇒ Abebe killed NP [rewriting V]
⇒ Abebe killed DET N [rewriting NP]
⇒ Abebe killed the N [rewriting DET]
⇒ Abebe killed the lion [rewriting N]
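The top-down derivation can be sketched as a depth-first search that always rewrites the leftmost non-terminal and backtracks when a word does not match. This is only one possible formulation, assuming the grammar is stored as a Python dictionary; the names GRAMMAR and derive are illustrative. It works for this grammar because no rule is left-recursive; a left-recursive grammar would make this simple search loop forever.

```python
# Grammar from the slide: each non-terminal maps to its alternative right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["NAME"], ["DET", "N"]],
    "NAME": [["Abebe"]],
    "V": [["killed"]],
    "DET": [["the"]],
    "N": [["lion"]],
}


def derive(form, words, pos=0):
    """Top-down, depth-first search with backtracking.

    form:  current sentential form (list of symbols)
    words: the input sentence (list of words)
    pos:   index of the leftmost symbol not yet matched against the input
    Returns the list of sentential forms from `form` to the sentence,
    or None if `form` cannot derive the sentence.
    """
    if pos == len(form):
        return [form] if pos == len(words) else None
    symbol = form[pos]
    if symbol not in GRAMMAR:
        # Terminal: it must equal the next input word.
        if pos < len(words) and words[pos] == symbol:
            return derive(form, words, pos + 1)
        return None
    # Non-terminal: try each rule in turn, backtracking on failure.
    for rhs in GRAMMAR[symbol]:
        new_form = form[:pos] + rhs + form[pos + 1:]
        tail = derive(new_form, words, pos)
        if tail is not None:
            return [form] + tail
    return None


steps = derive(["S"], "Abebe killed the lion".split())
for form in steps:
    print(" ".join(form))
```

For the object NP the search first tries NP → NAME, fails to match "the", and then backtracks to NP → DET N; only the successful steps are returned.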
Given the following English grammar, the sentence "Abebe killed the lion" can be parsed using the bottom-up strategy as follows.

Grammar:
S → NP VP
VP → V NP
NP → NAME
NP → DET N
NAME → Abebe
V → killed
DET → the
N → lion

Reduction:
Abebe killed the lion
NAME killed the lion [rewriting Abebe]
NAME V the lion [rewriting killed]
NAME V DET lion [rewriting the]
NAME V DET N [rewriting lion]
NP V DET N [rewriting NAME]
NP V NP [rewriting DET N]
NP VP [rewriting V NP]
S [rewriting NP VP]
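The bottom-up strategy can be sketched as a search that repeatedly replaces a sequence matching the right-hand side of a rule with its left-hand side until only S remains. The sketch below is a simple backtracking formulation, not an efficient parser; the names RULES and reduce_to_s are illustrative. It applies reductions at the leftmost possible position, so the order of steps may differ from the order shown above, although the same rules are used.

```python
# Grammar from the slide as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ["NP", "VP"]),
    ("VP", ["V", "NP"]),
    ("NP", ["NAME"]),
    ("NP", ["DET", "N"]),
    ("NAME", ["Abebe"]),
    ("V", ["killed"]),
    ("DET", ["the"]),
    ("N", ["lion"]),
]


def reduce_to_s(form):
    """Bottom-up search with backtracking.

    Returns the list of forms from `form` down to ["S"],
    or None if the words cannot be reduced to S.
    """
    if form == ["S"]:
        return [form]
    for i in range(len(form)):
        for lhs, rhs in RULES:
            if form[i:i + len(rhs)] == rhs:
                # Replace the matched right-hand side by the left-hand side.
                new_form = form[:i] + [lhs] + form[i + len(rhs):]
                tail = reduce_to_s(new_form)
                if tail is not None:
                    return [form] + tail
    return None


steps = reduce_to_s("Abebe killed the lion".split())
for form in steps:
    print(" ".join(form))
```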
The efficiency of parsing algorithms can be improved if lexical rules are stored separately in a
structure called a lexicon, which lists the possible categories for each word.
The following example shows the lexical rules separated from the other grammatical rules.
Grammatical Rules
S → NP VP
VP → V NP
NP → NAME
NP → DET N
Lexical Rules
NAME → Abebe
V → killed
V → fly
DET → the
N → lion
N → fly

Grammatical Rules (without lexical rules)
S → NP VP
VP → V NP
NP → NAME
NP → DET N
Lexicon
Abebe: NAME
killed: V
the: DET
lion: N
fly: V, N
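A lexicon of this kind maps naturally onto a dictionary, so the categories of a word are found by a single lookup instead of a scan over all lexical rules. The following is a minimal sketch with the entries from the example; the names LEXICON, categories and tag_options are illustrative.

```python
# Lexicon from the example: each word maps to its set of possible categories.
LEXICON = {
    "Abebe": {"NAME"},
    "killed": {"V"},
    "the": {"DET"},
    "lion": {"N"},
    "fly": {"V", "N"},
}


def categories(word):
    """Return the possible categories of a word (empty set if unknown)."""
    return LEXICON.get(word, set())


def tag_options(sentence):
    """Pair every word of a sentence with its sorted list of categories."""
    return [(word, sorted(categories(word))) for word in sentence.split()]


for word, cats in tag_options("the fly killed the lion"):
    print(word, cats)
```

An ambiguous word such as "fly" returns both categories, and the grammatical rules then decide which one fits the sentence.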
Chunking, also called partial parsing, is a technique which attempts to model human parsing
by breaking the text up into small pieces, each parsed separately. Chunk boundaries
correspond roughly to the pauses in everyday speech.
For example, consider the following sentence.
When I read a sentence, I read it a chunk at a time.
Then, the following chunks can be identified.
[When I read] [a sentence], [I read it] [a chunk] [at a time].
Each chunk can then be parsed separately. Besides perhaps being a better model of
human behavior than full parsing, chunk parsing has the following advantages:
• Because a chunk parser only needs to deal with small, non-recursive clauses, it is able
to process text much more quickly.
• A chunk parser is easier to implement and requires much less memory to parse.
• When a full parser fails, it must discard the entire sentence, even if it got much of the
structure correct. A chunk parser discards only a few words when it cannot work out
how to proceed.
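The idea of finding small, non-recursive pieces can be sketched with a toy noun-phrase chunker. This is an illustration under stated assumptions, not the method used to produce the chunks in the example above: the category tags are supplied by hand, the chunk pattern (an optional Det, any number of A, then one or more N) is an assumption, and the name np_chunks is hypothetical.

```python
def np_chunks(tagged):
    """Group (word, tag) pairs into flat NP chunks.

    Assumed chunk pattern: optional Det, any number of A, one or more N.
    Words that do not start a chunk are emitted on their own with label "O".
    """
    chunks = []
    i = 0
    while i < len(tagged):
        j = i
        if tagged[j][1] == "Det":
            j += 1
        while j < len(tagged) and tagged[j][1] == "A":
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "N":
            k += 1
        if k > j:
            # At least one noun was found: emit one NP chunk.
            chunks.append(("NP", [word for word, _ in tagged[i:k]]))
            i = k
        else:
            # No chunk starts here: step over a single word only.
            chunks.append(("O", [tagged[i][0]]))
            i += 1
    return chunks


# Hand-tagged input (the tags are an assumption, not tagger output).
sentence = [("The", "Det"), ("computer", "N"), ("is", "V"),
            ("on", "P"), ("the", "Det"), ("table", "N")]
for label, words in np_chunks(sentence):
    print(label, " ".join(words))
```

When the pattern does not match, the chunker steps over a single word and continues, which mirrors the point that a chunk parser loses only a few words rather than the whole sentence.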