Natural Language
Processing
Lecture 7: Parsing with Context Free Grammars II.
CKY for PCFGs. Earley Parser.
11/13/2020
COMS W4705
Yassine Benajiba
Recall: Syntactic Ambiguity
S → NP VP      NP → she
VP → V NP      NP → glasses
VP → VP PP     D → the
PP → P NP      N → cat
NP → D N       N → glasses
NP → NP PP     V → saw
               P → with

[Two parse trees for "she saw the cat with glasses": in the first, the PP "with glasses" [4,6] attaches to the VP (VP → VP PP); in the second, it attaches to the object NP "the cat" (NP → NP PP), giving NP[2,6].]
Which parse tree is “better”? More probable?
Probabilities for Parse Trees
• Let T_G be the set of all parse trees generated by
grammar G.
• We want a model that assigns a probability P(t) to each parse
tree t ∈ T_G, such that Σ_{t ∈ T_G} P(t) = 1.
• We can use this model to select the most probable parse
tree compatible with an input sentence.
• This is another example of a generative model!
Selecting Parse Trees
• Let T_G(s) be the set of trees generated by grammar G whose
yield (sequence of leaves) is string s.
• The most likely parse tree produced by G for string s is
  t* = argmax_{t ∈ T_G(s)} P(t)
• How do we define P(t)?
• How do we learn such a model from training data (annotated or
un-annotated)?
• How do we find the highest probability tree for a given
sentence? (parsing/decoding)
Probabilistic Context Free
Grammars (PCFG)
• A PCFG consists of a Context Free Grammar
G=(N, Σ, R, S) and a probability P(A → β) for each
production A → β ∈ R.
• The probabilities for all rules with the same left-hand-
side sum up to 1:  Σ_{β : A → β ∈ R} P(A → β) = 1 for every A ∈ N.
• Think of this as the conditional probability for A → β,
given the left-hand-side nonterminal A.
PCFG Example
S → NP VP   [1.0]     NP → she      [0.05]
VP → V NP   [0.6]     NP → glasses  [0.05]
VP → VP PP  [0.4]     D → the       [1.0]
PP → P NP   [1.0]     N → cat       [0.3]
NP → D N    [0.7]     N → glasses   [0.7]
NP → NP PP  [0.2]     V → saw       [1.0]
                      P → with      [1.0]
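One possible way to hold this toy PCFG in code is a plain dictionary (a minimal sketch, not part of the slides; the layout and the sum-to-1 check are my own choices):

    # Store the PCFG as a dict from left-hand side to (right-hand side, probability)
    # pairs, and verify that the probabilities for each left-hand side sum to 1.
    pcfg = {
        "S":  [(("NP", "VP"), 1.0)],
        "VP": [(("V", "NP"), 0.6), (("VP", "PP"), 0.4)],
        "PP": [(("P", "NP"), 1.0)],
        "NP": [(("D", "N"), 0.7), (("NP", "PP"), 0.2),
               (("she",), 0.05), (("glasses",), 0.05)],
        "D":  [(("the",), 1.0)],
        "N":  [(("cat",), 0.3), (("glasses",), 0.7)],
        "V":  [(("saw",), 1.0)],
        "P":  [(("with",), 1.0)],
    }

    for lhs, rules in pcfg.items():
        total = sum(p for _, p in rules)
        assert abs(total - 1.0) < 1e-9, f"rules for {lhs} sum to {total}, not 1"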
Parse Tree Probability
• Given a parse tree t containing rule instances A_1 → β_1, …, A_n → β_n,
  the probability of t is  P(t) = Π_{i=1..n} P(A_i → β_i).

[Parse tree with the PP attached to the VP; rules used, with their probabilities:
 S → NP VP (1.0), NP → she (.05), VP → VP PP (.4), VP → V NP (.6), V → saw (1.0),
 NP → D N (.7), D → the (1.0), N → cat (.3), PP → P NP (1.0), P → with (1.0),
 NP → glasses (.05)]

1 × .05 × .4 × .6 × 1 × 0.7 × 1 × 0.3 × 1 × 1 × .05 = .000126
Parse Tree Probability
• Given a parse tree t containing rule instances A_1 → β_1, …, A_n → β_n,
  the probability of t is  P(t) = Π_{i=1..n} P(A_i → β_i).

[Parse tree with the PP attached to the object NP; rules used, with their probabilities:
 S → NP VP (1.0), NP → she (.05), VP → V NP (.6), V → saw (1.0), NP → NP PP (.2),
 NP → D N (.7), D → the (1.0), N → cat (.3), PP → P NP (1.0), P → with (1.0),
 NP → glasses (.05)]

1 × .05 × .6 × 1 × .2 × .7 × 1 × .3 × 1 × 1 × .05 = 0.000063 < 0.000126
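The product of rule probabilities can be computed with a short recursive function. A minimal sketch (my own encoding, not from the slides: trees are nested tuples such as ("NP", ("D", "the"), ("N", "cat")), and rule probabilities live in a dict keyed by (lhs, rhs)):

    rule_prob = {
        ("S", ("NP", "VP")): 1.0, ("VP", ("V", "NP")): 0.6, ("VP", ("VP", "PP")): 0.4,
        ("PP", ("P", "NP")): 1.0, ("NP", ("D", "N")): 0.7, ("NP", ("NP", "PP")): 0.2,
        ("NP", ("she",)): 0.05, ("NP", ("glasses",)): 0.05, ("D", ("the",)): 1.0,
        ("N", ("cat",)): 0.3, ("N", ("glasses",)): 0.7, ("V", ("saw",)): 1.0,
        ("P", ("with",)): 1.0,
    }

    def tree_probability(tree):
        """Multiply the probabilities of all rule instances used in the tree."""
        lhs, children = tree[0], tree[1:]
        # The right-hand side is the sequence of child labels (or the terminal itself).
        rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
        prob = rule_prob[(lhs, rhs)]
        for child in children:
            if isinstance(child, tuple):        # recurse into nonterminal children
                prob *= tree_probability(child)
        return prob

    np_attachment = ("S", ("NP", "she"),
                          ("VP", ("V", "saw"),
                                 ("NP", ("NP", ("D", "the"), ("N", "cat")),
                                        ("PP", ("P", "with"), ("NP", "glasses")))))
    print(tree_probability(np_attachment))      # ≈ 6.3e-05, i.e. the .000063 from the slide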
Estimating PCFG
probabilities
• Supervised training: We can estimate PCFG probabilities from a
treebank, a corpus manually annotated with constituency
structure, using maximum likelihood estimates (a counting sketch follows this slide):
  P(A → β) = count(A → β) / count(A)
• Unsupervised training:
• What if we have a grammar and a corpus, but no annotated
parses?
• Can use the inside-outside algorithm for parsing and do EM
estimation of the probabilities (not discussed in this course)
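A minimal sketch of the maximum likelihood estimate above (my own helper; it assumes trees are the nested tuples used in the earlier sketches):

    from collections import Counter

    def estimate_pcfg(treebank):
        """P(A -> beta) = count(A -> beta) / count(A), counted over all trees."""
        rule_counts, lhs_counts = Counter(), Counter()
        def collect(tree):
            lhs, children = tree[0], tree[1:]
            rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
            for c in children:
                if isinstance(c, tuple):
                    collect(c)
        for tree in treebank:
            collect(tree)
        return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}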
The Penn Treebank
• Syntactically annotated corpus of newspaper text (1989
Wall Street Journal Articles).
• The source text is naturally occurring but the treebank is
not:
• Assumes a specific linguistic theory (although a simple
one).
• Very flat structure (NPs, Ss, VPs).
PTB Example
( (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken))
(, ,)
(ADJP (NML (CD 61) (NNS years))
(JJ old))
(, ,))
(VP (MD will)
(VP (VB join)
(NP (DT the) (NN board))
(PP-CLR (IN as)
(NP (DT a) (JJ nonexecutive) (NN director)))
(NP-TMP (NNP Nov.) (CD 29))))
(. .)))
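One convenient way to read such a bracketed tree and list the rules it uses is NLTK's Tree class (just one option, not something the slides prescribe; the PTB's outer empty bracket is dropped here so that fromstring can parse the string):

    from nltk import Tree

    ptb_string = """
    (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken)) (, ,)
               (ADJP (NML (CD 61) (NNS years)) (JJ old)) (, ,))
       (VP (MD will)
           (VP (VB join) (NP (DT the) (NN board))
               (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
               (NP-TMP (NNP Nov.) (CD 29))))
       (. .))
    """

    t = Tree.fromstring(ptb_string)
    for production in t.productions():   # the CFG rules used in this tree
        print(production)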
PTB Example
Parsing with PCFG
• We want to use the PCFG to answer the following questions:
• What is the total probability of the sentence under the
PCFG?
• What is the most probable parse tree for a sentence
under the PCFG? (decoding/parsing)
• We can modify the CKY algorithm.
Basic idea: Compute these probabilities bottom-up using
dynamic programming.
Computing Probabilities
Bottom-Up
S → NP VP:   .05 × .00126 × 1 = 0.000063
VP → V NP:   1 × .0021 × .6 = .00126
NP → NP PP:  .21 × .05 × .2 = .0021
NP → D N:    1 × .3 × .7 = .21          PP → P NP:   1 × 1 × .05 = .05
NP → she     V → saw     D → the     N → cat     P → with     NP → glasses
 .05          1.0          1.0         .3          1.0          .05
CKY for PCFG Parsing
• Let T_G(A, s) be the set of trees generated by grammar G
  starting at nonterminal A, whose yield is string s.
• Use a chart π so that π[i,j,A] contains the probability of the highest
  probability parse tree for the substring s[i,j] rooted in nonterminal A.
• We want to find π[0,length(s),S] -- the probability of the highest-
  scoring parse tree for s rooted in the start symbol S.
CKY for PCFG Parsing
• To compute π[0,length(s),S] we can use the following recursive
definition:
  Base case:       π[i,i+1,A] = P(A → s[i])   (0 if A → s[i] ∉ R)
  Recursive case:  π[i,j,A] = max over rules A → B C ∈ R and split points k (i < k < j)
                              of P(A → B C) × π[i,k,B] × π[k,j,C]
• Then fill the chart using dynamic programming.
CKY for PCFG Parsing
• Input: PCFG G=(N, Σ, R, S), input string s of length n.
• for i=0…n-1:                                      initialization
      for A ∈ N:  π[i,i+1,A] = P(A → s[i])
• for length=2…n:                                   main loop
      for i=0…(n-length):
          j = i+length
          for k=i+1…j-1:
              for A ∈ N:
                  for each rule A → B C ∈ R:
                      π[i,j,A] = max( π[i,j,A],  P(A → B C) × π[i,k,B] × π[k,j,C] )
Use backpointers to retrieve the highest-scoring parse tree (see previous lecture).
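A compact Python sketch of this algorithm (my own helper names and grammar encoding, not the course's reference code; the grammar is assumed to be in CNF and given as two probability dicts, binary[(A, B, C)] = P(A → B C) and lexical[(A, w)] = P(A → w)):

    from collections import defaultdict

    def pcky(words, binary, lexical, start="S"):
        n = len(words)
        pi = defaultdict(float)        # pi[(i, j, A)] = probability of the best tree
        back = {}                      # backpointers for recovering that tree
        for i, w in enumerate(words):                      # initialization
            for (A, word), p in lexical.items():
                if word == w:
                    pi[(i, i + 1, A)] = p
                    back[(i, i + 1, A)] = w
        for length in range(2, n + 1):                     # main loop
            for i in range(n - length + 1):
                j = i + length
                for k in range(i + 1, j):
                    for (A, B, C), p in binary.items():
                        prob = p * pi[(i, k, B)] * pi[(k, j, C)]
                        if prob > pi[(i, j, A)]:
                            pi[(i, j, A)] = prob
                            back[(i, j, A)] = (k, B, C)
        return pi[(0, n, start)], build_tree(back, 0, n, start)

    def build_tree(back, i, j, A):
        """Follow the backpointers to reconstruct the highest-scoring tree."""
        entry = back.get((i, j, A))
        if entry is None:
            return None                                    # no parse for this cell
        if isinstance(entry, str):                         # terminal
            return (A, entry)
        k, B, C = entry
        return (A, build_tree(back, i, k, B), build_tree(back, k, j, C))

With the toy grammar from the earlier slides encoded this way, pcky returns probability .000126 together with the VP-attachment tree for "she saw the cat with glasses".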
Probability of a Sentence
• What if we are interested in the probability of a sentence,
not of a single parse tree (for example, because we want
to use the PCFG as a language model)?
• Problem: Spurious ambiguity. Need to sum the
probabilities of all parse trees for the sentence.
• How do we have to change CKY to compute this?
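One way to do it, sketched under the same grammar encoding as the CKY sketch above (this anticipates an answer rather than quoting the slide): replace the max in the chart update with a sum, so that π[i,j,A] becomes the total probability of all trees for s[i,j] rooted in A.

    from collections import defaultdict

    def inside_probability(words, binary, lexical, start="S"):
        """CKY with sum instead of max: total probability of all parses of the sentence."""
        n = len(words)
        pi = defaultdict(float)
        for i, w in enumerate(words):
            for (A, word), p in lexical.items():
                if word == w:
                    pi[(i, i + 1, A)] += p
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                j = i + length
                for k in range(i + 1, j):
                    for (A, B, C), p in binary.items():
                        pi[(i, j, A)] += p * pi[(i, k, B)] * pi[(k, j, C)]
        return pi[(0, n, start)]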
Earley Parser
• The CKY parser starts with words and builds parse trees bottom-
up; it requires the grammar to be in CNF.
• The Earley parser instead starts at the start symbol and tries
to “guess” derivations top-down.
• It discards derivations that are incompatible with the
sentence.
• The Earley parser sweeps through the sentence left-to-right
only once. It keeps partial derivations in a table (“chart”).
• Allows arbitrary CFGs, no limitation to CNF.
Parser States
• Earley parser keeps track of partial derivations using parser
states / items.
• States represent hypotheses about constituent structure based
on the grammar, taking into account the input.
• Parser states are represented as dotted rules with spans.
• The constituents to the left of the · have already been seen
in the input string s (corresponding to the span)
S → · NP VP [0,0] “According to the grammar, there may be an NP
starting in position 0. “
NP → D A · N [0,2] "There is a determiner followed by an adjective in s[0,2]“
NP → NP PP · [3,8] "There is a complete NP in s[3,8], consisting of an NP and PP”
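A possible way to represent such states in code (a sketch with hypothetical names; the lecture does not prescribe an implementation):

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass(frozen=True)           # frozen, so states can live in a set (the chart)
    class State:
        lhs: str                      # left-hand side, e.g. "NP"
        rhs: Tuple[str, ...]          # right-hand side, e.g. ("D", "A", "N")
        dot: int                      # position of the dot within the rhs
        start: int                    # left end of the span
        end: int                      # right end of the span

        def next_symbol(self) -> Optional[str]:
            """Symbol immediately after the dot, or None if the state is passive."""
            return self.rhs[self.dot] if self.dot < len(self.rhs) else None

        def is_passive(self) -> bool:
            return self.dot == len(self.rhs)

    # NP → D A · N [0,2] from the slide corresponds to State("NP", ("D", "A", "N"), 2, 0, 2).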
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → · D N [0,0]
  D → · the [0,0]

Three parser operations:
1. Predict new subtrees top-down.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → · D N [0,0]
  D → the · [0,1]

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → · D N [0,0]
  D → the · [0,1]   (passive state)

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → D · N [0,1]
  D → the · [0,1]   (passive state)

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.
3. Complete with passive states.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → D · N [0,1]
  D → the · [0,1]        N → · cat [1,1]
                         N → · tail [1,1]
                         N → · student [1,1]

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.
3. Complete with passive states.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → D · N [0,1]
  D → the · [0,1]        N → · cat [1,1]
                         N → · tail [1,1]
                         N → student · [1,2]

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.
3. Complete with passive states.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → · NP VP [0,0]
  NP → · NP PP [0,0]     NP → D N · [0,2]
  D → the · [0,1]        N → · cat [1,1]
                         N → · tail [1,1]
                         N → student · [1,2]

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.
3. Complete with passive states.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Parser (sketch)
Grammar:  S → NP VP | VP → V NP | VP → VP PP | PP → P NP | NP → D N | NP → NP PP
          V → saw | P → with | D → the | N → cat | N → tail | N → student

Chart so far:
  S → NP · VP [0,2]
  NP → NP · PP [0,2]     NP → D N · [0,2]
  D → the · [0,1]        N → · cat [1,1]
                         N → · tail [1,1]
                         N → student · [1,2]

Three parser operations:
1. Predict new subtrees top-down.
2. Scan input terminals.
3. Complete with passive states.

the student saw the cat with the tail
0 1 2 3 4 5 6 7
Earley Algorithm
• Keep track of parser states in a table (“chart”). Chart[k]
contains a set of all parser states that end in position k.
• Input: Grammar G=(N, Σ, R, S), input string s of length n.
• Initialization: For each production S → α ∈ R,
  add a state S → · α [0,0] to Chart[0].
• for i = 0 to n:
  • for each state in Chart[i]:
    • if state is of form A → α · s[i] β [k,i]:
        scan(state)
    • elif state is of form A → α · B β [k,i]:
        predict(state)
    • elif state is of form A → α · [k,i]:
        complete(state)
Earley Algorithm
• Keep track of parser states in a table (“chart”). Chart[k]
contains a set of all parser states that end in position k.
• Input: Grammar G=(N, Σ, R, S), input string s of length n.
• Initialization: For each production S → α ∈ R,
  add a state S → · α [0,0] to Chart[0].
• for i = 0 to n:
  • for each state in Chart[i]:
    • if state is of form A → α · s[i] β [k,i]:
        scan(state)
    • elif state is of form A → α · B β [k,i]:
        predict(state)
    • elif state is of form A → α · [k,i]:
        complete(state)
    • else: the state is of form A → α · a β [k,i] where a is a terminal
      that is not s[i], in which case we don't do anything.
Earley Algorithm - Scan
• The scan operation can only be applied to a state if the dot is
in front of a terminal symbol that matches the next input
terminal.
• function scan(state): // state is of form A →α ·s[i] β [k,i]
• Add a new state A →α s[i]·β [k,i+1]
to Chart[i+1]
Earley Algorithm - Predict
• The predict operation can only be applied to a state if the dot is
in front of a non-terminal symbol.
• function predict(state): // state is of form A → α · B β [k,i]
  • For each production B → γ ∈ R, add a new state B → · γ [i,i]
    to Chart[i].
• Note that this modifies Chart[i] while the algorithm is looping
through it.
• No duplicate states are added (Chart[i] is a set)
Earley Algorithm - Complete
• The complete operation may only be applied to a passive item.
• function complete(state): // state is of form A → α · [k,j]
  • for each state B → β · A γ [i,k] in Chart[k], add a new state
    B → β A · γ [i,j] to Chart[j].
• Note that this modifies Chart[j] (the chart entry the algorithm is
  currently looping through) while the loop is running.
• Note that it is important to make a copy of the old state
before moving the dot.
• This operation is similar to the combination operation in CKY!
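Putting the three operations together, here is a compact sketch of the whole parser (my own encoding, not the course's reference code: the grammar is a list of (lhs, rhs) rules, ε-rules are not handled, it uses an explicit agenda per chart position, and it is only a recognizer, i.e. no backpointers for tree recovery):

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass(frozen=True)
    class State:
        lhs: str
        rhs: Tuple[str, ...]
        dot: int
        start: int
        end: int
        def next_symbol(self):
            return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def earley_recognize(words, rules, start="S"):
        n = len(words)
        nonterminals = {lhs for lhs, _ in rules}
        chart = [set() for _ in range(n + 1)]
        chart[0] = {State(lhs, rhs, 0, 0, 0) for lhs, rhs in rules if lhs == start}
        for i in range(n + 1):
            agenda = list(chart[i])
            while agenda:
                state = agenda.pop()
                nxt = state.next_symbol()
                if nxt is None:                                   # complete (passive state)
                    for other in list(chart[state.start]):
                        if other.next_symbol() == state.lhs:
                            new = State(other.lhs, other.rhs, other.dot + 1,
                                        other.start, i)
                            if new not in chart[i]:
                                chart[i].add(new)
                                agenda.append(new)
                elif nxt in nonterminals:                         # predict
                    for lhs, rhs in rules:
                        if lhs == nxt:
                            new = State(lhs, rhs, 0, i, i)
                            if new not in chart[i]:
                                chart[i].add(new)
                                agenda.append(new)
                elif i < n and nxt == words[i]:                   # scan
                    chart[i + 1].add(State(state.lhs, state.rhs, state.dot + 1,
                                           state.start, i + 1))
        # The input is grammatical if a passive start-symbol state spans [0,n].
        return any(s.lhs == start and s.next_symbol() is None and s.start == 0
                   for s in chart[n])

    grammar = [("S", ("NP", "VP")), ("VP", ("V", "NP")), ("VP", ("VP", "PP")),
               ("PP", ("P", "NP")), ("NP", ("D", "N")), ("NP", ("NP", "PP")),
               ("D", ("the",)), ("N", ("cat",)), ("N", ("tail",)),
               ("N", ("student",)), ("V", ("saw",)), ("P", ("with",))]
    print(earley_recognize("the student saw the cat with the tail".split(), grammar))  # True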
Earley Algorithm - Runtime
• The runtime depends on the number of items in the chart
(each item is “visited” exactly once).
• We proceed through the input exactly once, which takes
O(N).
• For each position on the chart, there are O(N) possible states per
dotted rule, one for each possible start point of the span.
• Each complete operation can produce O(N) possible new
items (with different starting points).
• Total: O(N3)
Earley Algorithm -
Some Observations
• How do we recover parse trees?
• What happens in case of ambiguity?
• Multiple ways to Complete the same state.
• Keep back-pointers in the parser state objects.
• Or use a separate data structure (CKY-style table or
hashed states)
• How do we make the algorithm work with a PCFG?
• Probabilities are easy to compute during Complete: follow the
  backpointer with the maximum probability.