COS 484: Natural Language Processing
Constituency Parsing
Fall 2019
(Some slides adapted from Chris Manning, Mike Collins)
Overview
• Constituency structure vs dependency structure
• Context-free grammar (CFG)
• Probabilistic context-free grammar (PCFG)
• The CKY algorithm
• Evaluation
• Lexicalized PCFGs
Syntactic structure: constituency and dependency
Two views of linguistic structure
• Constituency
• = phrase structure grammar
• = context-free grammars (CFGs)
• Dependency
Constituency structure
• Phrase structure organizes words into nested constituents
• Starting units: words are given a category: part-of-speech tags
the, cuddly, cat, by, the, door
Det, Adj, N, P, Det, N
• Words combine into phrases with categories
the cuddly cat, by the door
NP → Det Adj N      PP → P NP
• Phrases can combine into bigger phrases recursively
the cuddly cat by the door
NP→ NP PP
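To make the nesting concrete, here is a minimal sketch (a hypothetical nested-tuple encoding, not notation from the slides) of "the cuddly cat by the door" as nested constituents:

```python
# Each constituent is (label, child_1, ..., child_k); pre-terminals are (tag, word).
tree = ("NP",
        ("NP", ("Det", "the"), ("Adj", "cuddly"), ("N", "cat")),
        ("PP", ("P", "by"),
               ("NP", ("Det", "the"), ("N", "door"))))

def yield_of(t):
    """Read the words back off the tree, left to right."""
    if isinstance(t[1], str):                    # pre-terminal: (tag, word)
        return [t[1]]
    return [w for child in t[1:] for w in yield_of(child)]

print(yield_of(tree))  # ['the', 'cuddly', 'cat', 'by', 'the', 'door']
```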
(Dependency structure: this Thursday)
Dependency structure
• Dependency structure shows which words depend on (modify or
are arguments of) which other words.
[Figure: dependency arcs (nsubj, dobj, nmod, case) over "Satellites spot whales from space"; a second analysis with a different attachment of "from space" is marked ❌ as incorrect.]
Why do we need sentence structure?
• We need to understand sentence structure in order to be able to
interpret language correctly
• Humans communicate complex ideas by composing words together
into bigger units
• We need to know what is connected to what
Syntactic parsing
• Syntactic parsing is the task of recognizing a sentence and
assigning a structure to it.
Input: Output:
Boeing is located in Seattle.
Syntactic parsing
• Used as intermediate representation for downstream applications
English word order: subject — verb — object
Japanese word order: subject — object — verb
Image credit: http://vas3k.com/blog/machine_translation/
Syntactic parsing
• Used as intermediate representation for downstream applications
Image credit: (Zhang et al, 2018)
Context-free grammars
• The most widely used formal system for modeling
constituency structure in English and other natural languages
• A context-free grammar G = (N, Σ, R, S) where
• N is a set of non-terminal symbols
• Σ is a set of terminal symbols
• R is a set of rules of the form X → Y1Y2…Yn for n ≥ 1,
X ∈ N, Yi ∈ (N ∪ Σ)
• S ∈ N is a distinguished start symbol
A Context-Free Grammar for English
[Table: grammar rules (left) and lexicon (right)]
S: sentence, VP: verb phrase, NP: noun phrase, PP: prepositional phrase,
DT: determiner, Vi: intransitive verb, Vt: transitive verb, NN: noun, IN: preposition
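As a sketch of the 4-tuple definition G = (N, Σ, R, S), a toy fragment along these lines could be written as plain Python data (the particular rules are illustrative, not the full grammar on the slide):

```python
# A toy CFG G = (N, Sigma, R, S) in the sense of the definition above.
N = {"S", "NP", "VP", "PP", "DT", "NN", "Vi", "Vt", "IN"}
Sigma = {"the", "man", "dog", "telescope", "sleeps", "saw", "with"}
R = [
    ("S",  ("NP", "VP")),
    ("NP", ("DT", "NN")), ("NP", ("NP", "PP")),
    ("VP", ("Vi",)), ("VP", ("Vt", "NP")), ("VP", ("VP", "PP")),
    ("PP", ("IN", "NP")),
    ("DT", ("the",)),
    ("NN", ("man",)), ("NN", ("dog",)), ("NN", ("telescope",)),
    ("Vi", ("sleeps",)), ("Vt", ("saw",)), ("IN", ("with",)),
]
S = "S"
```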
(Left-most) Derivations
• Given a CFG G, a left-most derivation is a sequence of strings
s1, s2, …, sn, where
• s1 = S
• sn ∈ Σ*: all possible strings made up of words from Σ
• Each si for i = 2,…, n is derived from si−1 by picking the left-most
non-terminal X in si−1 and replacing it by some β where X → β ∈ R
• sn: yield of the derivation
(Left-most) Derivations
• s1 = S
• s2 = NP VP
• s3 = DT NN VP
• s4 = the NN VP
• s5 = the man VP
• s6 = the man Vi
• s7 = the man sleeps
A derivation can be represented as a parse tree!
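As a small sketch, the derivation above can be reproduced mechanically by always expanding the left-most non-terminal (the toy grammar below hard-codes one rule per non-terminal, so the choice of β is unambiguous):

```python
# Left-most derivation of "the man sleeps" under a toy grammar.
rules = {
    "S":  ("NP", "VP"),
    "NP": ("DT", "NN"),
    "DT": ("the",),
    "NN": ("man",),
    "VP": ("Vi",),
    "Vi": ("sleeps",),
}
nonterminals = set(rules)

s = ["S"]
print(" ".join(s))
while any(sym in nonterminals for sym in s):
    i = next(i for i, sym in enumerate(s) if sym in nonterminals)  # left-most non-terminal
    s = s[:i] + list(rules[s[i]]) + s[i + 1:]                      # replace it with the rule's RHS
    print(" ".join(s))
# Prints: S, NP VP, DT NN VP, the NN VP, the man VP, the man Vi, the man sleeps
```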
• A string s ∈ Σ* is in the language defined by the CFG if
there is at least one derivation whose yield is s
• The set of possible derivations may be finite or infinite
Ambiguity
• Some strings may have more than one derivation (i.e., more
than one parse tree!).
“Classical” NLP Parsing
• In fact, sentences can have a very large number of possible parses
The board approved [its acquisition] [by Royal Trustco Ltd.] [of
Toronto] [for $27 a share] [at its monthly meeting].
((ab)c)d   (a(bc))d   (ab)(cd)   a((bc)d)   a(b(cd))
The number of binary bracketings grows as the Catalan number: Cn = (1/(n+1)) · (2n choose n)
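A quick sanity check of that growth (a sketch using Python's exact integer binomial):

```python
from math import comb

def catalan(n):
    # C_n = (1 / (n + 1)) * C(2n, n): the number of binary bracketings of n + 1 items
    return comb(2 * n, n) // (n + 1)

print([catalan(n) for n in range(1, 11)])
# [1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796]
```

For the a b c d example above (four items), C3 = 5 matches the five bracketings listed.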
• It is also difficult to construct a grammar with enough coverage
• A less constrained grammar can parse more sentences, but
results in more parses for even simple sentences
• There is no way to choose the right parse!
Statistical parsing
• Learning from data: treebanks
• Adding probabilities to the rules: probabilistic CFGs (PCFGs)
Treebanks: a collection of sentences paired with their parse trees
The Penn Treebank Project (Marcus et al, 1993)
Treebanks
• Standard setup (WSJ portion of Penn Treebank):
• 40,000 sentences for training
• 1,700 for development
• 2,400 for testing
• Why build a treebank instead of a grammar?
• Broad coverage
• Frequencies and distributional information
• A way to evaluate systems
Probabilistic context-free grammars (PCFGs)
• A probabilistic context-free grammar (PCFG) consists of:
• A context-free grammar: G = (N, Σ, R, S)
• For each rule α → β ∈ R, there is a parameter q(α → β) ≥ 0. For any X ∈ N,
∑_{α → β ∈ R : α = X} q(α → β) = 1
Probabilistic context-free grammars (PCFGs)
For any derivation (parse tree) containing the rules α1 → β1, α2 → β2, …, αl → βl, the probability of the parse is:
P(t) = ∏_{i=1}^{l} q(αi → βi)
P(t) = q(S → NP VP) × q(NP → DT NN) × q(DT → the)
× q(NN → man) × q(VP → Vi) × q(Vi → sleeps)
= 1.0 × 0.3 × 1.0 × 0.7 × 0.4 × 1.0 = 0.084
Why do we want ∑_{α → β : α = X} q(α → β) = 1?
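As a sketch, the parse probability above is just the product of the rule parameters read off the tree (values taken from the example PCFG):

```python
from math import prod

# q(alpha -> beta) values from the example
q = {
    ("S",  ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.3,
    ("DT", ("the",)):     1.0,
    ("NN", ("man",)):     0.7,
    ("VP", ("Vi",)):      0.4,
    ("Vi", ("sleeps",)):  1.0,
}

# The rules used in the parse tree of "the man sleeps" (each appears once here)
rules_in_tree = list(q.keys())

p_t = prod(q[r] for r in rules_in_tree)
print(round(p_t, 3))   # 0.084
```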
Deriving a PCFG from a treebank
• Training data: a set of parse trees t1, t2, …, tm
• A PCFG (N, Σ, S, R, q):
• N is the set of all non-terminals seen in the trees
• Σ is the set of all words seen in the trees
• S is taken to be S.
• R is taken to be the set of all rules α → β seen in the trees
• The maximum-likelihood parameter estimates are:
qML(α → β) = Count(α → β) / Count(α)
If we have seen the rule VP → Vt NP 105 times, and the non-terminal
VP 1000 times, then qML(VP → Vt NP) = 105/1000 = 0.105
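A minimal sketch of the maximum-likelihood estimate: count every rule occurrence in the treebank trees and divide by the count of its left-hand-side non-terminal (the counts below are the VP example from the slide):

```python
from collections import Counter

rule_count = Counter()   # Count(alpha -> beta)
lhs_count = Counter()    # Count(alpha)

# Pretend these counts were read off the treebank:
rule_count[("VP", ("Vt", "NP"))] = 105
lhs_count["VP"] = 1000

def q_ml(lhs, rhs):
    return rule_count[(lhs, rhs)] / lhs_count[lhs]

print(q_ml("VP", ("Vt", "NP")))   # 0.105
```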
Parsing with PCFGs
• Given a sentence s and a PCFG, how to find the highest scoring
parse tree for s?
arg max_{t ∈ 𝒯(s)} P(t)
• The CKY algorithm: applies to a PCFG in Chomsky normal
form (CNF)
• Chomsky Normal Form (CNF): all the rules take one
of the two following forms:
• X → Y1Y2 where X ∈ N, Y1 ∈ N, Y2 ∈ N
• X → Y where X ∈ N, Y ∈ Σ
• It is possible to convert any PCFG into an equivalent grammar in CNF!
• However, the trees will look different; it is possible to do a “reverse
transformation” back afterwards
Converting PCFGs into a CNF grammar
• n-ary rules (n > 2): NP → DT NNP VBG NN
• Unary rules: VP → Vi, Vi → sleeps
• Eliminate all the unary rules recursively by adding VP → sleeps
• We will come back to this later!
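A sketch of one common way to binarize an n-ary rule by introducing intermediate symbols (the @-style naming is just one convention; it is what makes the “reverse transformation” back to the original trees possible):

```python
def binarize(lhs, rhs):
    """Turn X -> Y1 Y2 ... Yn (n > 2) into an equivalent chain of binary rules."""
    rules, current = [], lhs
    for i in range(len(rhs) - 2):
        new_sym = "@%s_%s" % (lhs, "_".join(rhs[:i + 1]))   # intermediate symbol
        rules.append((current, (rhs[i], new_sym)))
        current = new_sym
    rules.append((current, tuple(rhs[-2:])))
    return rules

print(binarize("NP", ["DT", "NNP", "VBG", "NN"]))
# [('NP', ('DT', '@NP_DT')), ('@NP_DT', ('NNP', '@NP_DT_NNP')), ('@NP_DT_NNP', ('VBG', 'NN'))]
```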
The CKY algorithm
• Dynamic programming
• Given a sentence x1, x2, …, xn, denote π(i, j, X) as the highest score
for any parse tree that dominates words xi, …, xj and has non-terminal X ∈ N as its root.
• Output: π(1,n, S)
• Initially, for i = 1, 2, …, n:
π(i, i, X) = q(X → xi) if X → xi ∈ R, and 0 otherwise
The CKY algorithm
• For all (i, j) such that 1 ≤ i < j ≤ n and all X ∈ N:
π(i, j, X) = max_{X → Y Z ∈ R, i ≤ k < j} q(X → Y Z) × π(i, k, Y) × π(k + 1, j, Z)
We also store backpointers, which allow us to recover the parse tree.
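Putting the initialization and the recursion together, here is a compact sketch of CKY with backpointers (assuming a CNF grammar given as dictionaries; all names are illustrative):

```python
def cky(words, binary_rules, lexical_rules):
    """
    words: tokens x_1 .. x_n
    binary_rules: dict mapping (Y, Z) -> list of (X, q) for rules X -> Y Z
    lexical_rules: dict mapping word -> list of (X, q) for rules X -> word
    Returns the pi table and backpointers, keyed by (i, j, X) with 1-based spans.
    """
    n = len(words)
    pi, bp = {}, {}

    # Initialization: pi(i, i, X) = q(X -> x_i) if the rule exists, 0 otherwise
    for i in range(1, n + 1):
        for X, q in lexical_rules.get(words[i - 1], []):
            pi[(i, i, X)] = q
            bp[(i, i, X)] = words[i - 1]

    # Recursion over spans of increasing length
    for length in range(1, n):
        for i in range(1, n - length + 1):
            j = i + length
            for k in range(i, j):                                   # split point
                for (Y, Z), parents in binary_rules.items():
                    left = pi.get((i, k, Y), 0.0)
                    right = pi.get((k + 1, j, Z), 0.0)
                    if left == 0.0 or right == 0.0:
                        continue
                    for X, q in parents:
                        score = q * left * right
                        if score > pi.get((i, j, X), 0.0):
                            pi[(i, j, X)] = score
                            bp[(i, j, X)] = (k, Y, Z)
    return pi, bp
```

The best parse is then read off by following the backpointers down from (1, n, S); each span considers every split point and every rule, which is where the O(n³ · |R|) running time on the next slide comes from.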
The CKY algorithm
Running time?
O(n³ · |R|)
CKY with unary rules
• In practice, we also allow unary rules X → Y where X, Y ∈ N,
so that conversion to/from the normal form is easier
How does this change CKY?
π(i, j, X) = max_{X → Y ∈ R} q(X → Y) × π(i, j, Y)
• Compute unary closure: if there is a rule chain
X → Y1, Y1 → Y2, …, Yk → Y, add
q(X → Y ) = q(X → Y1) × ⋯ × q(Yk → Y )
• Apply the unary rules once in each cell, after the binary rules (see the sketch below)
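A sketch of what that unary pass could look like for one CKY cell, using the closed q(X → Y) values (names are illustrative):

```python
def apply_unaries(cell, unary_closure):
    """
    cell: dict Y -> pi(i, j, Y) for a single span (i, j), already filled by binary rules
    unary_closure: dict (X, Y) -> closed probability of the best chain X -> ... -> Y
    Updates the cell in place with the best unary rewrites.
    """
    for (X, Y), q in unary_closure.items():
        if Y in cell and q * cell[Y] > cell.get(X, 0.0):
            cell[X] = q * cell[Y]
```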
Evaluating constituency parsing
Evaluating constituency parsing
• Recall: (# correct constituents in candidate) / (# constituents in
gold tree)
• Precision: (# correct constituents in candidate) / (# constituents in
candidate)
• Labeled precision/recall require getting the non-terminal label
correct
• F1 = (2 * precision * recall) / (precision + recall)
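Treating each parse as a set of labeled spans (label, start, end), these metrics are a few lines of code (a sketch of the standard comparison; real evaluation scripts handle details such as punctuation and unary chains):

```python
def labeled_prf(candidate, gold):
    """candidate, gold: sets of labeled constituents (label, start, end)."""
    correct = len(candidate & gold)
    precision = correct / len(candidate)
    recall = correct / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# With 3 correct constituents, 7 in the candidate and 8 in the gold tree,
# this gives P = 3/7, R = 3/8, F1 = 0.4 -- the example on the next slide.
```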
Evaluating constituency parsing
• Precision: 3/7 = 42.9%
• Recall: 3/8 = 37.5%
• F1 = 40.0%
• Tagging accuracy: 100%
Weaknesses of PCFGs
• Lack of sensitivity to lexical information (words)
The only difference between these two parses:
q(VP → VP PP) vs q(NP → NP PP)
… without looking at the words!
Weaknesses of PCFGs
• Lack of sensitivity to lexical information (words)
Exactly the same set of context-free rules!
Lexicalized PCFGs
• Key idea: add headwords to trees
• Each context-free rule has one special child that is the
head of the rule (a core idea in syntax)
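A sketch of the idea: a small (hypothetical, highly simplified) table says which child heads each rule, and the head word then percolates up the tree; real head-finding rules, such as those used by Collins, are much richer.

```python
# Which child of each rule is the head (index into the RHS) -- toy head rules.
head_child = {
    ("S",  ("NP", "VP")): 1,   # the VP heads the sentence
    ("VP", ("Vt", "NP")): 0,   # the verb heads the VP
    ("NP", ("DT", "NN")): 1,   # the noun heads the NP
}

def lexicalize(tree):
    """tree: (tag, word) at pre-terminals, (label, child, ...) elsewhere.
    Returns the tree with a head word attached to every internal node."""
    if isinstance(tree[1], str):                       # pre-terminal
        return tree
    children = [lexicalize(c) for c in tree[1:]]
    rhs = tuple(c[0] for c in children)
    head_word = children[head_child[(tree[0], rhs)]][1]
    return (tree[0], head_word, *children)

t = ("S", ("NP", ("DT", "the"), ("NN", "man")),
          ("VP", ("Vt", "saw"), ("NP", ("DT", "the"), ("NN", "dog"))))
print(lexicalize(t))
# ('S', 'saw', ('NP', 'man', ...), ('VP', 'saw', ...))
```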
Lexicalized PCFGs
• Further reading: Michael Collins. 2003. Head-Driven
Statistical Models for Natural Language Parsing.
• Results for a PCFG: 70.6% recall, 74.8% precision
• Results for a lexicalized PCFG: 88.1% recall, 88.3% precision
http://nlpprogress.com/english/constituency_parsing.html