NLP Unit3 Syntactic Analysis Elaborated

Unit III discusses syntactic analysis, focusing on context-free grammars (CFGs), grammar rules for English, and treebanks. It covers various parsing techniques, including constituency and dependency parsing, as well as ambiguity and dynamic programming methods. The unit concludes by highlighting the importance of probabilistic models and feature structures in enhancing parsing accuracy and flexibility in natural language processing.

Uploaded by Mohana Priya

UNIT III – SYNTACTIC ANALYSIS

1. Context-Free Grammars (CFGs)


CFGs are formal systems for describing the syntax of natural languages. A CFG consists of a
set of production rules that describe how symbols of a language can be combined.
A CFG is defined as a 4-tuple: G = (N, Σ, R, S)
- N = Non-terminal symbols
- Σ = Terminal symbols
- R = Set of production rules of the form A → α, where A ∈ N and α ∈ (N ∪ Σ)*
- S = Start symbol
Example:
S → NP VP
VP → V NP
CFGs are powerful for modeling hierarchical structure and can be parsed using algorithms
such as CYK and the Earley parser.
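The 4-tuple definition above can be made concrete in a few lines of Python. The grammar and sentence below are a toy illustration (not from any corpus): the rule set R is a dict keyed by non-terminals, and a symbol with no entry in R is treated as a terminal in Σ.

```python
# A toy CFG G = (N, Sigma, R, S) encoded as Python data structures.
RULES = {                      # R: each key is a non-terminal in N
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["dog"]],
    "V":   [["saw"]],
}
START = "S"                    # S: start symbol

def derive(symbol):
    """Expand a symbol by always taking its first production (a leftmost derivation)."""
    if symbol not in RULES:    # no rules: a terminal symbol in Sigma
        return [symbol]
    rhs = RULES[symbol][0]
    return [word for sym in rhs for word in derive(sym)]

print(" ".join(derive(START)))  # -> the dog saw the dog
```

Choosing a different production at each step yields different sentences; a parser runs this process in reverse, recovering which rules produced a given string.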

2. Grammar Rules for English


English grammar includes syntactic rules that describe how valid sentences are formed.
Examples:
- Sentence: S → NP VP
- Noun Phrase: NP → Det N | Det Adj N
- Verb Phrase: VP → V NP | V NP PP
Rules are recursive and allow for generating complex sentence structures. CFGs can capture
a large portion of English syntax but have limitations with long-distance dependencies.
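The recursion mentioned above can be demonstrated directly. The sketch below (a made-up toy grammar, not a claim about real English coverage) adds a recursive rule NP → NP PP and enumerates all phrases derivable within a depth bound; raising the bound yields longer and longer noun phrases.

```python
# Recursive NP rules generate unboundedly long phrases; depth-bounded
# expansion makes that concrete. Illustrative toy grammar.
RULES = {
    "NP":  [["Det", "N"], ["NP", "PP"]],   # NP is recursive via the PP rule
    "PP":  [["P", "NP"]],
    "Det": [["the"]],
    "N":   [["dog"], ["cat"]],
    "P":   [["near"]],
}

def expand(symbol, depth):
    """All terminal strings derivable from `symbol` in at most `depth` rule applications."""
    if symbol not in RULES:                # terminal symbol
        return [symbol]
    if depth == 0:
        return []
    results = []
    for rhs in RULES[symbol]:
        seqs = [""]                        # build the cartesian product of expansions
        for sym in rhs:
            subs = expand(sym, depth - 1)
            seqs = [(a + " " + b).strip() for a in seqs for b in subs]
        results.extend(seqs)
    return results

print(expand("NP", 2))   # shallow: just 'the dog', 'the cat'
print(expand("NP", 4))   # deeper: recursion yields 'the dog near the cat', etc.
```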

3. Treebanks
Treebanks are corpora annotated with syntactic parse trees.
Examples:
- Penn Treebank (PTB)
- Universal Dependencies Treebank
Each sentence in a treebank is annotated with a syntactic analysis: phrase-structure trees
in the PTB, dependency relations in Universal Dependencies. Treebanks are used to train
and evaluate parsers and grammar induction systems.
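PTB annotations are stored as bracketed strings. The sketch below reads one such string into nested lists; the sentence is a hand-made example in PTB-like notation, not an actual corpus entry.

```python
# Parse a Penn-Treebank-style bracketed annotation into nested lists.
def parse_tree(s):
    """Parse '(S (NP ...) ...)' into [label, child1, child2, ...]."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0
    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # skip '('
            node = [tokens[pos]]          # constituent label
            pos += 1
            while tokens[pos] != ")":
                node.append(read())
            pos += 1                      # skip ')'
            return node
        word = tokens[pos]                # leaf: a word
        pos += 1
        return word
    return read()

tree = parse_tree("(S (NP (Det the) (N dog)) (VP (V barks)))")
print(tree)
```

Once trees are in this form, it is straightforward to count rule occurrences across a corpus, which is exactly what PCFG training (Section 10) needs.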
4. Normal Forms for Grammar
To facilitate parsing, CFGs are converted to normal forms:
- **Chomsky Normal Form (CNF)**: All rules are in the form A → BC or A → a
- **Greibach Normal Form (GNF)**: A → aα where α ∈ N*
CNF is used in CYK parsing, enabling efficient table-driven parsing techniques.
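One core step of CNF conversion is binarization: any rule with more than two right-hand-side symbols is split into a chain of two-symbol rules using fresh non-terminals. The sketch below shows only this step (full CNF conversion also removes epsilon and unit productions); the fresh-name scheme `VP_1` is an arbitrary choice for illustration.

```python
# Binarize A -> X1 X2 ... Xn (n > 2) into a chain of two-symbol rules.
def binarize(rules):
    out = []
    for lhs, rhs in rules:
        cur = lhs
        for i, sym in enumerate(rhs[:-2]):
            fresh = f"{lhs}_{i + 1}"          # fresh non-terminal, e.g. VP_1
            out.append((cur, [sym, fresh]))
            cur = fresh
        out.append((cur, rhs[-2:]))           # the final two symbols
    return out

print(binarize([("VP", ["V", "NP", "PP"])]))
# -> [('VP', ['V', 'VP_1']), ('VP_1', ['NP', 'PP'])]
```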

5. Dependency Grammar
Dependency grammar models syntax based on head-dependent relationships rather than
phrase structure.
Features:
- Each word (except root) depends on another word (its head)
- Easier to handle free-word-order languages
- Represented using dependency trees
Useful in practical NLP tasks such as information extraction and semantic role labeling.
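A dependency tree is often stored simply as a head array: for each word, the index of its head, with 0 standing for the artificial root. The analysis below is hand-made for illustration.

```python
# A dependency parse as a head array: heads[i] is the index of word i's head.
words = ["<root>", "the", "dog", "chased", "a", "cat"]   # 1-indexed; 0 is the root
heads = [None,     2,     3,     0,        5,   3]       # e.g. 'the' depends on 'dog'

def dependents(h):
    """All words whose head is at index h."""
    return [words[i] for i in range(1, len(words)) if heads[i] == h]

print(dependents(0))   # -> ['chased']  (the root's dependent is the main verb)
print(dependents(3))   # -> ['dog', 'cat']  (subject and object of 'chased')
```

This flat encoding is one reason dependency representations suit free-word-order languages: word positions can change while the head array still records the same relations.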

6. Syntactic Parsing
Parsing is the process of analyzing a sentence to reveal its syntactic structure.
Types:
- **Constituency Parsing**: Uses CFGs to build phrase-structure trees.
- **Dependency Parsing**: Constructs head-dependent relations.
Parsing Strategies:
- Top-down parsing
- Bottom-up parsing
- Chart parsing
Applications include grammar checking, machine translation, and semantic analysis.
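The top-down strategy can be sketched as a recursive-descent recognizer with backtracking: try each production for the goal symbol and track every input position it can reach. The grammar is a toy example; note that a plain top-down parser like this cannot handle left-recursive rules (it would loop forever).

```python
# A minimal top-down (recursive-descent) recognizer with backtracking.
RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["dog"], ["cat"]],
    "V":   [["saw"]],
}

def parse(symbol, tokens, pos):
    """Every position the input can reach after deriving `symbol` starting at `pos`."""
    if symbol not in RULES:                       # terminal: must match the next token
        return [pos + 1] if pos < len(tokens) and tokens[pos] == symbol else []
    ends = []
    for rhs in RULES[symbol]:
        positions = [pos]
        for sym in rhs:
            positions = [q for p in positions for q in parse(sym, tokens, p)]
        ends.extend(positions)
    return ends

def recognize(sentence):
    tokens = sentence.split()
    return len(tokens) in parse("S", tokens, 0)   # S must span the whole input

print(recognize("the dog saw the cat"))   # -> True
```

Chart parsing avoids the repeated work this backtracking does by memoizing partial results, which is the idea behind the dynamic programming methods in Section 8.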

7. Ambiguity in Parsing
Parsing ambiguity arises when a sentence has multiple valid parse trees.
Types:
- Lexical ambiguity (a word has more than one part of speech or sense)
- Syntactic ambiguity (e.g., PP attachment: 'I saw the man with the telescope')
Resolving ambiguity requires semantic and contextual knowledge or probabilistic models.
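The PP-attachment example can be made concrete by counting derivations with a small backtracking parser: each distinct way of reaching the end of the sentence is one parse. The grammar below is a toy sketch built just for this sentence.

```python
# Count the parses of the classic PP-attachment sentence.
RULES = {
    "S":    [["NP", "VP"]],
    "NP":   [["Pron"], ["Det", "N"], ["Det", "N", "PP"]],
    "VP":   [["V", "NP"], ["V", "NP", "PP"]],   # PP can attach to NP or VP
    "PP":   [["P", "NP"]],
    "Pron": [["i"]],
    "Det":  [["the"]],
    "N":    [["man"], ["telescope"]],
    "V":    [["saw"]],
    "P":    [["with"]],
}

def parse(symbol, tokens, pos):
    """End positions reachable by deriving `symbol` from `pos`, with multiplicity."""
    if symbol not in RULES:
        return [pos + 1] if pos < len(tokens) and tokens[pos] == symbol else []
    ends = []
    for rhs in RULES[symbol]:
        positions = [pos]
        for sym in rhs:
            positions = [q for p in positions for q in parse(sym, tokens, p)]
        ends.extend(positions)
    return ends

tokens = "i saw the man with the telescope".split()
n_parses = parse("S", tokens, 0).count(len(tokens))
print(n_parses)   # -> 2: PP attaches to 'man' (NP) or to 'saw' (VP)
```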

8. Dynamic Programming Parsing


Dynamic programming techniques such as the CKY (Cocke–Kasami–Younger) algorithm are
used to parse sentences efficiently with CFGs in Chomsky Normal Form.
- CKY builds a parse table in a bottom-up manner.
- Avoids redundant computations by reusing intermediate results.
Used in both symbolic and probabilistic parsing.
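The CKY table can be sketched directly: cell table[i][j] holds every non-terminal that derives tokens[i:j], filled bottom-up from length-1 spans. The CNF grammar below is a toy illustration.

```python
# A CKY recognizer for a toy grammar in Chomsky Normal Form.
BINARY  = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("VP", ("V", "NP"))]
LEXICAL = [("Det", "the"), ("N", "dog"), ("N", "cat"), ("V", "saw")]

def cky(tokens):
    n = len(tokens)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, word in enumerate(tokens):                 # fill length-1 spans
        table[i][i + 1] = {A for A, w in LEXICAL if w == word}
    for span in range(2, n + 1):                      # then longer spans, bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):                 # every split point
                for A, (B, C) in BINARY:
                    if B in table[i][k] and C in table[k][j]:
                        table[i][j].add(A)
    return "S" in table[0][n]                         # S must span the whole input

print(cky("the dog saw the cat".split()))   # -> True
```

Because each cell is computed once and reused by every larger span that contains it, CKY avoids the re-derivation a backtracking parser performs, giving O(n^3) time in the sentence length.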

9. Shallow Parsing
Also called chunking, shallow parsing identifies syntactic constituents without generating
full parse trees.
- Identifies base NP chunks: 'the big red ball'
- Faster and more robust than full parsing
Applications:
- Information extraction
- Named entity recognition
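Chunking is often implemented as pattern matching over POS-tag sequences rather than full parsing. The sketch below uses a hand-made tag inventory and the base-NP pattern Det? Adj* N; real chunkers use richer tagsets and learned patterns.

```python
# Base NP chunking by regex matching over a POS-tag string.
import re

TAG_CODE = {"Det": "D", "Adj": "A", "N": "N"}   # everything else -> 'O'

def np_chunks(tagged):
    """tagged: list of (word, tag) pairs. Returns base NP chunks as word lists."""
    tags = "".join(TAG_CODE.get(t, "O") for _, t in tagged)
    return [[w for w, _ in tagged[m.start():m.end()]]
            for m in re.finditer(r"D?A*N", tags)]     # Det? Adj* N

sent = [("the", "Det"), ("big", "Adj"), ("red", "Adj"), ("ball", "N"),
        ("bounced", "V"), ("quickly", "Adv")]
print(np_chunks(sent))   # -> [['the', 'big', 'red', 'ball']]
```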

10. Probabilistic Context-Free Grammar (PCFG)


PCFGs enhance CFGs by associating probabilities with production rules.
Each rule A → β has a probability P(β | A), estimated from relative frequencies in a
treebank; the probabilities of all rules sharing a left-hand side sum to 1.
Benefits:
- Resolves ambiguity by preferring more likely parse trees.
- Allows ranking of multiple parses.
Trained using annotated corpora like PTB.

11. Probabilistic CYK Parsing


The probabilistic CYK algorithm applies the CYK method to a PCFG in CNF.
Each cell in the parse table stores:
- Non-terminal
- Probability of that non-terminal spanning the substring
The best parse is the one with the highest probability (the Viterbi parse).
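The table-filling step above differs from plain CKY only in what a cell stores: the best probability per non-terminal instead of a bare set. The toy rule probabilities below are invented for illustration.

```python
# Viterbi CKY with a toy PCFG in CNF: each cell maps a non-terminal to the
# maximum probability of deriving that span.
BINARY  = [("S", ("NP", "VP"), 1.0), ("NP", ("Det", "N"), 1.0), ("VP", ("V", "NP"), 1.0)]
LEXICAL = [("Det", "the", 1.0), ("N", "dog", 0.5), ("N", "cat", 0.5), ("V", "saw", 1.0)]

def viterbi_cky(tokens):
    n = len(tokens)
    best = [[{} for _ in range(n + 1)] for _ in range(n)]
    for i, word in enumerate(tokens):                 # length-1 spans from the lexicon
        for A, w, p in LEXICAL:
            if w == word:
                best[i][i + 1][A] = max(best[i][i + 1].get(A, 0.0), p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for A, (B, C), p in BINARY:
                    if B in best[i][k] and C in best[k][j]:
                        cand = p * best[i][k][B] * best[k][j][C]
                        best[i][j][A] = max(best[i][j].get(A, 0.0), cand)
    return best[0][n].get("S", 0.0)                   # probability of the best S parse

print(viterbi_cky("the dog saw the cat".split()))   # -> 0.25 (the two N rules at 0.5)
```

Keeping a backpointer alongside each max would recover the best tree itself, not just its probability.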

12. Probabilistic Lexicalized CFGs


These CFGs augment production rules with lexical (head word) information.
Example:
VP(head='eat') → V(head='eat') NP(head='pizza')
Benefits:
- Captures subcategorization preferences
- Improves parsing accuracy
Used in statistical parsers such as the Collins parser.
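Lexicalization requires deciding which child of each constituent supplies the head word. The sketch below percolates heads up a tree using a drastically simplified head table (real head-finding rules, as in the Collins parser, are much more detailed).

```python
# Percolate head words up a phrase-structure tree with a toy head table.
HEAD_CHILD = {"S": "VP", "NP": "N", "VP": "V", "PP": "P"}   # simplified assumption

def head_word(tree):
    """tree is [label, children...]; a preterminal is [POS, word]."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]                       # preterminal: the word itself
    for child in children:
        if child[0] == HEAD_CHILD.get(label):
            return head_word(child)              # recurse into the head child
    return head_word(children[-1])               # fallback: rightmost child

tree = ["S", ["NP", ["Det", "the"], ["N", "dog"]],
             ["VP", ["V", "saw"], ["NP", ["Det", "a"], ["N", "cat"]]]]
print(head_word(tree))   # -> 'saw' (S's head comes from VP, VP's from V)
```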

13. Feature Structures


Feature structures are attribute-value pairs used to represent grammatical information.
Example:
[Category: NP, Number: Plural, Gender: Masculine]
Used to encode constraints such as subject-verb agreement.
Common in unification grammars and LFG (Lexical Functional Grammar).
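The bracketed attribute-value matrix above maps directly onto a Python dict, and an agreement constraint becomes an equality check on a shared feature. The feature names and the `agrees` helper are illustrative choices, not a standard API.

```python
# Feature structures as dicts; subject-verb agreement as an equality check.
np   = {"Category": "NP", "Number": "Plural", "Gender": "Masculine"}
verb = {"Category": "V",  "Number": "Plural"}

def agrees(a, b, feature):
    """True when both structures specify `feature` with the same value."""
    return feature in a and feature in b and a[feature] == b[feature]

print(agrees(np, verb, "Number"))   # -> True: both are Plural
```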

14. Unification of Feature Structures


Unification is the process of combining compatible feature structures.
- Compatible: No conflicting values
- Incompatible: Feature clash leads to parsing failure
Used in constraint-based grammars (e.g., HPSG).
Enables fine-grained control over syntactic rules.
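Unification over dict-encoded feature structures can be sketched in a few lines: merge the two structures, recursing into nested values, and fail (return None) on any clash. This is a minimal sketch; full unification grammars also handle shared (reentrant) values, which plain dicts cannot express.

```python
# Minimal unification of feature structures encoded as (possibly nested) dicts.
def unify(fs1, fs2):
    result = dict(fs1)
    for key, val2 in fs2.items():
        if key not in result:
            result[key] = val2                   # new feature: just add it
        elif isinstance(result[key], dict) and isinstance(val2, dict):
            sub = unify(result[key], val2)       # recurse into nested structures
            if sub is None:
                return None
            result[key] = sub
        elif result[key] != val2:
            return None                          # feature clash: unification fails
    return result

print(unify({"Category": "NP", "Number": "Plural"},
            {"Number": "Plural", "Gender": "Masculine"}))
# -> {'Category': 'NP', 'Number': 'Plural', 'Gender': 'Masculine'}
print(unify({"Number": "Plural"}, {"Number": "Singular"}))   # -> None (clash)
```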

Conclusion
Syntactic analysis provides the structural foundation for understanding sentence meaning.
From CFGs to dependency parsing and probabilistic models, these techniques are vital for
building robust NLP systems. Feature structures and unification enable flexible rule
representations for complex grammatical phenomena.
