Lecture 04

The document provides an introduction to parsing, focusing on the role of parsers in distinguishing valid sequences of tokens in programming languages. It discusses context-free grammars (CFGs), their structure, and the concept of ambiguity in grammars, including examples of ambiguous expressions and methods to resolve such ambiguities. Additionally, it highlights the importance of operator precedence and associativity in defining unambiguous grammars.

Uploaded by

nihafahima9


Introduction to Parsing

Ambiguity and Removing Ambiguity

Outline

• Regular languages revisited

• Parser overview

• Context-free grammars (CFG’s)

• Derivations

• Ambiguity

Compiler Design 1 (2011) 2

Languages and Automata

• Formal languages are very important in CS

– Especially in programming languages

• Regular languages
– The weakest formal languages widely used
– Many applications

• We will also study context-free languages


Limitations of Regular Languages

Intuition: A finite automaton that runs long enough must repeat states
• A finite automaton cannot remember the number of times it has visited a particular state
• This is because a finite automaton has finite memory
– Only enough to store which state it is in
– Cannot count, except up to a finite limit
• Many languages are not regular
• E.g., the language of balanced parentheses is not regular: { (^i )^i | i ≥ 0 }
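To make the counting argument concrete, here is a minimal sketch of a recognizer for { (^i )^i | i ≥ 0 }. The function name is hypothetical. The point is that it relies on an integer counter that can grow without bound, which is exactly the resource a finite automaton lacks.

```python
def is_nested_parens(s: str) -> bool:
    """Return True if s is '(' repeated i times followed by ')' repeated i times."""
    depth = 0            # unbounded counter: not available to a finite automaton
    seen_close = False
    for ch in s:
        if ch == "(":
            if seen_close:       # an '(' after a ')' breaks the (^i )^i shape
                return False
            depth += 1
        elif ch == ")":
            seen_close = True
            depth -= 1
            if depth < 0:        # more ')' than '(' so far
                return False
        else:
            return False         # any other character is not in the language
    return depth == 0


print(is_nested_parens("((()))"))   # True
print(is_nested_parens("(()"))      # False
```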

The Functionality of the Parser

• Input: sequence of tokens from lexer

• Output: parse tree of the program


Example

• If-then-else statement
    if (x == y) then z = 1; else z = 2;
• Parser input
    IF (ID == ID) THEN ID = INT; ELSE ID = INT;
• Possible parser output

            IF-THEN-ELSE
           /     |      \
         ==      =       =
        /  \    / \     / \
      ID   ID  ID INT  ID INT

Comparison with Lexical Analysis

Phase     Input                     Output
Lexer     Sequence of characters    Sequence of tokens
Parser    Sequence of tokens        Parse tree
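The lexer row of the table can be illustrated with a small sketch that turns the earlier if-then-else statement into the token sequence the parser receives. The token class names (EQ, ASSIGN, LPAREN, and so on) and the regular expressions are assumptions made for this example, not part of the lecture.

```python
import re

# Hypothetical token classes. Order matters: keywords are tried before the
# general identifier pattern, and "==" before "=".
TOKEN_SPEC = [
    ("IF", r"if\b"),
    ("THEN", r"then\b"),
    ("ELSE", r"else\b"),
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("EQ", r"=="),
    ("ASSIGN", r"="),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Sequence of characters in, sequence of token names out."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":        # whitespace produces no token
            tokens.append(m.lastgroup)
        pos = m.end()
    return tokens


print(tokenize("if (x == y) then z = 1; else z = 2;"))
```

The parser never sees the characters `x` or `1`, only the token classes ID and INT, which is why the grammar's terminals are tokens.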

The Role of the Parser

• Not all sequences of tokens are programs . . .

• . . . Parser must distinguish between valid and
invalid sequences of tokens

• We need
– A language for describing valid sequences of tokens
– A method for distinguishing valid from invalid
sequences of tokens


Context-Free Grammars

• Many programming language constructs have a recursive structure

• A STMT is of the form
    if COND then STMT else STMT, or
    while COND do STMT, or
    …
• Context-free grammars are a natural notation for this recursive structure

CFGs (Cont.)

• A CFG consists of
– A set of terminals T
– A set of non-terminals N
– A start symbol S (a non-terminal)
– A set of productions

Assuming X ∈ N, the productions are of the form
    X → ε, or
    X → Y1 Y2 ... Yn where each Yi ∈ N ∪ T
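One way to hold the four components of a CFG in a program is sketched below. The class name `Grammar`, its field names, and the `validate` method are illustrative assumptions. An ε-production is encoded here as an empty right-hand side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # T
    nonterminals: frozenset   # N
    start: str                # S, must be a member of N
    productions: tuple        # pairs (X, (Y1, ..., Yn)); () encodes X -> ε

    def validate(self):
        """Raise ValueError if the grammar violates the definition above."""
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        if self.terminals & self.nonterminals:
            raise ValueError("T and N must be disjoint")
        symbols = self.terminals | self.nonterminals
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not in N")
            for y in rhs:
                if y not in symbols:
                    raise ValueError(f"symbol {y!r} is not in N ∪ T")


# S -> ( S ) | ε
PARENS = Grammar(
    terminals=frozenset({"(", ")"}),
    nonterminals=frozenset({"S"}),
    start="S",
    productions=(("S", ("(", "S", ")")), ("S", ())),
)
PARENS.validate()   # passes silently
```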

Notational Conventions

• In these lecture notes

– Non-terminals are written upper-case
– Terminals are written lower-case
– The start symbol is the left-hand side of the
first production


Examples of CFGs

A fragment of our example language (simplified):

STMT → if COND then STMT else STMT
     | while COND do STMT
     | id = int

Examples of CFGs (cont.)

Grammar for simple arithmetic expressions:

E → E * E
  | E + E
  | ( E )
  | id

The Language of a CFG

Read productions as replacement rules:

X → Y1 ... Yn
    means X can be replaced by Y1 ... Yn
X → ε
    means X can be erased (replaced with the empty string)

Key Idea

(1) Begin with a string consisting of the start symbol “S”
(2) Replace any non-terminal X in the string by the right-hand side of some production
        X → Y1 ... Yn
(3) Repeat (2) until there are no non-terminals in the string
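The three steps can be sketched in a few lines, using the arithmetic expression grammar from the earlier slide. Step (2) allows any non-terminal to be replaced. This sketch always picks the left-most one, which is an assumption made to keep the result deterministic. The names `PRODUCTIONS` and `derive` are hypothetical.

```python
# E -> E + E | E * E | ( E ) | id, indexed 0..3
PRODUCTIONS = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["(", "E", ")"], ["id"]],
}


def derive(start, choices):
    """Start from `start`, then for each entry of `choices` replace the
    left-most non-terminal with the production of that index."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # position of the left-most non-terminal in the current string
        i = next((k for k, sym in enumerate(form) if sym in PRODUCTIONS), None)
        if i is None:
            raise ValueError("no non-terminal left to replace")
        form = form[:i] + PRODUCTIONS[form[i]][choice] + form[i + 1:]
        steps.append(" ".join(form))
    return steps


for line in derive("E", [0, 1, 3, 3, 3]):
    print(line)
# E
# E + E
# E * E + E
# id * E + E
# id * id + E
# id * id + id
```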

The Language of a CFG (Cont.)

More formally, we write

X1 ... Xi ... Xn → X1 ... Xi−1 Y1 ... Ym Xi+1 ... Xn

if there is a production

Xi → Y1 ... Ym

The Language of a CFG (Cont.)

Write
    X1 ... Xn →* Y1 ... Ym
if
    X1 ... Xn → ... → ... → Y1 ... Ym
in 0 or more steps

The Language of a CFG

Let G be a context-free grammar with start symbol S. Then the language of G is:

    { a1 ... an | S →* a1 ... an and every ai is a terminal }
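The definition can be explored by brute force: search all derivations from S and collect the strings that contain only terminals. The sketch below is an illustration, not a parsing algorithm. It assumes that every non-ε production adds at least one terminal, so that pruning on the number of terminals guarantees termination. The function name and the dictionary encoding of productions are hypothetical.

```python
from collections import deque


def language_up_to(productions, start, max_len):
    """Return the words of L(G) with at most max_len terminals.

    Only left-most derivations are explored; that loses nothing, because
    every word of L(G) has a left-most derivation.
    """
    seen = {(start,)}
    queue = deque([(start,)])
    words = set()
    while queue:
        form = queue.popleft()
        idx = next((k for k, s in enumerate(form) if s in productions), None)
        if idx is None:                       # only terminals left: a word of L(G)
            words.add("".join(form))
            continue
        for rhs in productions[form[idx]]:    # expand the left-most non-terminal
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            n_terminals = sum(1 for s in new if s not in productions)
            if n_terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))


PARENS = {"S": [("(", "S", ")"), ()]}        # S -> ( S ) | ε
print(language_up_to(PARENS, "S", 4))        # ['', '()', '(())']
```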

Terminals

• Terminals are so called because there are no rules for replacing them

• Once generated, terminals are permanent

• Terminals ought to be tokens of the language

Examples

L(G) is the language of the CFG G

Strings of balanced parentheses: { (^i )^i | i ≥ 0 }

Two grammars:

    S → ( S )            S → ( S )
    S → ε        OR        | ε

Example

A fragment of our example language (simplified):

STMT → if COND then STMT
     | if COND then STMT else STMT
     | while COND do STMT
     | id = int
COND → (id == id)
     | (id != id)

Example (Cont.)

Some elements of our language

id = int
if (id == id) then id = int else id = int
while (id != id) do id = int
while (id == id) do while (id != id) do id = int
if (id != id) then if (id == id) then id = int else id = int


Arithmetic Example

Simple arithmetic expressions:

E → E + E | E * E | ( E ) | id

Some elements of the language:

    id             id + id
    ( id )         id * id
    ( id ) * id    id * ( id )

Notes

The idea of a CFG is a big step.

But:

• Membership in a language is just “yes” or “no”; we also need the parse tree of the input

• Must handle errors gracefully

• Need an implementation of CFG’s (e.g., yacc)

More Notes

• Form of the grammar is important

– Many grammars generate the same language
– Parsing tools are sensitive to the grammar

Note: Tools for regular languages (e.g., lex/ML-Lex) are also sensitive to the form of the regular expression, but this is rarely a problem in practice

Derivations and Parse Trees

A derivation is a sequence of productions
    S → ... → ... → ...
A derivation can be drawn as a tree
– Start symbol is the tree’s root
– For a production X → Y1 ... Yn, add children Y1 ... Yn to node X

Derivation Example

• Grammar

E → E+E | E *E | (E) | id
• String

id * id + id


Derivation Example (Cont.)

  E
→ E + E
→ E * E + E
→ id * E + E
→ id * id + E
→ id * id + id

Parse tree:

          E
        / | \
       E  +  E
     / | \   |
    E  *  E  id
    |     |
    id    id

Notes on Derivations

• A parse tree has
– Terminals at the leaves
– Non-terminals at the interior nodes

• An in-order traversal of the leaves is the original input

• The parse tree shows the association of operations; the input string does not
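The claim about leaf order can be checked on the parse tree of id * id + id. The nested-tuple encoding below is an assumption chosen for brevity: each interior node is a tuple whose first element is the non-terminal, and each leaf is a plain string.

```python
# (non_terminal, child, child, ...); a leaf is a string holding a terminal.
TREE = ("E",
        ("E", ("E", "id"), "*", ("E", "id")),
        "+",
        ("E", "id"))


def leaves(node):
    """Collect the leaves from left to right."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:       # node[0] is the non-terminal label
        result.extend(leaves(child))
    return result


print(" ".join(leaves(TREE)))    # id * id + id
```

The string alone does not say whether * or + is applied first; the nesting of the tuples does.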

Leftmost and Rightmost Derivations

• The example is a left-most derivation
– At each step, replace the left-most non-terminal

• There is an equivalent notion of a right-most derivation

  E
→ E + E
→ E + id
→ E * E + id
→ E * id + id
→ id * id + id

Derivations and Parse Trees

• Note that right-most and left-most derivations have the same parse tree

• The difference is just in the order in which branches are added

Summary of Derivations

• We are not just interested in whether s ∈ L(G)
– We need a parse tree for s

• A derivation defines a parse tree
– But one parse tree may have many derivations

• Left-most and right-most derivations are important in parser implementation

Ambiguity

• What is an ambiguous grammar?

• A CFG is ambiguous if there exists more than one derivation tree (parse tree) for some input string.
• Equivalently, some string has more than one left-most derivation, or more than one right-most derivation. A string having one left-most and one right-most derivation for the same tree is not ambiguity.
• This creates uncertainty about how to parse certain strings, leading to multiple interpretations.

• Grammar
    E → E + E | E * E | ( E ) | int

• String
    int * int + int

Ambiguity (Cont.)

This string has two parse trees

          E                      E
        / | \                  / | \
       E  +  E                E  *  E
     / | \   |                |   / | \
    E  *  E  int             int E  +  E
    |     |                      |     |
   int   int                    int   int
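The number of parse trees can be counted mechanically. The sketch below is specialised to the grammar E → E + E | E * E | ( E ) | int and tries every operator position as the root of the tree. The function name is hypothetical and the approach is a simple memoised recursion, not a general parsing algorithm.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under E -> E + E | E * E | ( E ) | int."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):                 # trees deriving tokens[i:j] from E
        if j <= i:
            return 0                 # E cannot derive the empty string
        if j - i == 1:
            return 1 if tokens[i] == "int" else 0
        total = 0
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)               # E -> ( E )
        for k in range(i + 1, j - 1):                  # E -> E op E, op at k
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("int * int + int".split()))    # 2
```

A count greater than 1 for any string shows that the grammar is ambiguous.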

Ambiguity (Cont.)

• A grammar is ambiguous if it has more than one parse tree for some string
– Equivalently, there is more than one right-most or left-most derivation for some string
• Ambiguity is bad
– Leaves meaning of some programs ill-defined
• Ambiguity is common in programming languages
– Arithmetic expressions
– IF-THEN-ELSE

S -> aSbS | bSaS | ε

[Figure: derivation trees for this grammar, ending in ε (empty) leaves]

Grammar:
E -> E + E
E -> E * E
E -> id

Input string: id + id * id

The leftmost derivation can be done in two ways:

    First derivation        Second derivation
    1. E -> E + E           1. E -> E * E
    2. id + E               2. E + E * E
    3. id + E * E           3. id + E * E
    4. id + id * E          4. id + id * E
    5. id + id * id         5. id + id * id

For the given input string, we get two leftmost derivations and therefore two parse trees. We need to eliminate the ambiguity in the grammar.

Dealing with Ambiguity

There are several ways to handle ambiguity

Modifying Grammar Rules:

Change the production rules to ensure a unique parse tree for each valid string.
    E → T + E | T
    T → int * T | int | ( E )
Enforces precedence of * over +

Operator Precedence and Associativity:

Define the precedence and associativity of operators explicitly.

Modifying Grammar

E → E + E | E * E | ( E ) | id
This grammar is ambiguous because the expression id + id * id can have multiple parse trees, leading to different interpretations: (id + id) * id versus id + (id * id).

An equivalent unambiguous grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id

In this grammar:
• + has lower precedence than *.
• + is left-associative.
• * is left-associative.

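A hand-written parser for the grammar E → E + T | T, T → T * F | F, F → ( E ) | id makes the precedence and associativity visible in the tree it builds. This is a sketch under one stated assumption: the left-recursive rules are implemented as loops, because a recursive-descent parser cannot call itself on a left-recursive rule directly. The loops still build left-leaning trees, so the associativity is preserved. All function names are hypothetical.

```python
def parse(tokens):
    """Return a tree of nested tuples ('op', left, right); leaves are 'id'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {peek()!r}")
        pos += 1

    def factor():                        # F -> ( E ) | id
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("id")
        return "id"

    def term():                          # T -> T * F | F
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor()) # grows to the left: left-associative
        return node

    def expr():                          # E -> E + T | T
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("id + id * id".split()))     # ('+', 'id', ('*', 'id', 'id'))
print(parse("id + id + id".split()))     # ('+', ('+', 'id', 'id'), 'id')
```

Because * is only reachable through T, it always ends up deeper in the tree than +, which is how the grammar encodes its higher precedence.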
Ambiguity: The Dangling Else
• Consider the following grammar

S → if C then S
  | if C then S else S
  | OTHER

• This grammar is also ambiguous

The Dangling Else: Example

• The expression
    if C1 then if C2 then S3 else S4
has two parse trees

          if                       if
        / |  \                    /  \
      C1  if  S4                C1    if
         /  \                       / | \
       C2    S3                   C2  S3  S4

• Typically we want the second form

The Dangling Else: A Fix

• else matches the closest unmatched then

• We can describe this in the grammar

S   → MIF                        /* all then are matched */
    | UIF                        /* some then are unmatched */
MIF → if C then MIF else MIF
    | OTHER
UIF → if C then S
    | if C then MIF else UIF

• Describes the same set of strings
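The rule "else matches the closest unmatched then" can also be obtained procedurally. The sketch below is not a direct implementation of the MIF/UIF grammar. It is a greedy recursive-descent parser over the original grammar S → if C then S | if C then S else S | OTHER, under the assumption that conditions and OTHER statements are single tokens. Consuming an else as soon as one is available gives the same attachment.

```python
def parse_stmt(tokens, pos=0):
    """Return (tree, next_pos); an if-node is ('if', cond, then, else_or_None)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        return tokens[pos], pos + 1                  # OTHER
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
        raise SyntaxError("expected: if C then ...")
    cond = tokens[pos + 1]
    then_branch, pos = parse_stmt(tokens, pos + 3)
    else_branch = None
    if pos < len(tokens) and tokens[pos] == "else":
        # Greedy choice: the innermost open `if` takes the else.
        else_branch, pos = parse_stmt(tokens, pos + 1)
    return ("if", cond, then_branch, else_branch), pos


tree, _ = parse_stmt("if C1 then if C2 then S3 else S4".split())
print(tree)   # ('if', 'C1', ('if', 'C2', 'S3', 'S4'), None)
```

The else S4 ends up on the inner if, and the outer if has no else branch.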

The Dangling Else: Example Revisited

• The expression if C1 then if C2 then S3 else S4

        if                           if
       /  \                        / |  \
     C1    if                    C1  if  S4
         / | \                      /  \
       C2  S3  S4                 C2    S3

• A valid parse tree             • Not valid because the
  (for a UIF)                      then expression is not a MIF

Ambiguity

• No general techniques for handling ambiguity

• Impossible to automatically convert an ambiguous grammar to an unambiguous one

• Used with care, ambiguity can simplify the grammar
– Sometimes allows more natural definitions
– We need disambiguation mechanisms

Precedence and Associativity Declarations

• Instead of rewriting the grammar
– Use the more natural (ambiguous) grammar
– Along with disambiguating declarations

• Most tools allow precedence and associativity declarations to disambiguate grammars

• Examples …

Associativity Declarations

• Consider the grammar E → E + E | int

• Ambiguous: two parse trees of int + int + int

          E                      E
        / | \                  / | \
       E  +  E                E  +  E
     / | \   |                |   / | \
    E  +  E  int             int E  +  E
    |     |                      |     |
   int   int                    int   int

• Left associativity declaration: %left +

Precedence Declarations

• Consider the grammar E → E + E | E * E | int

– And the string int + int * int

          E                      E
        / | \                  / | \
       E  *  E                E  +  E
     / | \   |                |   / | \
    E  +  E  int             int E  *  E
    |     |                      |     |
   int   int                    int   int

• Precedence declarations: %left +
                           %left *

Grammar
1. X -> X - X
2. X -> var | const
Here var can be any variable, and const can be any constant value. The string a - b - c has two leftmost derivations:

    First derivation          Second derivation
    1. X -> X - X             1. X -> X - X
    2. X - X - X              2. var - X - X
    3. var - var - var        3. a - var - var
    4. a - b - c              4. a - b - c

For example, if we take the values a = 2, b = 3 and c = 4:
a - b - c = 2 - 3 - 4 = -5
In the first derivation tree, according to the order of substitution, the expression is evaluated as:
(a - b) - c = (2 - 3) - 4 = -1 - 4 = -5
In the second derivation tree:
a - (b - c) = 2 - (3 - 4) = 2 - (-1) = 3
Observe that the two parse trees do not give the same value; they have different meanings. In this example the first derivation tree, (a - b) - c, is the intended parse. The expression contains the same operator twice, so it must be evaluated according to the associativity of that operator, and subtraction is left-associative.

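The two groupings correspond to folding the operand list from the left or from the right. A small sketch, with hypothetical function names, reproduces both values for a = 2, b = 3, c = 4:

```python
from functools import reduce


def eval_left(values):
    """Left-associative grouping: ((a - b) - c) - ..."""
    return reduce(lambda acc, x: acc - x, values)


def eval_right(values):
    """Right-associative grouping: a - (b - (c - ...))."""
    return reduce(lambda acc, x: x - acc, reversed(values))


print(eval_left([2, 3, 4]))    # -5, i.e. (2 - 3) - 4
print(eval_right([2, 3, 4]))   # 3,  i.e. 2 - (3 - 4)
```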
Grammar:
E -> E + E
E -> E * E
E -> id

Input string: id + id * id

The leftmost derivation can be done in two ways:

    First derivation        Second derivation
    1. E -> E + E           1. E -> E * E
    2. id + E               2. E + E * E
    3. id + E * E           3. id + E * E
    4. id + id * E          4. id + id * E
    5. id + id * id         5. id + id * id

If id = 2, the conventional value is id + id * id = 2 + 2 * 2 = 6.

First derivation:  id + (id * id) = 2 + (2 * 2) = 2 + 4 = 6
Second derivation: (id + id) * id = (2 + 2) * 2 = 4 * 2 = 8
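The two results can be reproduced by evaluating the two parse trees directly. In this sketch the trees are written by hand as nested tuples ('op', left, right), which is an encoding assumed for the example, and every id is bound to 2.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a tree bottom-up; a leaf is a name looked up in env."""
    if isinstance(tree, str):
        return env[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))


PLUS_AT_ROOT = ("+", "id", ("*", "id", "id"))     # id + (id * id)
TIMES_AT_ROOT = ("*", ("+", "id", "id"), "id")    # (id + id) * id

print(evaluate(PLUS_AT_ROOT, {"id": 2}))    # 6
print(evaluate(TIMES_AT_ROOT, {"id": 2}))   # 8
```

The same token string gives two values, so a compiler must commit to exactly one tree.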
