
Principles of

Compiler Design
19ECSC203
Chapter 04 - Top Down Parsing

PoCD Team
School of Computer Science & Engineering
2020 - 21

Chapter 4 Contents
• Top Down Parsing
• Eliminating Left Recursion
• Left Factoring
• FIRST and FOLLOW sets
• LL(1) Parsing
• Error recovery in Top Down Parsing

Eliminating Left Recursion
A production has left recursion if it has the form:
A → Aα
Top-down parsing methods cannot handle left-recursive grammars, hence we eliminate left recursion.

Example:
1. A → Aα | β
can be converted to
A → βA'
A' → αA' | ε

============================
"Mind you, the parser never knows a rule.
Maybe that's why we end up with
ambiguous decisions!"
============================

2. E → E + T | T
   T → T * F | F
   F → (E) | id

Eliminating left recursion:

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id

The above eliminates only the immediate left recursion. For the general case we apply the following algorithm.


ALGORITHM Eliminating_Left_Recursion
INPUT: Grammar G with no cycles or ε-productions
OUTPUT: An equivalent grammar with no left recursion
METHOD: Apply the algorithm to G. Note that the resulting non-left-recursive grammar may have ε-productions.

arrange all nonterminals in some order A1, A2, …, An
for ( each i from 1 to n ) {
    for ( each j from 1 to i − 1 ) {
        replace each production of the form Ai → Ajγ by the
        productions Ai → δ1γ | δ2γ | … | δkγ, where
        Aj → δ1 | δ2 | … | δk are all current Aj-productions
    }
    eliminate the immediate left recursion among the Ai-productions
}
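The inner step of the algorithm, eliminating *immediate* left recursion, can be sketched in code. This is an illustrative Python sketch, not the handout's code: a grammar is a dict mapping a nonterminal to a list of right-hand sides, each a list of symbols, with "ε" marking the empty string.

```python
def eliminate_immediate(grammar, a):
    """Rewrite A -> Aα1 | ... | β1 | ... in place as
    A -> β1A' | ..., A' -> α1A' | ... | ε
    (assumes at least one non-recursive alternative β)."""
    recursive = [rhs[1:] for rhs in grammar[a] if rhs[0] == a]   # the α parts
    others = [rhs for rhs in grammar[a] if rhs[0] != a]          # the β parts
    if not recursive:
        return                                 # no immediate left recursion
    a_new = a + "'"                            # fresh nonterminal A'
    grammar[a] = [(b if b != ["ε"] else []) + [a_new] for b in others]
    grammar[a_new] = [alpha + [a_new] for alpha in recursive] + [["ε"]]

g = {"E": [["E", "+", "T"], ["T"]]}
eliminate_immediate(g, "E")
# g is now  E -> T E'   and   E' -> + T E' | ε
```

In the full algorithm this function is called once per nonterminal Ai, after the substitution step of the inner j-loop.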

Applying the above algorithm to the grammar

S → Aa | b
A → Ac | Sd | ε

we have,

S → Aa | b
A → bdA' | A'
A' → cA' | adA' | ε

Left Factoring
Left factoring is applied to produce a grammar suitable for predictive or top-down parsing.

Consider the grammar:

stmt → if expr then stmt else stmt
     | if expr then stmt

On seeing the input "if" we cannot immediately tell which production to use, so we defer the decision until more of the input has been seen.

In general, A → αβ1 | αβ2 can be rewritten as
A → αA'
A' → β1 | β2


Top-Down Parsing
Top-down parsing is
• constructing a parse tree
• in preorder
• using depth-first search
• by finding the leftmost derivation for an input string

Consider the grammar:

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id

For the string id + id * id we can construct the parse tree by applying these productions step by step.


It is natural for us, as intelligent human beings, to pick the right productions and construct the tree. But how would a machine do it?
Our task is to develop an algorithm that constructs the parse tree by picking the right production among the available alternatives.


Recursive Descent Parsing

One possible option is recursive descent parsing.

Consider the grammar:
S → cAd
A → ab | a

Constructing a parse tree top down for w = cad: we begin with S → cAd and match the leading 'c'; in the next step we substitute for A.

Trying A → ab fails, since after the 'a' the input holds 'd', not 'b'. We go back to A and check whether there is an alternative: A → a matches, the trailing 'd' is then matched, and we halt and announce the successful completion of parsing.

Implementation
• Requires backtracking (not very efficient)
• Alternatively, go for tabular methods such as dynamic programming algorithms
• The procedure is nondeterministic in nature
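The backtracking just described can be made concrete with a small recursive-descent recognizer for S → cAd, A → ab | a. This is a sketch, not the handout's code: each procedure is written as a generator that yields every input position its nonterminal can reach, so the caller can backtrack over alternatives.

```python
def parse_A(s, i):
    """Yield each position reachable by expanding A at position i."""
    if s[i:i+2] == "ab":              # alternative A -> a b, tried first
        yield i + 2
    if s[i:i+1] == "a":               # alternative A -> a, tried on backtrack
        yield i + 1

def parse_S(s, i):
    """Yield each position reachable by expanding S -> c A d at position i."""
    if s[i:i+1] == "c":
        for j in parse_A(s, i + 1):   # backtrack over A's alternatives
            if s[j:j+1] == "d":
                yield j + 1

def accepts(s):
    """True if S derives exactly the whole input string s."""
    return any(j == len(s) for j in parse_S(s, 0))

# accepts("cad") is True: A -> ab fails here, backtracking to A -> a succeeds.
```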

FIRST and FOLLOW

FIRST:
a. If X is a terminal, then FIRST(X) is {X}
b. If X → ε is a production, then add ε to FIRST(X)
c. If X → Y1Y2Y3…Yk is a production, then add FIRST(Y1) (except ε) to FIRST(X). If FIRST(Y1) contains ε, then also add FIRST(Y2) to FIRST(X), and so on up to FIRST(Yk). If ε is in FIRST(Yi) for every i = 1, 2, …, k, then add ε to FIRST(X)


FOLLOW:
a. Place $ in FOLLOW(S), where S is the start symbol
b. If A → αBβ is a production, then everything in FIRST(β) except ε is in FOLLOW(B)
c. If A → αB is a production, or A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B)

For the grammar given below, FIRST and FOLLOW will be:

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id

FIRST(F)  = { (, id }      FOLLOW(E)  = { ), $ }
FIRST(T)  = { (, id }      FOLLOW(E') = { ), $ }
FIRST(E)  = { (, id }      FOLLOW(T)  = { +, ), $ }
FIRST(E') = { +, ε }       FOLLOW(T') = { +, ), $ }
FIRST(T') = { *, ε }       FOLLOW(F)  = { *, +, ), $ }
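The FIRST and FOLLOW rules can be run to a fixed point mechanically. The following Python sketch (illustrative, not from the handout) computes both sets for the expression grammar and reproduces the sets listed above; "ε" is the empty string and "$" the end marker.

```python
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], ["ε"]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], ["ε"]],
    "F":  [["(", "E", ")"], ["id"]],
}
START = "E"

def first_of_seq(seq, first):
    """FIRST of a symbol sequence, per rule (c) above."""
    out = set()
    for x in seq:
        f = first.get(x, {x})          # a terminal's FIRST is itself
        out |= f - {"ε"}
        if "ε" not in f:
            return out
    out.add("ε")                       # every symbol derived ε
    return out

def compute_first(grammar):
    first = {a: set() for a in grammar}
    changed = True
    while changed:                     # iterate to a fixed point
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                add = first_of_seq(rhs, first)
                if not add <= first[a]:
                    first[a] |= add
                    changed = True
    return first

def compute_follow(grammar, first, start):
    follow = {a: set() for a in grammar}
    follow[start].add("$")             # rule (a)
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue       # skip terminals and ε
                    tail = first_of_seq(rhs[i + 1:], first)
                    add = tail - {"ε"}                 # rule (b)
                    if "ε" in tail:
                        add |= follow[a]               # rule (c)
                    if not add <= follow[b]:
                        follow[b] |= add
                        changed = True
    return follow

first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, first, START)
# e.g. first["F"] == {"(", "id"} and follow["F"] == {"*", "+", ")", "$"}
```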

LL(1) Grammars
Predictive parsers, that is, recursive-descent parsers needing no backtracking, can be constructed for a class of grammars called LL(1).
LL(1) stands for:
L – scan input from left to right
L – producing leftmost derivations
1 – one input symbol of lookahead at each step
A grammar G is LL(1) if and only if whenever A → α | β are two distinct productions of G, the following conditions hold:
• For no terminal a do both α and β derive strings beginning with a
• At most one of α and β can derive the empty string
• If β derives ε in zero or more steps, then α does not derive any string beginning with a terminal in FOLLOW(A); similarly, if α derives ε in zero or more steps, then β does not derive any string beginning with a terminal in FOLLOW(A)

Note: No left-recursive or ambiguous grammar can be LL(1)


Predictive Parsing Table

Using all the above information we can design a better algorithm by constructing a predictive parsing table.

ALGORITHM Construct_Predictive_Parsing_Table
INPUT: Grammar G
OUTPUT: Parsing table M
For each production A → α of the grammar, do the following:
1. For each terminal a in FIRST(α), add A → α to M[A, a]
2. If ε is in FIRST(α), then for each terminal b in FOLLOW(A), add A → α to M[A, b]. If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $] as well

Constructing a predictive parsing table for the considered grammar:

Non-                               Input
terminal   id         +           *            (          )         $
E          E → TE'                             E → TE'
E'                    E' → +TE'                           E' → ε    E' → ε
T          T → FT'                             T → FT'
T'                    T' → ε      T' → *FT'               T' → ε    T' → ε
F          F → id                              F → (E)

Conclusions:
• Each parsing table entry uniquely identifies a production or signals an error
• If G is left-recursive or ambiguous then M will have at least one multiply defined entry
• There are some grammars for which no amount of alteration will produce an LL(1) grammar

Example 02: Construct a predictive parsing table for the given grammar

S → iEtS | iEtSeS | a
E → b

Left factoring it,

S → iEtSS' | a
S' → eS | ε
E → b


There is no left recursion in the grammar. We would have eliminated it if present.

Writing the FIRST and FOLLOW sets:

FIRST(S)  = { i, a }    FOLLOW(S)  = { $, e }
FIRST(S') = { e, ε }    FOLLOW(S') = { $, e }
FIRST(E)  = { b }       FOLLOW(E)  = { t }

Build a predictive parsing table:

Non-                   Input
terminal   a        b        e           i             t    $
S          S → a                         S → iEtSS'
S'                           S' → ε                         S' → ε
                             S' → eS
E                   E → b

Conclusion:
This grammar exhibits the dangling-else ambiguity: there is a multiply defined entry for S' on input symbol e, so the grammar is not LL(1).

Now, using all the above information, we parse given strings to check for acceptance or rejection. We call this algorithm "non-recursive predictive parsing."

Non-Recursive Predictive Parsing

• Maintains a stack explicitly, rather than implicitly via recursive calls
• Mimics a leftmost derivation
• We define a configuration as the stack content plus the remaining input
• If w is the input matched so far, then the stack holds a sequence of grammar symbols α such that S derives wα in zero or more steps using leftmost derivation

Consider the same grammar:

E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id


And for the string id + id * id, tracing the non-recursive predictive parse, we have:

Matched          Stack        Input            Action
                 E$           id + id * id$
                 TE'$         id + id * id$    Output E → TE'
                 FT'E'$       id + id * id$    Output T → FT'
                 idT'E'$      id + id * id$    Output F → id
id               T'E'$        + id * id$       Match id
id               E'$          + id * id$       Output T' → ε
id               +TE'$        + id * id$       Output E' → +TE'
id +             TE'$         id * id$         Match +
id +             FT'E'$       id * id$         Output T → FT'
id +             idT'E'$      id * id$         Output F → id
id + id          T'E'$        * id$            Match id
id + id          *FT'E'$      * id$            Output T' → *FT'
id + id *        FT'E'$       id$              Match *
id + id *        idT'E'$      id$              Output F → id
id + id * id     T'E'$        $                Match id
id + id * id     E'$          $                Output T' → ε
id + id * id     $            $                Output E' → ε

Error Recovery in Top-Down Parsing

Error Recovery Strategies

• Panic mode: Discard input until a token in a set of designated synchronizing tokens is found
• Phrase-level recovery: Perform local correction on the input to repair the error
• Error productions: Augment the grammar with productions for erroneous constructs
• Global correction: Choose a minimal sequence of changes to obtain a globally least-cost correction


Error Recovery in LL Parsing

• Simple option: when an error is seen, print a message and halt

• "Real" error recovery:

  o Insert the "expected" token and continue – can have a problem with termination
  o Delete tokens – on an error for nonterminal F, keep deleting tokens until a token in FOLLOW(F) is seen
  o For example (here E → TE' and FOLLOW(E) = { ), $ }):

    E() {
        if (lookahead in { (, id }) { T(); E_prime(); }      /* E -> TE' */
        else {
            printf("E expecting ( or identifier");
            while (lookahead != ')' && lookahead != '$')     /* skip to FOLLOW(E) */
                lookahead = nextToken();
        }
    }

• An error is detected whenever an empty table slot is encountered.


• We would like our parser to be able to recover from an error and continue parsing.
• Phrase-level recovery
  o We associate each empty table slot with an error-handling procedure.
• Panic mode recovery
  o Modify the stack and/or the input string to try to reach a state from which we can continue.

Panic mode recovery

• Idea:
  o Decide on a set of synchronizing tokens.
  o When an error is found and there is a nonterminal on top of the stack, discard input tokens until a synchronizing token is found.
  o Synchronizing tokens are chosen so that the parser can recover quickly after one is found
    ▪ e.g. a semicolon when parsing statements.
  o If there is a terminal on top of the stack, we could try popping it to see whether we can continue.
    ▪ Assume that the input string is actually missing that terminal.


• Possible synchronizing tokens for a nonterminal A:

  o the tokens in FOLLOW(A)
    ▪ When one is found, pop A off the stack and try to continue
  o the tokens in FIRST(A)
    ▪ When one is found, match it and try to continue
  o tokens such as semicolons that terminate statements
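Panic-mode recovery with FOLLOW sets as synchronizing sets can be bolted onto the table-driven driver. The following Python sketch (illustrative, not the handout's code) again hardcodes the expression grammar's table and FOLLOW sets so it stands alone; on an empty table slot it discards input until a synchronizing token, pops the nonterminal, and continues, and on a terminal mismatch it pops the terminal, assuming the input is missing it.

```python
TABLE = {
    ("E", "id"): ["T", "E'"],      ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],      ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "("): ["(", "E", ")"],   ("F", "id"): ["id"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"}}

def parse_with_recovery(tokens):
    """Return a list of error messages; [] means the input parsed cleanly."""
    tokens = tokens + ["$"]
    stack, pos, errors = ["$", "E"], 0, []
    while stack[-1] != "$":
        top, a = stack[-1], tokens[pos]
        if top not in NONTERMINALS:
            if top == a:
                stack.pop()
                pos += 1
            else:                      # terminal mismatch: pop it, assuming
                errors.append(f"missing '{top}' before '{a}'")
                stack.pop()            # the input is missing that terminal
        elif (top, a) in TABLE:
            stack.pop()
            stack.extend(reversed(TABLE[(top, a)]))
        else:                          # empty slot: panic mode
            errors.append(f"unexpected '{a}' while expanding {top}")
            while tokens[pos] not in FOLLOW[top] and tokens[pos] != "$":
                pos += 1               # discard input until a sync token,
            stack.pop()                # then pop the nonterminal and go on
    return errors
```

On a clean input such as id + id the error list is empty; a stray token (say, id + * id) yields one diagnostic while parsing continues to the end instead of halting.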

~*~*~*~*~*~*~*~*~*~*~*~
