Top-Down Parsing Techniques

This document discusses top-down parsing techniques, including recursive descent parsing and LL(1) parsing. It begins by explaining top-down parsing and how it constructs parse trees using a preorder traversal. It then covers recursive descent parsing, including how to handle repetition, choice, and error recovery using EBNF notation. LL(1) parsing is introduced as an alternative that uses an explicit stack instead of recursion. The key aspects of LL(1) parsing include the LL(1) parsing table, which expresses the possible rule choices for each non-terminal based on the next input token, and the LL(1) parsing algorithm.

Uploaded by

gdayanand4u


Chapter 4 Top-Down Parsing

OUTLINE
Top-Down Parsing
A top-down parser parses an input string of tokens by tracing out the steps in a leftmost derivation. The implied traversal of the parse tree is a preorder traversal and thus proceeds from the root to the leaves. For example, the string number + number corresponds to the parse tree

            exp
          /  |  \
       exp   op   exp
        |    |     |
     number  +   number

This parse tree corresponds to the leftmost derivation:
(1) exp => exp op exp
(2)     => number op exp
(3)     => number + exp
(4)     => number + number

Two forms of top-down parsers
Predictive parsers: attempt to predict the next construction in the input string using one or more look-ahead tokens.
Backtracking parsers: try different possibilities for a parse of the input, backing up an arbitrary amount in the input if one possibility fails. They are more powerful but much slower, which makes them unsuitable for practical compilers.

Two kinds of top-down parsing algorithms
Recursive-descent parsing: quite versatile and suitable for a handwritten parser.
LL(1) parsing: the first L refers to the fact that it processes the input from left to right; the second L refers to the fact that it traces out a leftmost derivation for the input string; the number 1 means that it uses only one symbol of input to predict the direction of the parse.

Look-ahead sets
First and Follow sets are required by both recursive-descent parsing and LL(1) parsing.

A TINY parser
It is constructed using the recursive-descent parsing algorithm.

Error recovery methods
The error recovery methods used in top-down parsing are described.

4.1 TOP-DOWN PARSING BY RECURSIVE-DESCENT

4.1.1 The Basic Method of Recursive-Descent

The idea of recursive-descent parsing: we view the grammar rule for a non-terminal A as a definition for a procedure to recognize an A. The right-hand side of the grammar rule for A specifies the structure of the code for this procedure.

The first example

The expression grammar:

    expr   → expr addop term | term
    addop  → + | -
    term   → term mulop factor | factor
    mulop  → *
    factor → ( expr ) | number

A recursive-descent procedure that recognizes a factor is as follows (in pseudo-code):

    Procedure factor;
    Begin
      Case token of
        ( :      match( ( ); expr; match( ) );
        number : match(number);
        else     error;
      End case;
    End factor;

Here, token holds the current next token in the input (one symbol of lookahead). The match procedure compares the current next token with its parameter, advances the input if they agree, and declares an error if they do not:

    Procedure match(expectedToken);
    Begin
      If token = expectedToken then
        getToken;
      Else
        error;
      End if;
    End match;

Notes
Writing recursive-descent procedures for the remaining rules in the expression grammar is not as easy as for factor. It requires the use of EBNF.

4.1.2 Repetition and Choice: Using EBNF

The second example

The grammar rule for an if-statement:

    if-stmt → if ( exp ) statement
            | if ( exp ) statement else statement

This can be translated into the procedure:

    Procedure ifstmt;
    Begin
      match(if);
      match( ( );
      exp;
      match( ) );
      statement;
      If token = else then
        match(else);
        statement;
      End if;
    End ifstmt;

In this example, we cannot immediately distinguish the two choices, because both begin with the token if. Instead, the decision is put off until the token else is seen (or not) in the input. The EBNF of the if-statement is as follows:

    if-stmt → if ( exp ) statement [ else statement ]

The square brackets of the EBNF are translated into a test in the code for ifstmt:

    If token = else then
      match(else);
      statement;
    End if;

Notes
EBNF notation is designed to mirror closely the actual code of a recursive-descent parser, so a grammar should always be translated into EBNF if recursive-descent is to be used. It is natural to write a parser that matches each else token as soon as it is encountered in the input.

Consider the rule for expr in the grammar for simple arithmetic expressions in BNF:

    expr → expr addop term | term

If we were to try to turn this into a recursive expr procedure, the procedure would begin by calling itself, which would lead to an immediate infinite recursive loop. The solution is to use the EBNF rule:

    expr → term { addop term }

The curly brackets expressing repetition can be translated into the code for a loop:

    Procedure exp;
    Begin
      term;
      While token = + or token = - do
        match(token);
        term;
      End while;
    End exp;

Similarly, the EBNF rule for term:

    term → factor { mulop factor }

becomes the code:

    Procedure term;
    Begin
      factor;
      While token = * do
        match(token);
        factor;
      End while;
    End term;
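As a sketch of how the optional part [ else statement ] of the if-statement rule becomes a test in real code, the following C recognizer handles a cut-down statement grammar. The one-character token encoding (i for if, e for else, o for any other statement, 0 or 1 for exp) and all function names are assumptions made for this illustration only; errors simply clear a flag rather than aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-character tokens (an assumption of this sketch):
   'i' = if, 'e' = else, 'o' = other statement, '0'/'1' = exp.
   The end of the input is represented by '\0'. */
static const char *input;   /* remaining input */
static char token;          /* one symbol of lookahead */
static int ok;              /* cleared on the first syntax error */

static void get_token(void) {
    token = *input;
    if (token != '\0') input++;
}

static void match(char expected) {
    if (token == expected) get_token();
    else ok = 0;
}

static void parse_statement(void);

static void parse_exp(void) {            /* exp -> 0 | 1 */
    if (token == '0' || token == '1') get_token();
    else ok = 0;
}

/* if-stmt -> if ( exp ) statement [ else statement ] */
static void parse_if_stmt(void) {
    match('i');
    match('(');
    parse_exp();
    match(')');
    parse_statement();
    if (ok && token == 'e') {            /* the [ ... ] becomes this test */
        match('e');
        parse_statement();
    }
}

static void parse_statement(void) {      /* statement -> if-stmt | other */
    if (!ok) return;
    if (token == 'i') parse_if_stmt();
    else match('o');
}

/* Returns 1 if s is exactly one well-formed statement, 0 otherwise. */
int recognize(const char *s) {
    assert(s != NULL);
    input = s;
    ok = 1;
    get_token();
    parse_statement();
    return ok && token == '\0';
}
```

Under these assumptions, recognize("i(0)i(1)oeo") succeeds, and the else is consumed by the innermost if, because each else token is matched as soon as it is encountered.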

A question: can the left associativity implied by the curly brackets (and explicit in the original BNF) still be maintained within this code?

A recursive-descent calculator for the simple integer arithmetic of our grammar:

    Function exp: integer;
    Var temp: integer;
    Begin
      temp := term;
      While token = + or token = - do
        Case token of
          + : match(+); temp := temp + term;
          - : match(-); temp := temp - term;
        End case;
      End while;
      Return temp;
    End exp;

We can ensure that the operations are left associative by performing the operations as we cycle through the loop.

A working simple calculator in C code

    /* Simple integer arithmetic calculator according to the EBNF:
         <exp>    -> <term> { <addop> <term> }
         <addop>  -> + | -
         <term>   -> <factor> { <mulop> <factor> }
         <mulop>  -> *
         <factor> -> ( <exp> ) | Number
       Inputs a line of text from stdin.
       Outputs "Error" or the result. */

    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>

    int token; /* global token variable; int so that it can hold EOF */

    /* function prototypes for recursive calls */
    /* note: exp shares its name with the math library function;
       rename it if <math.h> is also needed */
    int exp(void);
    int term(void);
    int factor(void);

    void error(void)
    { fprintf(stderr, "Error\n");
      exit(1);
    }

    void match(char expectedToken)
    { if (token == expectedToken) token = getchar();
      else error();
    }

    int main(void)
    { int result;
      token = getchar(); /* load token with first character for lookahead */
      result = exp();
      if (token == '\n') /* check for end of line */
        printf("Result = %d\n", result);
      else error(); /* extraneous chars on line */
      return 0;
    }

    int exp(void)
    { int temp = term();
      while ((token == '+') || (token == '-'))
        switch (token) {
        case '+': match('+'); temp += term(); break;
        case '-': match('-'); temp -= term(); break;
        }
      return temp;
    }

    int term(void)
    { int temp = factor();
      while (token == '*') {
        match('*');
        temp *= factor();
      }
      return temp;
    }

    int factor(void)
    { int temp;
      if (token == '(') {
        match('(');
        temp = exp();
        match(')');
      }
      else if (isdigit(token)) {
        ungetc(token, stdin);   /* push the digit back so scanf reads the whole number */
        scanf("%d", &temp);
        token = getchar();
      }
      else error();
      return temp;
    }

Notes
The method of turning grammar rules in EBNF into code is quite powerful. However, there are a few pitfalls, and care must be taken in scheduling the actions within the code. In the previous pseudo-code for exp:
(1) The match of the operation must come before the repeated calls to term;
(2) The global token variable must be set before the parse begins;
(3) getToken must be called just after a successful test of a token.

Construction of the syntax tree

The expression 3+4+5 has the syntax tree:

          +
         / \
        +   5
       / \
      3   4

The pseudo-code for the exp procedure to construct the syntax tree:

    Function exp: syntaxTree;
    Var temp, newtemp: syntaxTree;
    Begin
      temp := term;
      While token = + or token = - do
        Case token of
          + : match(+);
              newtemp := makeOpNode(+);
              leftChild(newtemp) := temp;
              rightChild(newtemp) := term;
              temp := newtemp;
          - : match(-);
              newtemp := makeOpNode(-);
              leftChild(newtemp) := temp;
              rightChild(newtemp) := term;
              temp := newtemp;
        End case;
      End while;
      Return temp;
    End exp;

The simpler version:

    Function exp: syntaxTree;
    Var temp, newtemp: syntaxTree;
    Begin
      temp := term;
      While token = + or token = - do
        newtemp := makeOpNode(token);
        match(token);
        leftChild(newtemp) := temp;
        rightChild(newtemp) := term;
        temp := newtemp;
      End while;
      Return temp;
    End exp;

The pseudo-code for the if-statement procedure to construct the syntax tree:

    Function ifstatement: syntaxTree;
    Var temp: syntaxTree;
    Begin
      match(if);
      match( ( );
      temp := makeStmtNode(if);
      testChild(temp) := exp;
      match( ) );
      thenChild(temp) := statement;
      If token = else then
        match(else);
        elseChild(temp) := statement;
      Else
        elseChild(temp) := nil;
      End if;
      Return temp;
    End ifstatement;
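The tree-building loop for exp can be sketched in C as follows. The Node type, the helper names, and the restriction of term to a single digit are assumptions made to keep the example short; they are not part of the original pseudo-code. Memory is never freed, which is acceptable only in a sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

typedef struct Node {
    char op;                    /* '+' or '-' for operator nodes, 0 for a leaf */
    int value;                  /* meaningful only for leaves */
    struct Node *left, *right;
} Node;

static const char *input;       /* remaining input */
static char token;              /* one symbol of lookahead, '\0' at the end */

static void get_token(void) {
    token = *input;
    if (token != '\0') input++;
}

static Node *new_node(char op, int value, Node *left, Node *right) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->op = op;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

/* Simplifying assumption: a term is a single decimal digit. */
static Node *term(void) {
    Node *n = NULL;
    if (isdigit((unsigned char)token)) {
        n = new_node(0, token - '0', NULL, NULL);
        get_token();
    }
    return n;                   /* NULL signals a syntax error */
}

/* exp -> term { addop term }, building the tree as the loop cycles */
static Node *parse_exp(void) {
    Node *temp = term();
    while (temp != NULL && (token == '+' || token == '-')) {
        char op = token;
        Node *right;
        get_token();            /* plays the role of match(token) */
        right = term();
        if (right == NULL) return NULL;
        /* the tree built so far becomes the LEFT child: left associativity */
        temp = new_node(op, 0, temp, right);
    }
    return temp;
}

/* Returns the syntax tree for s, or NULL on a syntax error. */
Node *build_tree(const char *s) {
    Node *t;
    assert(s != NULL);
    input = s;
    get_token();
    t = parse_exp();
    return token == '\0' ? t : NULL;
}
```

For the input 3+4+5, the root returned by build_tree is a + node whose right child is the leaf 5 and whose left child is the + node for 3+4, matching the tree drawn above.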

4.1.3 Further Decision Problems

The recursive-descent method is quite powerful and adequate to construct a complete parser. But we need more formal methods to deal with complex situations:
(1) It may be difficult to convert a grammar in BNF into EBNF form;
(2) It is difficult to decide when to use the choice A → α and when to use the choice A → β if both α and β begin with non-terminals. This decision requires the First sets of α and β;
(3) In writing the code for an ε-production A → ε, it may be necessary to know which tokens can legally come after the non-terminal A. This requires the Follow set of A;
(4) Computing the First and Follow sets is also required in order to detect errors as early as possible. For example, given the input )3-2), the parse will descend from exp to term to factor before an error is reported.
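The point about early error detection can be illustrated with a small C recognizer for the expression grammar that optionally tests the lookahead against First(exp) on entry to exp. For this grammar First(exp) = First(term) = First(factor) = { (, number }. The ErrorSite type, the function names, and the single-digit numbers are assumptions of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { ERR_NONE, ERR_IN_EXP, ERR_IN_FACTOR, ERR_TRAILING } ErrorSite;

static const char *input;       /* remaining input */
static char token;              /* one symbol of lookahead, '\0' at the end */
static ErrorSite error_site;    /* procedure that reported the first error */
static int check_first;         /* 1: test First(exp) on entry to exp */

static void get_token(void) {
    token = *input;
    if (token != '\0') input++;
}

static void error(ErrorSite site) {
    if (error_site == ERR_NONE) error_site = site;
}

/* Simplifying assumption: a number is a single decimal digit. */
static int is_number(char t) { return isdigit((unsigned char)t) != 0; }

static int in_first_exp(char t) { return t == '(' || is_number(t); }

static void parse_exp(void);

static void factor(void) {               /* factor -> ( exp ) | number */
    if (token == '(') {
        get_token();
        parse_exp();
        if (token == ')') get_token();
        else error(ERR_IN_FACTOR);
    } else if (is_number(token)) {
        get_token();
    } else {
        error(ERR_IN_FACTOR);
    }
}

static void term(void) {                 /* term -> factor { * factor } */
    factor();
    while (error_site == ERR_NONE && token == '*') {
        get_token();
        factor();
    }
}

static void parse_exp(void) {            /* exp -> term { addop term } */
    if (check_first && !in_first_exp(token)) {
        error(ERR_IN_EXP);               /* reported before descending */
        return;
    }
    term();
    while (error_site == ERR_NONE && (token == '+' || token == '-')) {
        get_token();
        term();
    }
}

/* Reports which procedure detected the first error in s, if any. */
ErrorSite first_error_site(const char *s, int use_first_check) {
    assert(s != NULL);
    input = s;
    error_site = ERR_NONE;
    check_first = use_first_check;
    get_token();
    parse_exp();
    if (error_site == ERR_NONE && token != '\0') error(ERR_TRAILING);
    return error_site;
}
```

With the check disabled, the input )3-2) is rejected only inside factor; with the check enabled, the same input is rejected at the top of exp.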

4.2 LL(1) PARSING

4.2.1 The Basic Method of LL(1) Parsing

Main idea: LL(1) parsing uses an explicit stack rather than recursive calls to perform a parse.

An example: a simple grammar for the strings of balanced parentheses:

    S → ( S ) S | ε

The following table shows the actions of a top-down parser given this grammar and the string ( ):

    Step   Parsing Stack   Input   Action
    1      $S              ()$     S → ( S ) S
    2      $S)S(           ()$     match
    3      $S)S            )$      S → ε
    4      $S)             )$      match
    5      $S              $       S → ε
    6      $               $       accept

A top-down parser begins by pushing the start symbol onto the stack. It accepts an input string if, after a series of actions, the stack and the input become empty.

A general schematic for a successful top-down parse:

    Parsing Stack     Input            Action
    $ StartSymbol     InputString $    one of the two actions
    ...               ...              one of the two actions
    $                 $                accept

The two actions:
(1) Generate: replace a non-terminal A at the top of the stack by a string α (in reverse) using a grammar rule A → α;
(2) Match: match a token on top of the stack with the next input token.

The list of generating actions in the above table:

    S => ( S ) S    [S → ( S ) S]
      => ( ) S      [S → ε]
      => ( )        [S → ε]

This corresponds precisely to the steps in a leftmost derivation of the string ( ). This is the characteristic of top-down parsing.

Constructing a parse tree: add node construction actions as each non-terminal or terminal is pushed onto the stack.
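The generate and match actions for the balanced-parentheses grammar can be sketched in C with an explicit character stack. The function name, the fixed stack size, and the hard-coded rule choice (S → ( S ) S when the next input is an opening parenthesis, S → ε otherwise) are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STACK 256   /* arbitrary limit chosen for this sketch */

/* Returns 1 if s is generated by S -> ( S ) S | epsilon, 0 otherwise.
   '$' marks the bottom of the stack; the end of the input is the
   terminating '\0' of the string. */
int parse_balanced(const char *s) {
    char stack[MAX_STACK];
    int top = 0;

    assert(s != NULL);
    stack[top++] = '$';
    stack[top++] = 'S';                    /* push the start symbol */

    for (;;) {
        char x = stack[top - 1];
        if (x == '$')
            return *s == '\0';             /* accept iff the input is empty too */
        if (x == 'S') {
            top--;                         /* generate: pop S ... */
            if (*s == '(') {               /* ... and push ( S ) S in reverse */
                if (top + 4 > MAX_STACK) return 0;   /* nesting too deep */
                stack[top++] = 'S';
                stack[top++] = ')';
                stack[top++] = 'S';
                stack[top++] = '(';
            }                              /* otherwise S -> epsilon: push nothing */
        } else if (x == *s) {              /* match: x is '(' or ')' here */
            top--;
            s++;
        } else {
            return 0;                      /* error */
        }
    }
}
```

Running it on the string () goes through the same six stack configurations as the table above: $S, $S)S(, $S)S, $S), $S, and finally $.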

4.2.2 The LL(1) Parsing Table and Algorithm

Purpose of the LL(1) parsing table: to express the possible rule choices for a non-terminal A when A is at the top of the parsing stack, based on the current input token (the look-ahead).

The LL(1) parsing table for the simple grammar S → ( S ) S | ε:

    M[N,T]   (               )         $
    S        S → ( S ) S     S → ε     S → ε

The general LL(1) parsing table definition: the table is a two-dimensional array indexed by non-terminals and terminals, containing the production choices to use at the appropriate parsing step. It is called M[N,T], where N is the set of non-terminals of the grammar and T is the set of terminals or tokens (including $). Any entries remaining empty represent potential errors.

The table-constructing rules (supposing that the table is originally empty):
(1) If A → α is a production choice, and there is a derivation α =>* a β, where a is a token, then add A → α to the table entry M[A,a];
(2) If A → α is a production choice, and there are derivations α =>* ε and S $ =>* β A a γ, where S is the start symbol and a is a token (or $), then add A → α to the table entry M[A,a].

The constructing process of the above table:
(1) For the production S → ( S ) S, α = ( S ) S and a = (, so this choice is added to the entry M[S,(] and only there;
(2) For the production S → ε, α = ε, so α =>* ε holds trivially, and the derivation S $ => ( S ) S $ shows that S can be followed by a = ) or a = $. So the choice S → ε is added to both M[S,)] and M[S,$].

Definition of LL(1) grammar: a grammar is an LL(1) grammar if the associated LL(1) parsing table has at most one production in each table entry. An LL(1) grammar cannot be ambiguous.

A parsing algorithm using the LL(1) parsing table:

    (* assumes $ marks the bottom of the stack and the end of the input *)
    push the start symbol onto the top of the parsing stack;
    while the top of the parsing stack ≠ $ do
      if the top of the parsing stack is terminal a
         and the next input token = a
      then (* match *)
        pop the parsing stack;
        advance the input;
      else if the top of the parsing stack is non-terminal A
           and the next input token is terminal a (or $)
           and parsing table entry M[A,a] contains production A → X1 X2 ... Xn
      then (* generate *)
        pop the parsing stack;
        for i := n downto 1 do
          push Xi onto the parsing stack;
      else error;
    if the top of the parsing stack = $
       and the next input token = $
    then accept
    else error.

The LL(1) parsing table for a simplified grammar of if-statements:

    statement → if-stmt | other
    if-stmt   → if ( exp ) statement else-part
    else-part → else statement | ε
    exp       → 0 | 1

    M[N,T]      if                        other               else                          0         1         $
    statement   statement → if-stmt       statement → other
    if-stmt     if-stmt → if ( exp )
                  statement else-part
    else-part                                                 else-part → else statement              else-part → ε
                                                              else-part → ε
    exp                                                                                     exp → 0   exp → 1

Notice: the entry M[else-part, else] contains two productions, which corresponds to the dangling else ambiguity.

Disambiguating rule: always prefer the rule that generates the current look-ahead token over any other, and thus prefer the production else-part → else statement over else-part → ε. With this modification, the above table becomes unambiguous, and the grammar can be parsed as if it were an LL(1) grammar.

The parsing actions for the string if (0) if (1) other else other (for conciseness, statement = S, if-stmt = I, else-part = L, exp = E, if = i, else = e, other = o):

    Step   Parsing Stack   Input            Action
    1      $S              i(0)i(1)oeo$     S → I
    2      $I              i(0)i(1)oeo$     I → i(E)SL
    3      $LS)E(i         i(0)i(1)oeo$     match
    4      $LS)E(          (0)i(1)oeo$      match
    5      $LS)E           0)i(1)oeo$       E → 0
    6      $LS)0           0)i(1)oeo$       match
    7      $LS)            )i(1)oeo$        match
    8      $LS             i(1)oeo$         S → I
    9      $LI             i(1)oeo$         I → i(E)SL
    10     $LLS)E(i        i(1)oeo$         match
    11     $LLS)E(         (1)oeo$          match
    12     $LLS)E          1)oeo$           E → 1
    13     $LLS)1          1)oeo$           match
    14     $LLS)           )oeo$            match
    15     $LLS            oeo$             S → o
    16     $LLo            oeo$             match
    17     $LL             eo$              L → eS
    18     $LSe            eo$              match
    19     $LS             o$               S → o
    20     $Lo             o$               match
    21     $L              $                L → ε
    22     $               $                accept
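A table-driven parser for the simplified if-statement grammar can be sketched in C as follows, using the one-letter abbreviations from the trace (S, I, L, E for the non-terminals; i, e, o, 0, 1 and the parentheses for the tokens). The function names, the encoding of the table as a switch, the fixed stack size, and the optional step counter are assumptions of this sketch; the dangling-else entry is resolved in favour of L → eS as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_STACK 256   /* arbitrary limit chosen for this sketch */

/* Parsing table M[A,a]: returns the right-hand side selected for
   non-terminal A on lookahead a, "" for an epsilon production,
   or NULL for an empty (error) entry. */
static const char *table(char A, char a) {
    switch (A) {
    case 'S': return a == 'i' ? "I" : a == 'o' ? "o" : NULL;
    case 'I': return a == 'i' ? "i(E)SL" : NULL;
    case 'L': /* on 'e', L -> eS is preferred over L -> epsilon */
        return a == 'e' ? "eS" : a == '$' ? "" : NULL;
    case 'E': return a == '0' ? "0" : a == '1' ? "1" : NULL;
    default:  return NULL;
    }
}

static int is_nonterminal(char x) {
    return x == 'S' || x == 'I' || x == 'L' || x == 'E';
}

/* Returns 1 if s is accepted, 0 otherwise. If steps is not NULL it
   receives the number of actions performed, counting the final accept. */
int ll1_parse(const char *s, int *steps) {
    char stack[MAX_STACK];
    int top = 0, count = 0, accepted = 0;

    assert(s != NULL);
    if (strchr(s, '$') != NULL) return 0;   /* '$' is reserved as end marker */
    stack[top++] = '$';
    stack[top++] = 'S';                     /* push the start symbol */

    for (;;) {
        char x = stack[top - 1];
        char a = *s ? *s : '$';             /* next input token */
        count++;
        if (x == '$') {                     /* accept iff input is exhausted */
            accepted = (a == '$');
            break;
        }
        if (is_nonterminal(x)) {            /* generate */
            const char *rhs = table(x, a);
            size_t n, k;
            if (rhs == NULL) break;
            n = strlen(rhs);
            top--;                          /* pop A */
            if (top + (int)n > MAX_STACK) break;
            for (k = n; k > 0; k--)         /* push Xn ... X1 */
                stack[top++] = rhs[k - 1];
        } else if (x == a) {                /* match */
            top--;
            s++;
        } else {
            break;                          /* error */
        }
    }
    if (steps != NULL) *steps = count;
    return accepted;
}
```

Calling ll1_parse("i(0)i(1)oeo", &n) accepts the string and leaves n equal to 22, in line with the 22 steps of the trace above.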
