Compiler Construction
Lecture 4-5: Lexical Errors and Parsing
BITS Pilani, Hyderabad Campus
Lexical Error
Problems with panic mode recovery
➢ Examples of single-edit corrections:
➢ “charr” can be corrected to “char” by deleting an “r”
➢ “cha” can be corrected to “char” by inserting “r”
➢ “whiel” can be corrected to “while” by transposing “e” and “l”
➢ “chrr” can be corrected to “char” by replacing the first “r” with “a”
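The four corrections above (delete, insert, transpose, replace) can be sketched as a small lookup that tries every single-character edit of a misspelled lexeme and keeps the results that are keywords. This is only an illustrative sketch: the keyword set, the lowercase alphabet, and the function names `single_edits` and `suggest` are assumptions, not part of the lecture.

```python
# Hypothetical sketch: find keywords that are one edit away from a lexeme.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumed alphabet


def single_edits(word):
    """Return every string reachable from `word` by one edit, grouped by edit kind."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    return {
        "delete": {left + right[1:] for left, right in splits if right},
        "insert": {left + c + right for left, right in splits for c in ALPHABET},
        "replace": {left + c + right[1:] for left, right in splits if right for c in ALPHABET},
        "transpose": {left + right[1] + right[0] + right[2:]
                      for left, right in splits if len(right) > 1},
    }


def suggest(word, keywords):
    """List (edit kind, keyword) pairs for keywords one edit away from `word`."""
    found = []
    for kind, candidates in single_edits(word).items():
        for keyword in sorted(keywords & candidates):
            found.append((kind, keyword))
    return found


KEYWORDS = {"char", "while"}  # assumed keyword set for the demo

for lexeme in ["charr", "cha", "whiel", "chrr"]:
    print(lexeme, suggest(lexeme, KEYWORDS))
```

A real lexer would normally restrict such repairs to cases where exactly one candidate is found, since several keywords may lie one edit away from the same lexeme.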
Problems with panic mode recovery
➢ The meaning of the program may be changed.
➢ The whole input may get deleted in the process.
➢ Error recovery must be accurate, precise, and fast, and must not lead to
an error cascade.
➢ Example:
int main()
{
int x y;
………..
}
Parsing
NEED AND ROLE OF SYNTAX ANALYZER (Parser)
1. The parser obtains a string of tokens from the lexical analyzer and
verifies that the string of token names can be generated by the
grammar for the source language.
2. We expect the parser to report any syntax errors in an intelligible
fashion and to recover from commonly occurring errors to continue
processing the remainder of the program.
How to describe language syntax
➢ Regular Expressions ??
➢ Context-free grammars
➢ Captures syntax structures of programming languages
➢ Cannot handle context-sensitive checks such as:
➢ whether variables are of types on which operations are allowed
➢ whether a variable has been declared before use
➢ whether a variable has been initialized
Syntax definition
Some important Terms
➢ Parse tree / syntax tree
➢ Leftmost derivation and Rightmost derivation
➢ Ambiguous grammars
➢ Dangling else problem
➢ stmt -> if expr then stmt | if expr then stmt else stmt
➢ String: if e1 then if e2 then s1 else s2
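The ambiguity can be checked mechanically. The sketch below enumerates every parse tree that the dangling-else grammar allows for the string above and finds two: one attaches the `else` to the outer `if`, the other to the inner `if`. The tuple representation of trees, the atom sets, and the names `parse_stmt` and `all_parses` are illustrative assumptions.

```python
# Hypothetical sketch: enumerate all parses of
#   stmt -> if expr then stmt | if expr then stmt else stmt | s1 | s2
# where expr is a single token (e1 or e2).
STATEMENT_ATOMS = {"s1", "s2"}   # assumed simple statements
EXPRESSIONS = {"e1", "e2"}       # assumed expressions


def parse_stmt(tokens, i):
    """Yield (tree, next_index) for every way to parse a stmt starting at index i."""
    if i >= len(tokens):
        return
    if tokens[i] in STATEMENT_ATOMS:
        yield tokens[i], i + 1
    elif (tokens[i] == "if" and i + 2 < len(tokens)
          and tokens[i + 1] in EXPRESSIONS and tokens[i + 2] == "then"):
        cond = tokens[i + 1]
        for then_tree, j in parse_stmt(tokens, i + 3):
            # Alternative 1: if expr then stmt
            yield ("if", cond, then_tree), j
            # Alternative 2: if expr then stmt else stmt
            if j < len(tokens) and tokens[j] == "else":
                for else_tree, k in parse_stmt(tokens, j + 1):
                    yield ("if", cond, then_tree, else_tree), k


def all_parses(tokens):
    """Return the trees that consume the whole token list."""
    return [tree for tree, end in parse_stmt(tokens, 0) if end == len(tokens)]


trees = all_parses("if e1 then if e2 then s1 else s2".split())
for tree in trees:
    print(tree)
```

Two distinct trees for one string is exactly what makes the grammar ambiguous. Most languages resolve it by matching each `else` with the closest unmatched `then`.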
Grammar transformations
• Since it is sometimes convenient to have a grammar that
satisfies a particular property for a language, we would like to
be able to transform grammars into other grammars that
generate the same language, but that possibly satisfy different
properties.
• Forms of grammar transformations:
• Removing unreachable variables.
• Left factoring.
• Removing left recursion.
Left factoring
• Left factoring is a grammar transformation that is applicable
when two or more productions for the same nonterminal
start with the same sequence of (terminal and/or
nonterminal) symbols.
A → xy | xz | v
may be transformed into
A → xA’ | v
A’ → y | z
where A’ is a new nonterminal.
In general, find the longest prefix common to two or more alternatives of a nonterminal and factor it out.
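A single left-factoring step can be sketched as follows. Alternatives are grouped by their first symbol, and for each group with at least two members the longest common prefix is pulled out into a fresh nonterminal. The representation (alternatives as tuples of symbols, the empty tuple standing for ε) and the names `common_prefix` and `left_factor_once` are assumptions made for illustration. The step may need to be repeated on the new nonterminals until no two alternatives share a prefix.

```python
# Hypothetical sketch of one left-factoring step.
def common_prefix(sequences):
    """Longest prefix shared by all the given symbol tuples."""
    prefix = []
    for column in zip(*sequences):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)


def left_factor_once(nonterminal, alternatives):
    """Factor out common prefixes among the alternatives of one nonterminal."""
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[:1], []).append(alt)  # group by first symbol

    result = {nonterminal: []}
    fresh = nonterminal + "'"
    for group in groups.values():
        if len(group) < 2:
            result[nonterminal].extend(group)       # nothing to factor
            continue
        prefix = common_prefix(group)
        result[nonterminal].append(prefix + (fresh,))
        # Remainders after the prefix; an empty tuple stands for epsilon.
        result[fresh] = [alt[len(prefix):] for alt in group]
        fresh += "'"
    return result


# A -> xy | xz | v  becomes  A -> xA' | v,  A' -> y | z
factored = left_factor_once("A", [("x", "y"), ("x", "z"), ("v",)])
print(factored)
```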
Example
Left factor the following grammar
1) S -> aSSbS | aSaSb | abb | b
2) S -> bSSaaS | bSSaSb | bSb | a
Left recursion
Left-recursive grammar:
A -> A α | β (the leftmost symbol of the RHS is the same as the nonterminal on the LHS)
[Figure: parse tree that grows down its left spine, A ⇒ A α ⇒ A α α ⇒ ... ⇒ β α ... α]
Since A calls A again without consuming any input, a top-down parser might enter an infinite loop.
The language generated by this grammar is βα*.
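The infinite loop is easy to reproduce. Below is a naive recursive-descent procedure written directly from A -> A α | β, with the terminals `a` and `b` standing in for α and β. The depth guard `MAX_DEPTH` and the function name `parse_A` are assumptions added so that the demo stops with an error instead of exhausting the call stack.

```python
# Hypothetical sketch: recursive descent applied naively to A -> A a | b.
MAX_DEPTH = 50  # assumed guard so the demo fails fast


def parse_A(tokens, pos, depth=0):
    """Try to parse an A starting at `pos`; return the end position or None."""
    if depth > MAX_DEPTH:
        raise RecursionError("A called itself without consuming any input")
    # Alternative A -> A a: the first action is to parse an A at the SAME position.
    end = parse_A(tokens, pos, depth + 1)
    if end is not None and end < len(tokens) and tokens[end] == "a":
        return end + 1
    # Alternative A -> b (never reached, because the call above never returns).
    if pos < len(tokens) and tokens[pos] == "b":
        return pos + 1
    return None


try:
    parse_A(list("baa"), 0)
    looped = False
except RecursionError as error:
    looped = True
    print("parser did not terminate:", error)
```

The input `baa` is in the language βα*, yet the procedure never reads a single token, which is why left recursion has to be removed before top-down parsing.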
Left Recursion Elimination
Left Recursion general case
To eliminate left recursion (general form), replace
A → Aα1 | Aα2 | ... | Aαm | β1 | β2 | ... | βn
with
A → β1A’ | β2A’ | ... | βnA’
A’ → α1A’ | α2A’ | ... | αmA’ | ε
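The general rule can be sketched in a few lines. Alternatives that start with A supply the α parts, all other alternatives are the β parts, and a fresh nonterminal A' is introduced. The tuple representation (the empty tuple stands for ε) and the name `eliminate_direct_left_recursion` are illustrative assumptions.

```python
# Hypothetical sketch of direct left-recursion elimination for one nonterminal.
def eliminate_direct_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn without left recursion."""
    alphas = [alt[1:] for alt in alternatives if alt[:1] == (nonterminal,)]
    betas = [alt for alt in alternatives if alt[:1] != (nonterminal,)]
    if not alphas:
        return {nonterminal: list(alternatives)}  # nothing to do
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + (fresh,) for beta in betas],
        # The trailing empty tuple is the epsilon alternative of A'.
        fresh: [alpha + (fresh,) for alpha in alphas] + [()],
    }


# E -> E + T | T  becomes  E -> T E',  E' -> + T E' | epsilon
rewritten = eliminate_direct_left_recursion("E", [("E", "+", "T"), ("T",)])
print(rewritten)
```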
Left Recursion
➢ There are three types of left recursion:
➢ direct (A → A x)
➢ indirect (A → B C, B → A)
➢ hidden (A → B A, B → ε)
Eliminating indirect Left Recursion
Consider the grammar:
S → Aa
A → Sb | c
This grammar has no direct left recursion, but it is
left recursive because S => Aa => Sba.
Replace S with Aa in the second production to rewrite the grammar as
S → Aa
A → Aab | c
The left recursion on A is now direct and can be eliminated as usual:
A → cA’
A’ → abA’ | ε
Eliminating indirect left recursion
Algorithm for eliminating indirect recursion
List the nonterminals in some order A1, A2, ..., An
for i = 1 to n
    for j = 1 to i-1
        if there is a production Ai → Aj γ, replace Aj with its rhs
    eliminate any direct left recursion on Ai

Example, ordering: S, E, T, F
Original grammar (unchanged by i=S):
S → E
E → E+T | T
T → E-T | F
F → E*F | id
i=E, eliminate direct left recursion on E:
E → TE'
E' → +TE' | ε
i=T, j=E, replace E in T → E-T:
T → TE'-T | F
i=T, eliminate direct left recursion on T:
T → FT'
T' → E'-TT' | ε
i=F, j=E, replace E in F → E*F:
F → TE'*F | id
i=F, j=T, replace T in F → TE'*F:
F → FT'E'*F | id
i=F, eliminate direct left recursion on F:
F → idF'
F' → T'E'*FF' | ε
Final grammar:
S → E
E → TE'
E' → +TE' | ε
T → FT'
T' → E'-TT' | ε
F → idF'
F' → T'E'*FF' | ε
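The algorithm and the worked example can be replayed with a short sketch. It assumes the grammar is stored as a dictionary from nonterminal to a list of alternatives (tuples of symbols, with the empty tuple for ε). The function names are hypothetical. As in the usual statement of the algorithm, the input grammar is assumed to have no cycles and no ε-productions.

```python
# Hypothetical sketch of the ordering-based left-recursion elimination.
def eliminate_direct(nonterminal, alternatives):
    """Remove direct left recursion on one nonterminal."""
    alphas = [alt[1:] for alt in alternatives if alt[:1] == (nonterminal,)]
    betas = [alt for alt in alternatives if alt[:1] != (nonterminal,)]
    if not alphas:
        return {nonterminal: list(alternatives)}
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + (fresh,) for beta in betas],
        fresh: [alpha + (fresh,) for alpha in alphas] + [()],  # () is epsilon
    }


def eliminate_left_recursion(grammar, order):
    """Apply the i/j loops over the nonterminals listed in `order`."""
    g = {nt: list(alts) for nt, alts in grammar.items()}
    for i, a_i in enumerate(order):
        for a_j in order[:i]:
            expanded = []
            for alt in g[a_i]:
                if alt[:1] == (a_j,):
                    # Ai -> Aj gamma: replace Aj with each of its right-hand sides.
                    expanded.extend(delta + alt[1:] for delta in g[a_j])
                else:
                    expanded.append(alt)
            g[a_i] = expanded
        g.update(eliminate_direct(a_i, g[a_i]))
    return g


grammar = {
    "S": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("E", "-", "T"), ("F",)],
    "F": [("E", "*", "F"), ("id",)],
}
result = eliminate_left_recursion(grammar, ["S", "E", "T", "F"])
for nt, alts in result.items():
    print(nt, "->", " | ".join(" ".join(alt) or "epsilon" for alt in alts))
```

A different ordering of the nonterminals generally produces a different, but equivalent, grammar.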
Parsing methods
➢ Top-Down parsing
➢ Construction of the parse tree starts at the root (from the start
symbol) and proceeds towards leaves (token or terminals)
➢ Can be viewed as finding a leftmost derivation for an input string.
➢ Bottom-up parsing
➢ Construction of the parse tree starts from the leaf nodes and
proceeds towards root (start symbol).
➢ The order of reductions is the reverse of a rightmost derivation.
Top down and bottom-up parsers
Grammar:
S -> aABe
A -> Abc | b
B -> d
Input string: abbcde
Top-Down parsing:
➢ Can be viewed as finding a leftmost derivation for
an input string.
➢ Main task is to decide which production to use
for deriving the string.
Bottom-up parsing:
➢ The order of reductions is the reverse of a rightmost derivation.
➢ Main task is to decide whether to shift or reduce.
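For the grammar and input above, the two views can be replayed side by side. The sketch below is not a parser. It only applies a hand-chosen sequence of productions (top-down) and reductions (bottom-up) and records the sentential forms. The helper names and the chosen sequences are assumptions, and replacing the leftmost occurrence of a right-hand side happens to pick the correct handle for this input but would not in general.

```python
# Hypothetical sketch: replay a leftmost derivation and a reduction sequence.
NONTERMINALS = "SAB"


def leftmost_derivation(start, chosen_rhs):
    """Expand the leftmost nonterminal with each chosen right-hand side in turn."""
    form, forms = start, [start]
    for rhs in chosen_rhs:
        index = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        form = form[:index] + rhs + form[index + 1:]
        forms.append(form)
    return forms


def reductions(string, steps):
    """Apply (rhs, lhs) reductions, each to the leftmost occurrence of rhs."""
    form, forms = string, [string]
    for rhs, lhs in steps:
        form = form.replace(rhs, lhs, 1)
        forms.append(form)
    return forms


top_down = leftmost_derivation("S", ["aABe", "Abc", "b", "d"])
bottom_up = reductions("abbcde", [("b", "A"), ("Abc", "A"), ("d", "B"), ("aABe", "S")])

print(" => ".join(top_down))   # leftmost derivation
print(" <= ".join(bottom_up))  # reductions; read right to left as a rightmost derivation
```

Reversing the bottom-up list gives S, aABe, aAde, aAbcde, abbcde, which is the rightmost derivation of the input.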
Types of parsers
Parsers
    Top Down
        Backtracking
        Predictive
            Recursive Descent
            LL(1)
    Bottom Up
        Operator Precedence
        LR
            LR(0)
            SLR(1)
            LALR(1)
            CLR(1)
Top-Down parsing
Top down parser
➢ Built from root to leaves.
➢ The derivation terminates when the required input string
terminates.
➢ Leftmost derivation matches this requirement.
➢ Main task is to find appropriate production rule in order
to produce the correct input string.
Example - 1
Grammar
# Production rule
1 S -> x P z
2 P -> yw | y x P z
Input string x y z
Sentential form: S
Step 1: The first input symbol matches the leftmost node, hence
advance the input string pointer.
Step 2: Match the next node P with the current character in the input string. It does
not match and P is a nonterminal, hence expand P (first alternative, y w).
Step 3: y matches, hence advance the input string pointer.
Step 4: w is a mismatch, hence backtrack and use the other
production of P.
Step 5: Matching is done for the entire string.
Left Recursion
➢ Bad news:
➢ Top-down parsers cannot handle left recursion
➢ Good news:
➢ We can systematically eliminate left recursion
Backtracking parser
➢ Tries different production rules to find a match for the
input string, backtracking each time a choice fails.
➢ Slower and requires exponential time in general.
➢ Hence not preferred in practical compilers.
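A minimal backtracking parser can be sketched with generators: each call yields every input position at which a sequence of grammar symbols could end, so a failed choice simply falls through to the next alternative. The grammar below is the S -> x P z, P -> yw | y x P z grammar used earlier in the lecture. The input strings, the function names `match` and `accepts`, and the one-character-per-token encoding are assumptions for illustration.

```python
# Hypothetical sketch of a backtracking top-down parser.
GRAMMAR = {
    "S": [["x", "P", "z"]],
    "P": [["y", "w"], ["y", "x", "P", "z"]],
}


def match(symbols, tokens, pos):
    """Yield every position where `symbols` can finish matching from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Nonterminal: try each production in order; exhausting one is a backtrack.
        for alternative in GRAMMAR[head]:
            for middle in match(alternative, tokens, pos):
                yield from match(rest, tokens, middle)
    elif pos < len(tokens) and tokens[pos] == head:
        # Terminal: must equal the current input symbol.
        yield from match(rest, tokens, pos + 1)


def accepts(text):
    """True if the whole input can be derived from S."""
    tokens = list(text)
    return any(end == len(tokens) for end in match(["S"], tokens, 0))


print(accepts("xywz"))     # uses P -> yw
print(accepts("xyxywzz"))  # P -> yw fails at the third symbol, P -> y x P z succeeds
```

Because every alternative may be retried at every position, the running time can grow exponentially with the input length. The sketch also terminates only because the grammar is not left recursive.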
Predictive parsing
➢ The goal of predictive parsing is to construct a top-down
parser that never backtracks.
➢ To do so, we must transform a grammar in two ways:
➢ eliminate left recursion, and
➢ perform left factoring.
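Once a grammar has no left recursion and is left factored, each nonterminal can be parsed by one procedure that picks its production by looking only at the next token. The sketch below does this for the assumed grammar E → T E', E' → + T E' | ε, T → id. The grammar, the `$` end marker, and the class and method names are illustrative choices, not taken from the slides.

```python
# Hypothetical sketch of a predictive (non-backtracking) recursive-descent parser.
class PredictiveParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected):
        """Consume one token or report a syntax error; never backtracks."""
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_E(self):        # E -> T E'
        self.parse_T()
        self.parse_E_prime()

    def parse_E_prime(self):  # E' -> + T E' | epsilon
        if self.peek() == "+":
            self.eat("+")
            self.parse_T()
            self.parse_E_prime()
        # Any other lookahead selects the epsilon production.

    def parse_T(self):        # T -> id
        self.eat("id")

    def parse(self):
        self.parse_E()
        self.eat("$")


def accepts(tokens):
    try:
        PredictiveParser(tokens).parse()
        return True
    except SyntaxError:
        return False


print(accepts(["id", "+", "id"]))
print(accepts(["id", "+"]))
```

Each decision uses a single token of lookahead, so the parser runs in time linear in the length of the input.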
Thank you