LRParsing LRParserGenerator

An LR(k) parser is a bottom-up parser that scans input from left to right and generates a right-most parse tree, requiring a lookahead of no more than k symbols. It utilizes a stack for handle recognition and employs a table-driven finite state machine to determine actions such as shift and reduce. The document also discusses the construction of parsing tables, the concept of viable prefixes, and various types of LR parsers including SLR and LALR parsers.

Uploaded by

rishi prasad
LR Parsing and LR Parser Generator
What is an LR(k) Parser
• An LR parser is a bottom-up parser
• It scans the input from left to right ('L' for left-to-right) and produces:
– a right-most derivation in reverse ('R' for right-most derivation)
– building the parse tree in a bottom-up fashion
• If the parser needs to look ahead by no more than
k symbols, we call it an LR(k) parser
An LR Parser

[Figure: block diagram of an LR parser. The driving routine is the same for ALL LR parsers; the parsing table differs from parser to parser.]
Handle
• Consider grammar
1. S → aABe
2. A → Abc
3. A→b
4. B→d
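• Consider the right-most derivation of abbcbcde (the number on each arrow is the rule applied)
– S =>1 aABe =>4 aAde =>2 aAbcde =>2 aAbcbcde =>3 abbcbcde
• Bottom-up reduction follows the above steps in the reverse order
– the right side that is replaced at each reduction step is the handle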
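The derivation above can be replayed mechanically. The following is a minimal sketch, assuming single-character symbols with upper-case letters as non-terminals; the helper names `rightmost_step` and `derive` are illustrative and do not come from the slides.

```python
# Grammar from the slide; the keys are the rule numbers used on the arrows.
RULES = {1: ("S", "aABe"), 2: ("A", "Abc"), 3: ("A", "b"), 4: ("B", "d")}

def rightmost_step(form, rule_no):
    """Expand the right-most non-terminal (upper-case letter) with the given rule."""
    lhs, rhs = RULES[rule_no]
    pos = max(i for i, ch in enumerate(form) if ch.isupper())
    if form[pos] != lhs:
        raise ValueError(f"rule {rule_no} does not apply to {form!r}")
    return form[:pos] + rhs + form[pos + 1:]

def derive(rule_sequence, start="S"):
    """Return every sentential form produced by applying the rules in order."""
    forms = [start]
    for rule_no in rule_sequence:
        forms.append(rightmost_step(forms[-1], rule_no))
    return forms

forms = derive([1, 4, 2, 2, 3])
print(" => ".join(forms))                   # top-down: the right-most derivation
print(" reduces to ".join(reversed(forms)))  # bottom-up: the order an LR parser follows
```

Reading the list backwards gives the sequence of reductions that a bottom-up parser performs on abbcbcde.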
LR Parsing Theory
• Deals with how the parser knows exactly what to do at a particular
instant
• For this, an LR parser uses a stack onto which it pushes symbols
• If the topmost symbols on the stack (t.o.s.) match the right side of some rule:
– these symbols are popped from the stack and the left side of the rule is pushed onto the
stack
– this operation is called a reduction
– for example, if the stack is aAbc (where a is at the bottom of the stack)
• the parser will reduce using the rule A → Abc
• it will pop Abc from the stack and push A
• the stack now becomes aA
• the sequence Abc is the handle at the time of reduction
• An LR parser correctly identifies a handle on the top of the stack
– and replaces this handle in the stack with the left side of the rule
• We may also say that the parser has pruned the handle in the stack
– to the left side of the rule
• Every LR parser proceeds by carrying out a series of handle-pruning steps
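The reduction step in the example can be sketched on a plain Python list. This is only an illustration of handle pruning: the function `reduce_if_handle` is a hypothetical helper, and a real LR parser does not compare stack contents against rules like this but consults its state machine instead.

```python
def reduce_if_handle(stack, lhs, rhs):
    """If the top of the stack spells rhs, pop it and push lhs.

    Returns True when a reduction happened, False otherwise.
    """
    n = len(rhs)
    if n and stack[-n:] == list(rhs):
        del stack[-n:]      # pop the handle
        stack.append(lhs)   # push the left side of the rule
        return True
    return False

stack = list("aAbc")                      # 'a' is at the bottom, 'c' on top
reduced = reduce_if_handle(stack, "A", "Abc")
print(reduced, "".join(stack))            # True aA
```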
LR Parsing Theory
• Another name for LR parsers is shift-reduce parsers
• There are two fundamental actions:
– shift the current input token onto the stack
• and read the next token
– reduce by some production rule
• Problem for the LR parser is when to shift and when to reduce
– if to reduce, by which rule
• Needs a recognizer for handles
– so that by scanning stack, can decide proper action
• Recognizer is actually a finite state machine (DFA)
– but language symbols include both terminals and non-terminals
LR Parsing Theory
• DFA corresponding to an LR parser can be table-driven
– However, it is slightly different from normal DFA
• Two parts of the DFA table, called the LR Parsing Table
– ACTION and GOTO
• For current state s and the current input a, ACTION[s][a]
can be:
– Shift-t: Push a and make t the next state
– Reduce by rule A → β
– Accept, signaling the end of successful parsing
– Error
• When the action is reduce:
– ACTION table does not tell what will be next state
– For that parser uses a different table GOTO
LR Parser Summary
• All LR parsers are shift-reduce parsers
• All LR parsers have identical driving routine
– the module of the parser that carries out shift or reduce
• The driving routine uses a stack to store a string of the form:
– s0 X1 s1 X2 s2 ... Xm sm
• sm is on the top of stack
• symbols s0, s1, … etc., are states (of the DFA)
• X1, X2, ... etc., are grammar symbols
– (terminals or non-terminals)
• Driving routine uses parsing table to decide current action
• Parsing table is a DFA state table split in two parts
– ACTION and GOTO.
• There are some utilities that may be used to automatically
generate the parsing table for a grammar
– provided the grammar is indeed an LR(k) grammar of appropriate k
Example LR Parser for ETF Grammar
Rules:

1. E -> E + T
2. E -> T
3. T -> T * F
4. T -> F
5. F -> (E)
6. F -> id
Driving Routine
push(0);
read_next_token();
for (;;)
{
    s = top();  /* current state is taken from top of stack */
    if (ACTION[s, current_token] == "s-i")       /* shift and go to state i */
    {
        push(current_token);
        push(i);
        read_next_token();
    }
    else if (ACTION[s, current_token] == "r-i")  /* reduce by rule i: X --> A1...An */
    {
        for (k = 0; k < 2 * n; k++)
            pop();             /* pop n grammar symbols and n states */
        s = top();             /* state exposed below the handle */
        push(X);               /* push the left-hand side of the chosen rule */
        push(GOTO[s, X]);      /* push state after reduction */
        /* OUTPUT RULE i */
    }
    else if (ACTION[s, current_token] == "succ")
        break;                 /* success: the input has been parsed */
    else
    {
        error();
        break;
    }
}
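A runnable version of the driving routine might look like the sketch below. The ACTION and GOTO tables are an assumption: they are the usual SLR table for the ETF grammar with one common state numbering (the slides do not reproduce the table in this section), and `row` and `parse` are illustrative names.

```python
# Rule number -> (left side, length of right side), numbered as in the ETF grammar.
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

def row(spec):
    """Turn 'id:s5 (:s4' into {'id': 's5', '(': 's4'}."""
    return dict(entry.split(":") for entry in spec.split())

# ASSUMED table: standard SLR table for this grammar; the state numbers are a choice.
ACTION = {
    0: row("id:s5 (:s4"),            1: row("+:s6 $:acc"),
    2: row("+:r2 *:s7 ):r2 $:r2"),   3: row("+:r4 *:r4 ):r4 $:r4"),
    4: row("id:s5 (:s4"),            5: row("+:r6 *:r6 ):r6 $:r6"),
    6: row("id:s5 (:s4"),            7: row("id:s5 (:s4"),
    8: row("+:s6 ):s11"),            9: row("+:r1 *:s7 ):r1 $:r1"),
    10: row("+:r3 *:r3 ):r3 $:r3"),  11: row("+:r5 *:r5 ):r5 $:r5"),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def parse(tokens):
    """Return the rule numbers in the order they are reduced; raise SyntaxError on error."""
    tokens = list(tokens) + ["$"]
    stack, pos, output = [0], 0, []
    while True:
        state = stack[-1]                       # current state is on top of the stack
        act = ACTION[state].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {state}")
        if act == "acc":
            return output
        if act[0] == "s":                       # shift: push the token, then the new state
            stack += [tokens[pos], int(act[1:])]
            pos += 1
        else:                                   # reduce by rule number act[1:]
            rule = int(act[1:])
            lhs, n = RULES[rule]
            del stack[-2 * n:]                  # pop n symbols and n states
            stack += [lhs, GOTO[stack[-1]][lhs]]
            output.append(rule)

print(parse("id + id * id".split()))            # [6, 4, 2, 6, 4, 6, 3, 1]
```

The printed list is the right-most derivation of the input in reverse: read from last to first it starts with rule 1 (E -> E + T).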
Parser Walk-through
• We will carry out LR parsing of the input:
– (a + b) * (c * (d + e))
– i.e., (id + id) * (id * (id + id))
• A configuration of an LR parser is the current
stack and the current unseen input:
– (s0 X1 s1 X2 s2 ... Xm sm, ai ai+1 ... an $)
• The initial configuration is:
– (0, (id + id) * (id * (id + id)) $)
Some Observations
• An LR parser shifts until the top of the stack is known to
contain the handle
• When the top of the stack is known to contain the
handle, the LR parser reduces
• So, the key aspect of an LR parser is to give it
the knowledge of when a handle is on the
top of the stack
• This is done using a DFA of viable prefixes
Viable Prefix
• A viable prefix is a prefix of a right-sentential form that
does NOT continue past the end of the rightmost handle of
that sentential form
• Consider the last sentential form of the right-most derivation
– E => T => T * F => T * (E) => T * (E + T)
• The handle of T * (E + T) is E + T (highlighted on the original slide)
• Hence the possible viable prefixes are:
– T
– T *
– T * (
– T * (E
– T * (E +
– T * (E + T
• But NOT T * (E + T), which continues past the end of the handle
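The list above can be reproduced by cutting the sentential form at every position up to the end of the handle. This is a small sketch with a hypothetical helper `viable_prefixes`; the position of the handle is supplied by hand, and the empty prefix is left out to match the slide.

```python
def viable_prefixes(sentential_form, handle_end):
    """Non-empty prefixes that stop at or before the end of the handle.

    handle_end is the index just past the last symbol of the handle.
    """
    return [sentential_form[:k] for k in range(1, handle_end + 1)]

form = ["T", "*", "(", "E", "+", "T", ")"]
handle_end = 6                      # the handle E + T occupies form[3:6]
prefixes = viable_prefixes(form, handle_end)
for prefix in prefixes:
    print(" ".join(prefix))
```

The full form, including the closing parenthesis, is not printed because it extends past the handle.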
Simple LR or SLR Parser
• is a type of LR parser with small parse tables
and a relatively simple parser generator
algorithm
• quite efficient at finding the single correct
bottom-up parse in a single left-to-right scan
over the input stream, without guesswork or
backtracking
• mechanically generated from a formal
grammar for the language
LR(0) Item or “item”
• A Grammar Rule with a dot (.) at some position
on right hand side
• For a rule with |r.h.s| = n, there are n + 1 items
• Example:
– For rule E → E + T, the items are:
a) E→.E+T
b) E→E.+T
c) E→E+.T
d) E→E+T.
• An item loosely signifies:
– how much of the rule has been seen by the parser
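Generating the n + 1 items of a rule is a one-line loop over the dot positions. The sketch below represents an item as a (left side, right side, dot position) tuple, which is an assumed encoding chosen for illustration; `lr0_items` and `show` are hypothetical names.

```python
def lr0_items(lhs, rhs):
    """All LR(0) items of the rule lhs -> rhs, one per dot position 0..len(rhs)."""
    return [(lhs, tuple(rhs), dot) for dot in range(len(rhs) + 1)]

def show(item):
    """Render an item with the dot written as '.'."""
    lhs, rhs, dot = item
    parts = list(rhs[:dot]) + ["."] + list(rhs[dot:])
    return f"{lhs} -> {' '.join(parts)}"

items = lr0_items("E", ["E", "+", "T"])
for item in items:
    print(show(item))
```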
Augmented Grammar
• If S is the start-symbol of a grammar
• Augment the grammar with ONE extra rule
• S’ → S
• Hence the item S’ → . S means parsing has not
started
• And item S’→ S. means parsing is over (with
success)
Closure of a Set of Items
• Definition: For any set of items I, CLOSURE(I)
is formed as follows:
– Initialize CLOSURE(I) = I
– If A → α · B β is in CLOSURE(I) and B → γ is a
production, then add B → · γ to the closure and
repeat
• Why
– if the parser expects to see B next, it must also expect anything that B can derive:
S =>* δ α B β φ =>* δ α γ β φ
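The definition translates into a worklist loop. In this sketch an item is a (rule index, dot position) pair into a grammar list; that encoding and the name `closure` are assumptions made for the example, and the grammar is the ETF grammar augmented with E' → E as rule 0.

```python
GRAMMAR = [                      # index 0 is the added rule E' -> E
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]

def closure(items):
    """CLOSURE of a set of (rule_index, dot_position) items."""
    result = set(items)
    work = list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs):                       # there is a symbol B after the dot
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (j, 0) not in result:
                    result.add((j, 0))           # add B -> . gamma
                    work.append((j, 0))          # and repeat for the new item
    return frozenset(result)

start_set = closure({(0, 0)})                    # closure of { E' -> . E }
print(sorted(start_set))
```

Starting from the single item E' → . E, the loop pulls in every rule of the grammar with the dot at the far left, seven items in total.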
The Set of Items I0
• From Augmented Grammar (having used
S’→ S):
– I0 = CLOSURE({S’ → . S})
• Thus, in the ETF grammar,
I0 = { E’ → . E,
E → . E + T,
E → . T,
T → . T * F,
T → . F,
F → . ( E ),
F → . id }
GOTO(I,X)
• Definition: If I is a set of items and X is a
grammar symbol, then GOTO(I,X) is the
CLOSURE of the set of items A→α X . β
where A→α . X β is in I
• Example:
– GOTO(I0,E) = {E’ → E ., E → E . + T } = I1 (say)
– GOTO(I0,() = {F → (. E ), E → . E + T, E → .T,
T → . T * F, T → . F, F → . (E ), F → . id }
= I4 (as we will see later)
• Make X a state transition from state I to state J, where J = GOTO(I, X):
I --X--> J
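GOTO is a dot move followed by a closure. The sketch below again encodes an item as a (rule index, dot position) pair, which is an assumption for illustration, and checks the two examples from the slide.

```python
GRAMMAR = [                      # augmented ETF grammar, rule 0 is E' -> E
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]

def closure(items):
    """CLOSURE of a set of (rule_index, dot_position) items."""
    result, work = set(items), list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs):
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (j, 0) not in result:
                    result.add((j, 0))
                    work.append((j, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close the result."""
    moved = {(rule, dot + 1) for rule, dot in items
             if dot < len(GRAMMAR[rule][1]) and GRAMMAR[rule][1][dot] == symbol}
    return closure(moved)

I0 = closure({(0, 0)})
I1 = goto(I0, "E")       # { E' -> E . ,  E -> E . + T }
I4 = goto(I0, "(")       # F -> ( . E ) plus every rule with the dot at the left
print(sorted(I1), len(I4))
```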
A Canonical Collection of LR(0) Items
• Start with Augmented Grammar and then I0
• Create DFA
• Make I0 the Initial State of the DFA

• Theorem of LR Parsing (without proof):


– ANY string traced out by travelling from I0 to any
other state traces out a viable prefix of the grammar
– i.e., the set of all viable prefixes of all the right
sentential forms of a grammar is a regular language
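The canonical collection can be built by applying GOTO repeatedly until no new item set appears. This is a sketch under the same assumed item encoding, (rule index, dot position); the state numbers it produces depend on the order in which symbols are tried, so they need not match the I0, I1, ... numbering used on the slides. The helper `is_viable_prefix` walks the resulting DFA to illustrate the theorem.

```python
GRAMMAR = [                      # augmented ETF grammar, rule 0 is E' -> E
    ("E'", ("E",)),
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]

def closure(items):
    result, work = set(items), list(items)
    while work:
        rule, dot = work.pop()
        rhs = GRAMMAR[rule][1]
        if dot < len(rhs):
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (j, 0) not in result:
                    result.add((j, 0))
                    work.append((j, 0))
    return frozenset(result)

def goto(items, symbol):
    return closure({(r, d + 1) for r, d in items
                    if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == symbol})

def canonical_collection():
    """Return (list of item sets, transition map {(state, symbol): state})."""
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    states, transitions, i = [closure({(0, 0)})], {}, 0
    while i < len(states):                   # the list grows as new sets are found
        for x in symbols:
            target = goto(states[i], x)
            if target:                       # an empty set means no transition on x
                if target not in states:
                    states.append(target)
                transitions[(i, x)] = states.index(target)
        i += 1
    return states, transitions

def is_viable_prefix(symbols, transitions):
    """True if the DFA can follow the whole symbol sequence from the initial state."""
    state = 0
    for x in symbols:
        if (state, x) not in transitions:
            return False
        state = transitions[(state, x)]
    return True

states, transitions = canonical_collection()
print(len(states), "states,", len(transitions), "transitions")
print(is_viable_prefix("T * ( E + T".split(), transitions))
print(is_viable_prefix("T * ( E + T )".split(), transitions))
```

The last two lines tie back to the viable-prefix example: the DFA accepts T * ( E + T but has no transition on the closing parenthesis from the state it reaches.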
Canonical Collection for ETF Grammar
[Figure: DFA over the canonical collection of LR(0) item sets for the ETF grammar.]
Construction of SLR Parsing Table
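• If GOTO(I, a) = J for a terminal a: ACTION[State-I, a] = shift-J
• If GOTO(I, A) = J for a non-terminal A: GOTO[State-I, A] = J
• If A → α . is in I (A ≠ S’): ACTION[State-I, a] = reduce A → α, for ALL a in FOLLOW(A)
• If S’ → S . is in I: ACTION[State-I, $] = acc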
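The four construction rules can be put together into a small table builder. The following is a sketch, not a full parser generator: it assumes a grammar without empty productions (true for the ETF grammar), encodes items as (rule index, dot position) pairs, and uses illustrative names such as `follow_sets` and `build_slr_table`. State numbers depend on the order of discovery and may differ from the slides.

```python
GRAMMAR = [("E'", ("E",)),
           ("E", ("E", "+", "T")), ("E", ("T",)),
           ("T", ("T", "*", "F")), ("T", ("F",)),
           ("F", ("(", "E", ")")), ("F", ("id",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    result, work = set(items), list(items)
    while work:
        r, d = work.pop()
        rhs = GRAMMAR[r][1]
        if d < len(rhs):
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[d] and (j, 0) not in result:
                    result.add((j, 0))
                    work.append((j, 0))
    return frozenset(result)

def goto(items, x):
    return closure({(r, d + 1) for r, d in items
                    if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == x})

def canonical_collection():
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    states, trans, i = [closure({(0, 0)})], {}, 0
    while i < len(states):
        for x in symbols:
            target = goto(states[i], x)
            if target:
                if target not in states:
                    states.append(target)
                trans[(i, x)] = states.index(target)
        i += 1
    return states, trans

def follow_sets():
    """FOLLOW sets by fixed-point iteration; assumes no empty productions."""
    first = {n: set() for n in NONTERMS}
    follow = {n: set() for n in NONTERMS}
    follow[GRAMMAR[0][0]].add("$")
    def first_of(sym):
        return first[sym] if sym in NONTERMS else {sym}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            updates = [(first[lhs], first_of(rhs[0]))]
            for k, sym in enumerate(rhs):
                if sym in NONTERMS:
                    src = first_of(rhs[k + 1]) if k + 1 < len(rhs) else follow[lhs]
                    updates.append((follow[sym], src))
            for target, src in updates:
                if not src <= target:
                    target |= src            # in-place update of the stored set
                    changed = True
    return follow

def build_slr_table():
    states, trans = canonical_collection()
    follow = follow_sets()
    action, goto_table, conflicts = {}, {}, []
    def put(i, a, entry):
        if action.setdefault((i, a), entry) != entry:
            conflicts.append((i, a))         # two different actions in one cell
    for (i, x), j in trans.items():
        if x in NONTERMS:
            goto_table[(i, x)] = j
        else:
            put(i, x, ("shift", j))
    for i, state in enumerate(states):
        for r, d in state:
            lhs, rhs = GRAMMAR[r]
            if d == len(rhs):                # complete item A -> alpha .
                if r == 0:
                    put(i, "$", ("accept",))
                else:
                    for a in follow[lhs]:
                        put(i, a, ("reduce", r))
    return action, goto_table, conflicts

action, goto_table, conflicts = build_slr_table()
print(len(action), "ACTION entries,", len(goto_table), "GOTO entries, conflicts:", conflicts)
```

An empty conflict list means every cell received at most one action; cells that received none are the error entries.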


Summary of Process
• Form Augmented Grammar
• Generate Canonical Collection of LR(0) items
• Construct SLR Parsing Table with above
• Fill up blank entries with appropriate error
messages

• Grammar is not SLR if any entry holds more than one action (a conflict)


Are ALL LR Grammars SLR?
• NO
Conflicts
• Shift-reduce and reduce-reduce
• Example Grammar:
– S→L=R|R
– L → * R | id
– R→L
• We will have I2 = { S → L . = R, R → L . }
– Remember ‘=’ is in FOLLOW(R)
– ACTION[2, ‘=’] => Shift (from S → L . = R)
– ACTION[2, ‘=’] => Reduce(R → L) (from R → L .)
– so the SLR table has a shift-reduce conflict in state 2
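The conflict can be checked by computing FOLLOW(R) and asking which SLR actions the two items of I2 request on ‘=’. This is a sketch that assumes no empty productions; the item set I2 is written out by hand as (rule index, dot position) pairs, and `follow_sets` and `actions_on` are illustrative names.

```python
GRAMMAR = [("S'", ("S",)),
           ("S", ("L", "=", "R")), ("S", ("R",)),
           ("L", ("*", "R")), ("L", ("id",)),
           ("R", ("L",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def follow_sets():
    """FOLLOW sets by fixed-point iteration; assumes no empty productions."""
    first = {n: set() for n in NONTERMS}
    follow = {n: set() for n in NONTERMS}
    follow[GRAMMAR[0][0]].add("$")
    def first_of(sym):
        return first[sym] if sym in NONTERMS else {sym}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            updates = [(first[lhs], first_of(rhs[0]))]
            for k, sym in enumerate(rhs):
                if sym in NONTERMS:
                    src = first_of(rhs[k + 1]) if k + 1 < len(rhs) else follow[lhs]
                    updates.append((follow[sym], src))
            for target, src in updates:
                if not src <= target:
                    target |= src
                    changed = True
    return follow

def actions_on(state, terminal, follow):
    """SLR actions that the items of `state` request for `terminal`."""
    found = set()
    for r, d in state:
        lhs, rhs = GRAMMAR[r]
        if d < len(rhs) and rhs[d] == terminal:
            found.add("shift")
        elif d == len(rhs) and terminal in follow[lhs]:
            found.add(f"reduce {lhs} -> {' '.join(rhs)}")
    return found

follow = follow_sets()
I2 = {(1, 1), (5, 1)}            # S -> L . = R   and   R -> L .
print(sorted(follow["R"]))
print(sorted(actions_on(I2, "=", follow)))
```

Two actions for the same cell is exactly the duplicate entry that makes the grammar non-SLR.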


Lookaheads → LR(1) Items
• Sometimes conflicts can be avoided with ‘lookaheads’
• LR(1) item: [A → α · β , a], where a is a terminal or ‘$’
• It is valid for a viable prefix γ if there is a right-most derivation:
– S =>* δ A φ => δ α β φ, where
• γ = δ α, and
• a is either the first symbol of φ, or φ is empty and a is ‘$’
• CLOSURE and GOTO are appropriately defined
• If [A → α · , a] (A ≠ S’) is in state Ii:
– ACTION[i, a] = reduce(A → α)
Example
• Augmented Grammar:
– S’→ S
– S→CC
– C→cC|d
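For this grammar the LR(1) closure can be sketched as follows. The slides only say that CLOSURE is "appropriately defined"; the version here uses the textbook rule (for [A → α · B β, a], add [B → · γ, b] for every b in FIRST(βa)) and assumes a grammar without empty productions, so FIRST(βa) is FIRST of the first symbol of β, or {a} when β is empty. Items are (rule index, dot position, lookahead) triples, an encoding chosen for illustration.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def first_of(symbol):
    """FIRST of one symbol; valid because the grammar has no empty productions."""
    if symbol not in NONTERMS:
        return {symbol}
    result, seen, todo = set(), set(), [symbol]
    while todo:
        n = todo.pop()
        if n in seen:
            continue
        seen.add(n)
        for lhs, rhs in GRAMMAR:
            if lhs == n:
                if rhs[0] in NONTERMS:
                    todo.append(rhs[0])
                else:
                    result.add(rhs[0])
    return result

def closure1(items):
    """CLOSURE for LR(1) items (rule_index, dot_position, lookahead)."""
    result, work = set(items), list(items)
    while work:
        r, d, a = work.pop()
        rhs = GRAMMAR[r][1]
        if d < len(rhs) and rhs[d] in NONTERMS:
            # lookaheads come from what follows B, or from `a` if nothing follows
            lookaheads = first_of(rhs[d + 1]) if d + 1 < len(rhs) else {a}
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[d]:
                    for b in lookaheads:
                        if (j, 0, b) not in result:
                            result.add((j, 0, b))
                            work.append((j, 0, b))
    return frozenset(result)

I0 = closure1({(0, 0, "$")})     # closure of [S' -> . S, $]
print(sorted(I0))
```

The result contains [S → · C C, $] and the C rules with lookaheads c and d, because the second C in S → C C can only start with c or d.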
LALR parsing Table
• Merge states with the same core (the same items, ignoring lookaheads). For example, we merge I3 and
I6:
– I36 = {
[C → c · C, c/d/$],
[C → · c C, c/d/$],
[C → · d, c/d/$]
}
• Merge the rows of the merged states in the parsing
table
• Same number of states as in the SLR table
• So, a smaller table than canonical LR(1), but with lookaheads built in
• Can also incorporate precedence and
associativity
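Merging can be sketched by building the LR(1) item sets for the example grammar and grouping them by core. The code assumes a grammar without empty productions and encodes items as (rule index, dot position, lookahead) triples; all function names are illustrative, and this is only the state-merging step, not a complete LALR table generator.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def first_of(symbol):
    """FIRST of one symbol; valid because the grammar has no empty productions."""
    if symbol not in NONTERMS:
        return {symbol}
    result, seen, todo = set(), set(), [symbol]
    while todo:
        n = todo.pop()
        if n not in seen:
            seen.add(n)
            for lhs, rhs in GRAMMAR:
                if lhs == n:
                    (todo.append if rhs[0] in NONTERMS else result.add)(rhs[0])
    return result

def closure1(items):
    result, work = set(items), list(items)
    while work:
        r, d, a = work.pop()
        rhs = GRAMMAR[r][1]
        if d < len(rhs) and rhs[d] in NONTERMS:
            lookaheads = first_of(rhs[d + 1]) if d + 1 < len(rhs) else {a}
            for j, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[d]:
                    for b in lookaheads:
                        if (j, 0, b) not in result:
                            result.add((j, 0, b))
                            work.append((j, 0, b))
    return frozenset(result)

def goto1(items, x):
    return closure1({(r, d + 1, a) for r, d, a in items
                     if d < len(GRAMMAR[r][1]) and GRAMMAR[r][1][d] == x})

def lr1_collection():
    symbols = sorted({s for _, rhs in GRAMMAR for s in rhs})
    states, i = [closure1({(0, 0, "$")})], 0
    while i < len(states):
        for x in symbols:
            target = goto1(states[i], x)
            if target and target not in states:
                states.append(target)
        i += 1
    return states

def merge_by_core(states):
    """Union all LR(1) states whose items agree once lookaheads are ignored."""
    merged = {}
    for state in states:
        core = frozenset((r, d) for r, d, _ in state)
        merged.setdefault(core, set()).update(state)
    return list(merged.values())

lr1_states = lr1_collection()
lalr_states = merge_by_core(lr1_states)
print(len(lr1_states), "LR(1) states ->", len(lalr_states), "LALR states")
```

One of the merged sets corresponds to I36 above: three cores, each carrying the lookaheads c, d and $.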
Using Ambiguous Grammars
• The ‘E’ grammar:
– E → E + E | E * E | ( E ) | id

• Fewer states
• Many conflicts likely
• Impose precedence
and associativity
externally
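One common way to impose precedence and associativity externally is the convention used by yacc-style generators: on a shift-reduce conflict, compare the precedence of the rule's operator with that of the lookahead token. The sketch below assumes that * binds tighter than + and that both are left-associative; the table `PRECEDENCE` and the function `resolve` are illustrative and are not taken from the slides.

```python
# ASSUMED declarations: higher level binds tighter; both operators left-associative.
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}

def resolve(rule_operator, lookahead):
    """Resolve a shift-reduce conflict for a completed E -> E op E with `lookahead` next."""
    rule_level, assoc = PRECEDENCE[rule_operator]
    look_level, _ = PRECEDENCE[lookahead]
    if rule_level > look_level:
        return "reduce"          # the rule's operator binds tighter
    if rule_level < look_level:
        return "shift"           # the incoming operator binds tighter
    return "reduce" if assoc == "left" else "shift"

print(resolve("+", "*"))   # shift:  in id + id * id, the * is grouped first
print(resolve("*", "+"))   # reduce: in id * id + id, the * is grouped first
print(resolve("+", "+"))   # reduce: left associativity groups id + id + id from the left
```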
