Lecture 3: Context-free Language, Grammar and
Push Down Automata
CSM 3125: Theory of Computation
Level 3 Semester 1, 2024
Bioinformatics Engineering
Md. Saif Uddin
Lecturer
Department of Computer Science and Mathematics
Bangladesh Agricultural University
[email protected]
Theory of Computation Lecture 3: CFL, CFG, and PDA
Context-Free Languages
• Context-Free Grammar → Context-Free Languages → Pushdown Automata
• A Context-Free Language (CFL) is a set of strings that can be
generated by a context-free grammar (CFG).
• CFLs are recognized by pushdown automata: finite automata extended
with a stack that remembers what has been read so far.
• CFLs are more powerful than regular languages and can describe
nested structures like balanced parentheses, which regular languages
cannot handle.
• Examples:
▪ L = { aⁿbⁿ | n ≥ 0 }.
▪ L = { w ∈ {a,b}* | w is a palindrome }
Neither language can be recognized by a finite automaton, but both can be
recognized by a PDA using a stack.
Context-Free Languages
Why named as “Context Free”?
• The production rules in a context-free grammar can be applied
regardless of the "context" surrounding the non-terminal symbol being
rewritten.
• This contrasts with context-sensitive grammars, where the replacement
of a non-terminal can depend on its surrounding symbols.
• Example: If you have a rule like
A → aB
it can be applied wherever A appears, whether it's in the string xAy or zAw.
The x, y, z, and w symbols don't affect the rule's application.
• In contrast, context-sensitive grammars have rules where the
replacement of a non-terminal can depend on the surrounding
symbols. For example, a rule might be xAy → xBBy, meaning A can only
be replaced by BB when it's surrounded by x and y.
Context-Free Languages
Real-Life Examples of CFLs
CFLs are useful because they can describe many real-world structures,
such as:
Language Description | Example Strings
Balanced parentheses | (), ((())), (()())
Arithmetic expressions | a+a, a+a*a, a+(a*a)
Simple HTML/XML tags | <b>text</b>
Palindromes (even length) | abba, ccddcc
Matching number of a's and b's (aⁿbⁿ) | aabb, aaabbb
Context-Free Languages
Regular Languages (RL) vs. Context-Free Languages (CFL)
Feature | Regular Languages (RL) | Context-Free Languages (CFL)
Defined by | Regular Expressions / Finite Automata | Context-Free Grammars (CFG) / Pushdown Automata (PDA)
Machine Recognizer | Finite Automaton (FA) | Pushdown Automaton (PDA) (FA + stack)
Memory | No auxiliary memory (states only) | Stack memory (LIFO)
Can count and match symbols? | No (only patterns like "a*" or "ab*") | Yes (e.g., equal number of a's and b's)
Examples | a*b*, (ab)*, 0110* | aⁿbⁿ, balanced parentheses (()())
Closure under operations | Union, Intersection, Complement, Concatenation, Star | Union, Concatenation, Star; not closed under Intersection and Complement
Common use | Tokenization in compilers, lexical analysis | Parsing expressions, programming languages
Push Down Automata
• A pushdown automaton (PDA) is a way to implement a context-free
grammar, in the same way that a DFA is designed for a regular grammar.
• It is more powerful than a Finite Automaton.
• PDA = FA + 1 Stack
• Key Components:
▪ Finite Control Unit: Similar to a finite automaton, it has a finite number of
states and transitions between them based on input and stack
operations.
▪ Input Tape: A read-only tape containing the input string to be processed.
▪ Stack: An auxiliary memory with a LIFO (Last-In, First-Out) structure,
allowing for pushing and popping of symbols
Push Down Automata
DPDA (Deterministic Pushdown Automata) | NPDA (Non-deterministic Pushdown Automata)
It is less powerful than NPDA. | It is more powerful than DPDA.
Example: A DPDA can be constructed for palindromes with a centre marker, { wcwᴿ }, but not for unmarked palindromes such as { wwᴿ }. | Example: An NPDA can be constructed for both even-length and odd-length palindromes.
Every DPDA is already an NPDA, so the conversion is always possible. | It is not possible to convert every NPDA to an equivalent DPDA.
The class of languages accepted by DPDAs is a proper subset of the class accepted by NPDAs. | The class of languages accepted by NPDAs is not a subset of the class accepted by DPDAs.
At most one move is possible for each combination of state, input symbol and stack top. | More than one move may be possible for the same combination of state, input symbol and stack top.
Push Down Automata
Formal Definition of Pushdown Automata
A Pushdown Automaton (PDA) can be defined as a 7-tuple:
P = (Q, Σ, Γ, q0, z0, F, δ)
• Q is the set of states
• Σ is the set of input symbols
• Γ is the set of pushdown symbols (which can be pushed onto and popped
from the stack)
• q0 is the initial state
• z0 is the initial pushdown symbol (which is initially present in the stack)
• F is the set of final states
• δ is a transition function that maps Q × (Σ ∪ {ε}) × Γ into finite subsets
of Q × Γ*. In a given state, the PDA reads the input symbol (or ε) and the
symbol on top of the stack, moves to a new state, and replaces the top of
the stack.
Push Down Automata
The output of δ(q, a, X) is a finite set of pairs (p, γ), where:
p is the new state and γ is a string of stack symbols that replaces X at the
top of the stack
e.g. If γ = ε then the stack is popped
If γ = X then the stack is unchanged
If γ = YZ then X is replaced by Z and Y is pushed onto the stack
Instantaneous Description (ID)
Instantaneous Description (ID) is an informal notation of how a PDA
computes an input string and makes a decision whether that string is
accepted or rejected.
An ID is a triple (q, w, α), where:
1. q is the current state.
2. w is the remaining input.
3. α is the stack contents, top at the left.
Push Down Automata
Operations of PDA:
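(In each case the input is "a b b" and the machine moves from q1 to q2 on reading a.)
PUSH: edge label (a, z0 / az0); δ(q1, a, z0) = (q2, az0). Stack before: z0; stack after: a z0.
POP: edge label (a, a / ε); δ(q1, a, a) = (q2, ε). Stack before: a z0; stack after: z0.
SKIP: edge label (a, z0 / z0); δ(q1, a, z0) = (q2, z0). Stack before: z0; stack after: z0.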
Push Down Automata
Example: Define the pushdown automaton for the language { aⁿbⁿ | n > 0 }.
Solution: M has Q = { q0, q1 }, Σ = { a, b }, Γ = { A, Z }, start state q0,
initial stack symbol Z, and δ given by:
δ(q0, a, Z) = (q0, AZ)
δ(q0, a, A) = (q0, AA)
δ(q0, b, A) = (q1, ε)
δ(q1, b, A) = (q1, ε)
δ(q1, ε, Z) = (q1, ε)
The last move pops Z once every A has been matched, so M accepts by empty stack.
[State diagram omitted]
Push Down Automata
Let us see how this automaton works for aaabbb.
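The sequence of instantaneous descriptions for aaabbb can be reproduced with a small simulation. The following is a minimal sketch in Python, assuming acceptance by empty stack and a deterministic machine; the names `DELTA` and `run_pda` are illustrative and not part of the lecture.

```python
# Transition table of the example PDA.
# Key: (state, input symbol or "" for ε, stack top).
# Value: (new state, string that replaces the stack top; "" means pop).
DELTA = {
    ("q0", "a", "Z"): ("q0", "AZ"),
    ("q0", "a", "A"): ("q0", "AA"),
    ("q0", "b", "A"): ("q1", ""),
    ("q1", "b", "A"): ("q1", ""),
    ("q1", "", "Z"): ("q1", ""),
}


def run_pda(word, start="q0", bottom="Z"):
    """Return (accepted, list of IDs). Acceptance by empty stack is assumed."""
    state, rest, stack = start, word, bottom  # stack top is the leftmost character
    ids = [(state, rest, stack)]
    while stack:
        top = stack[0]
        if rest and (state, rest[0], top) in DELTA:
            state, push = DELTA[(state, rest[0], top)]
            rest = rest[1:]
        elif (state, "", top) in DELTA:
            state, push = DELTA[(state, "", top)]
        else:
            break  # no applicable move: the machine is stuck
        stack = push + stack[1:]
        ids.append((state, rest, stack))
    return (rest == "" and stack == ""), ids


accepted, trace = run_pda("aaabbb")
for q, w, s in trace:
    print((q, w or "ε", s or "ε"))
print("accepted:", accepted)
```

Each printed triple is an ID (q, w, α) with the stack top at the left; the run ends in (q1, ε, ε), so aaabbb is accepted.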
Push Down Automata
Examples:
• L = { aⁿb²ⁿ | n ≥ 1 }
• L = { aⁿbⁿcᵐ | n, m ≥ 1 }
• L = { aⁿbᵐcⁿ | n, m ≥ 1 }
• L = { aⁿ⁺ᵐbⁿcᵐ | n, m ≥ 1 }
• L = { aⁿbⁿ⁺ᵐcᵐ | n, m ≥ 1 }
• L = { aⁿbᵐcⁿ⁺ᵐ | n, m ≥ 1 }
• L = { w ∈ {0, 1}* | w has equally many 0s and 1s }
• L = { wcwᴿ | w ∈ {a, b}* }
• L = { wwᴿ | w ∈ {a, b}⁺ } (NPDA)
Context-Free Grammar
• A context-free grammar (CFG) is a formal system used to describe a
class of languages known as context-free languages (CFLs).
• The purpose of a context-free grammar is:
✓ To generate all strings in a language using a set of rules (production
rules).
✓ To extend the capabilities of regular expressions and finite
automata.
• A context-free grammar defines a language by specifying a set of rules
for generating its valid sentences.
• A production rule can be applied without regard to the context in
which the non-terminal being rewritten appears.
Context-Free Grammar
Formal Definition:
A Context-Free Grammar is defined by a 4-tuple
G = (V, Σ, P, S)
Where,
V = Set of Variables or Non-Terminal symbols
Σ (also written T) = Set of Terminal symbols
P = Set of Production Rules
S = Start symbol (a distinguished non-terminal symbol)
A Context-Free Grammar has production rules of the form
A → α
where A ∈ V and α ∈ (V ∪ Σ)*
Context-Free Grammar
❑ Example:
To generate the language of strings with equal numbers of a's and b's in
the form aⁿbⁿ (n ≥ 1), the context-free grammar is defined as
G = (V, Σ, P, S)
Where,
V = {S, A}
Σ = {a, b}
Start symbol = S
P =
{ S → aAb ,
A → aAb | ε }
Derivation of Strings from a Grammar
The set of all strings that can be derived from a grammar is said to be the
Language generated from that Grammar.
❑ Example: Consider the grammar
G = ( {S, A}, {a, b}, {S → aAb, A → aAb | ε}, S )
Derivation for “a³b³”:
S ⇒ aAb (by applying rule S → aAb)
⇒ aaAbb (by applying rule A → aAb)
⇒ aaaAbbb (by applying rule A → aAb)
⇒ aaabbb (by applying rule A → ε)
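The derivation above can be replayed mechanically. The following is a minimal sketch in Python, assuming single-character symbols and the empty string standing for ε; the helper `derive` is a hypothetical name used only for illustration.

```python
def derive(start, steps):
    """Apply each (variable, replacement) pair to the leftmost occurrence of the variable."""
    form = start
    forms = [form]
    for variable, replacement in steps:
        assert variable in form, f"{variable} does not occur in {form}"
        form = form.replace(variable, replacement, 1)
        forms.append(form)
    return forms


# Rules applied in order: S → aAb, A → aAb, A → aAb, A → ε (ε is "" here).
steps = [("S", "aAb"), ("A", "aAb"), ("A", "aAb"), ("A", "")]
print(" ⇒ ".join(derive("S", steps)))
# prints: S ⇒ aAb ⇒ aaAbb ⇒ aaaAbbb ⇒ aaabbb
```

The sketch does not check that each replacement is a rule of G; it only shows how every step rewrites one variable of the current sentential form.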
Derivation of Strings from a Grammar
❑ Exercise: Consider the grammar
G = ( {S, C}, {a, b, e, i, +}, {S → iC + S | iC + SeS | a, C → b}, S )
Derivation for “ib + ib + aea”:
S→
.
.
Left-Most Derivation (LMD)
An LMD is obtained by starting with the start symbol and applying a
production rule to the leftmost variable at each step.
❑ Example: A Left-Most Derivation in the following grammar for
generating the string “aab”:
G = ( {S}, {a, b}, {S → aS | aSbS | ε}, S )
LMD for “aab”:
S ⇒ aS (S → aS)
⇒ aaSbS (S → aSbS)
⇒ aabS (S → ε)
⇒ aab (S → ε)
Right-Most Derivation (RMD)
An RMD is obtained by starting with the start symbol and applying a
production rule to the rightmost variable at each step.
❑ Example: A Right-Most Derivation in the following grammar for
generating the string “aab”:
G = ( {S}, {a, b}, {S → aS | aSbS | ε}, S )
RMD for “aab”:
S ⇒ aSbS (S → aSbS)
⇒ aSb (S → ε)
⇒ aaSb (S → aS)
⇒ aab (S → ε)
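The only difference between a leftmost and a rightmost derivation is which variable gets rewritten at each step. A minimal Python sketch, assuming the single variable S of the example grammar and "" for ε; `step` and `replay` are illustrative names, and the sketch does not verify that a replacement is an actual rule of G.

```python
VARIABLES = {"S"}  # non-terminals of the example grammar


def step(form, replacement, leftmost=True):
    """Rewrite the leftmost (or rightmost) variable in `form` with `replacement`."""
    positions = [i for i, ch in enumerate(form) if ch in VARIABLES]
    if not positions:
        raise ValueError("no variable left to rewrite")
    i = positions[0] if leftmost else positions[-1]
    return form[:i] + replacement + form[i + 1:]


def replay(replacements, leftmost):
    """Start from S and apply the replacements one after another."""
    forms = ["S"]
    for r in replacements:
        forms.append(step(forms[-1], r, leftmost))
    return forms


lmd = replay(["aS", "aSbS", "", ""], leftmost=True)
rmd = replay(["aSbS", "", "aS", ""], leftmost=False)
print(" ⇒ ".join(lmd))  # S ⇒ aS ⇒ aaSbS ⇒ aabS ⇒ aab
print(" ⇒ ".join(rmd))  # S ⇒ aSbS ⇒ aSb ⇒ aaSb ⇒ aab
```

Both runs end in the same string aab, reached through different sentential forms.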
Derivation / Parse Tree
A Derivation Tree or Parse Tree is an ordered rooted tree that graphically
represents how a string is derived from a Context-Free Grammar.
• Root Vertex: Must be labelled by the Start Symbol
• Interior Vertex: Labelled by a Non-Terminal Symbol
• Leaves: Labelled by Terminal Symbols or ε
Example: For the Grammar
G = (V, T, P, S) where S → 0B, A → 1AA | ε, B → 0AA
Show the derivation tree for generating the string “001”
Ambiguous Grammar
• A grammar is said to be ambiguous if, for at least one string generated
by it, there is more than one parse tree (equivalently, more than one
leftmost derivation or more than one rightmost derivation).
• This ambiguity makes it difficult for a parser to decide the correct
syntactic structure of a string.
• Ambiguity is a property of the grammar, not of the language.
• Example: G = ({S}, {a, b, +, *}, P, S) where P consists of S → S+S | S*S | a |
b. The string a + a * b has two leftmost derivations:
Derivation 1: S ⇒ S+S ⇒ a+S ⇒ a+S*S ⇒ a+a*S ⇒ a+a*b
Derivation 2: S ⇒ S*S ⇒ S+S*S ⇒ a+S*S ⇒ a+a*S ⇒ a+a*b
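One way to see the ambiguity is to count parse trees directly. The following Python sketch is specific to the grammar S → S+S | S*S | a | b and assumes the input is written without spaces; `count_parse_trees` is an illustrative name. Each choice of the operator applied at the root gives a different tree.

```python
from functools import lru_cache


def count_parse_trees(s):
    """Count parse trees of s under S → S+S | S*S | a | b."""

    @lru_cache(maxsize=None)
    def count(i, j):  # number of trees deriving the substring s[i:j]
        if j - i == 1:
            return 1 if s[i] in "ab" else 0
        total = 0
        for k in range(i + 1, j - 1):  # position of the root operator
            if s[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s))


print(count_parse_trees("a+a*b"))  # 2
```

A count greater than 1 for any string is enough to show that the grammar is ambiguous.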
Unambiguous Grammar
• An unambiguous grammar is a context-free grammar (CFG) in which
every valid string derived from the grammar has exactly one parse
tree, one leftmost derivation, and one rightmost derivation.
• This ensures that the structure of the string is always clear and
uniquely determined by the grammar.
• Example:
S → AB
A → aA | b
B → bB | a
Generates the string abba (A ⇒* ab, B ⇒* ba) with exactly one parse tree.
Simplification of Context Free Grammar
In a CFG, not all production rules and symbols are always needed
for the derivation of strings. Besides this, there may also be some NULL
Productions and UNIT Productions. Elimination of these productions and
symbols is called Simplification of CFG.
Simplification consists of the following steps:
1) Removal of Null Productions
2) Removal of Unit Productions
3) Reduction of CFG
Removal of Null Productions
In a CFG, a Non-Terminal Symbol 'A' is a nullable variable if there is a
production
A → ε
or there is a derivation that starts at 'A' and leads to ε (A ⇒* ε).
Procedure for Removal:
Step 1: To remove A → ε, look for all productions whose right side
contains A.
Step 2: In each of these productions, replace the occurrences of 'A' with
ε in every possible combination.
Step 3: Add the resultant productions to the Grammar.
Step 4: If the start symbol S is nullable, add a new start symbol S′ with S′ → S ∣ ε.
Removal of Null Productions
Example: Remove Null Productions from the following Grammar
S → ABAC, A → aA | ε, B → bB | ε, C → c
Null productions are:
A → ε, B → ε
1) To eliminate A → ε:
S → ABAC gives
S → ABC | BAC | BC
A → aA gives
A → a
New productions:
S → ABAC | ABC | BAC | BC, A → aA, A → a, B → bB | ε, C → c
Removal of Null Productions
2) To eliminate B → ε:
S → AAC | AC | C, B → bB, B → b
New productions:
S → ABAC | ABC | BAC | BC | AAC | AC | C
A → aA, A → a
B → bB | b
C → c
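The result of the example can be checked with a short program. The sketch below, in Python, removes all null productions in one pass rather than one variable at a time: it first finds every nullable variable, then keeps or drops each nullable occurrence in every combination. It assumes single-character symbols and "" for ε, does not handle a nullable start symbol (Step 4), and uses the illustrative name `remove_null_productions`.

```python
from itertools import product


def remove_null_productions(grammar):
    """grammar: dict variable -> set of right-hand sides ("" stands for ε)."""
    # Find the nullable variables by iterating to a fixed point.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(var)
                changed = True
    # For every body, keep or drop each nullable occurrence in all combinations.
    result = {}
    for var, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            choices = [(s, "") if s in nullable else (s,) for s in body]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate:  # never add an ε-production
                    new_bodies.add(candidate)
        result[var] = new_bodies
    return result


grammar = {"S": {"ABAC"}, "A": {"aA", ""}, "B": {"bB", ""}, "C": {"c"}}
for var, bodies in remove_null_productions(grammar).items():
    print(var, "→", " | ".join(sorted(bodies)))
```

For the example grammar this yields the same seven S-productions as the step-by-step elimination, together with A → aA | a, B → bB | b and C → c.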
Removal of Unit Productions
Any Production Rule of the form A → B where A, B ∈ Non-Terminals is
called a Unit Production.
Procedure for Removal
Step 1: To remove A → B, add the production A → x to the grammar
whenever B → x occurs in the grammar. [x is any right side that is not a
single Non-Terminal; x can be Null]
Step 2: Delete A → B from the grammar.
Step 3: Repeat from Step 1 until all Unit Productions are removed.
Removal of Unit Productions
Example: Remove Unit Productions from the Grammar whose
production rule is given by
P: S →XY, X → a, Y → Z | b, Z → M, M → N, N → a
Unit productions are:
Y → Z, Z → M, M → N
1. Since N → a, we add M → a
P: S → XY, X → a , Y → Z | b, Z → M, M → a, N → a
2. Since M → a, we add Z → a
P: S → XY, X → a , Y → Z | b, Z → a, M → a, N → a
3. Since Z → a, we add Y → a
P: S → XY, X → a , Y → a | b, Z → a, M → a, N → a
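The same result can be obtained programmatically. The Python sketch below uses a slightly different technique from the step-by-step procedure: for each variable it collects every variable reachable through unit productions alone, then copies their non-unit right sides. Single-character symbols are assumed and `remove_unit_productions` is an illustrative name.

```python
def remove_unit_productions(grammar):
    """grammar: dict variable -> set of right-hand sides (strings of symbols)."""
    variables = set(grammar)
    result = {}
    for var in grammar:
        # Collect every variable reachable from `var` through unit productions only.
        reachable, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if body in variables and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # Copy the non-unit bodies of all those variables.
        result[var] = {
            body for v in reachable for body in grammar[v] if body not in variables
        }
    return result


grammar = {
    "S": {"XY"}, "X": {"a"}, "Y": {"Z", "b"},
    "Z": {"M"}, "M": {"N"}, "N": {"a"},
}
for var, bodies in remove_unit_productions(grammar).items():
    print(var, "→", " | ".join(sorted(bodies)))
```

The output agrees with the final production set of the example: Y → a | b and Z, M, N → a. Variables that become unreachable (Z, M, N here) are left in place; they are removed later by reduction.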
Reduction of CFG
CFGs are reduced in two phases.
Phase 1:
Derivation of an equivalent grammar G' from the CFG G, such that each
variable derives some terminal string.
Derivation Procedure:
Step 1: Include in W₁ all variables that directly derive some terminal
string, and initialize i = 1.
Step 2: Include in Wᵢ₊₁ all symbols of Wᵢ and every variable that has a
production whose right side contains only terminals and symbols of Wᵢ.
Step 3: Increment i and repeat Step 2 until Wᵢ₊₁ = Wᵢ.
Step 4: Include all production rules that contain only terminals and
symbols of Wᵢ.
Reduction of CFG
Phase 2:
Derivation of an equivalent grammar G'' from the CFG G', such that each
symbol appears in a sentential form.
Derivation Procedure:
Step 1: Include the Start Symbol in Y₁ and initialize i = 1.
Step 2: Include in Yᵢ₊₁ all symbols that can be derived from Yᵢ, and include
all production rules that have been applied.
Step 3: Increment i and repeat Step 2 until Yᵢ₊₁ = Yᵢ.
Reduction of CFG
Example:
Find a reduced grammar equivalent to the grammar G, having production
rules
P: S → AC | B, A → a, C → c | BC, E→ aA | e
Phase 1:
T = {a,c,e}
W1 = {A, C, E}
W2= {A, C, E, S}
W3 = {A, C, E, S}
G' = ({A, C, E, S}, {a, c, e}, P, S)
P: S → AC, A → a, C → c, E → aA | e
Reduction of CFG
Phase 2:
Y1 = {S}
Y2 = {S, A, C}
Y3 = {S, A,C, a, c}
Y4 = {S, A, C, a, с}
G’’ = ({A, C, S}, {a, c}, P, {S})
P: S → AC, A → a, C → c
This is the reduced grammar.
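Both phases are fixed-point computations and can be sketched in a few lines of Python. The code assumes single-character symbols; `reduce_grammar` is an illustrative name, and the variable B of the example simply has no entry because it has no productions.

```python
def reduce_grammar(grammar, terminals, start):
    """grammar: dict variable -> set of bodies; symbols are single characters."""
    # Phase 1: variables that derive some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var not in generating and any(
                all(s in terminals or s in generating for s in body)
                for body in bodies
            ):
                generating.add(var)
                changed = True
    phase1 = {
        var: {b for b in bodies
              if all(s in terminals or s in generating for s in b)}
        for var, bodies in grammar.items() if var in generating
    }
    # Phase 2: symbols reachable from the start symbol.
    reachable, todo = {start}, [start]
    while todo:
        for body in phase1.get(todo.pop(), ()):
            for s in body:
                if s not in reachable:
                    reachable.add(s)
                    todo.append(s)
    return {var: bodies for var, bodies in phase1.items() if var in reachable}


grammar = {"S": {"AC", "B"}, "A": {"a"}, "C": {"c", "BC"}, "E": {"aA", "e"}}
reduced = reduce_grammar(grammar, terminals={"a", "c", "e"}, start="S")
for var, bodies in reduced.items():
    print(var, "→", " | ".join(sorted(bodies)))
```

Phase 1 drops every production that mentions B, and Phase 2 drops E because it cannot be reached from S, which leaves S → AC, A → a, C → c.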
Chomsky Normal Form (CNF)
A CFG is in Chomsky Normal Form if every production is of one of
these types:
1. Binary rule:
A → BC
where A, B, C are non-terminals (and B, C are not the start symbol).
2. Terminal rule:
A→a
where a is a terminal.
3. (Optional) Start symbol:
If the language includes ε, then only the start symbol S can produce
ε:
S→ε
Chomsky Normal Form (CNF)
Why convert to CNF?
• To prove properties about context-free languages (like closure,
decidability).
• To implement parsers (e.g., CYK uses CNF to check if a string
belongs to a language in O(n³) time).
• To analyze grammars formally (CNF is easier for mathematical
manipulation).
• CNF makes a grammar simple, uniform, and algorithm-friendly.
CFG to CNF
Steps to convert a given CFG to Chomsky Normal Form:
Step 1: If the Start Symbol S occurs on some right side, create a new
Start Symbol S' and a new Production S' → S.
Step 2: Remove Null Productions (using the Null Production Removal procedure).
Step 3: Remove Unit Productions (using the Unit Production Removal procedure).
Step 4: Replace each Production A → B₁...Bₙ where n > 2 with A → B₁C
where C → B₂...Bₙ. Repeat this step for all Productions having more than
two Symbols on the right side.
Step 5: If the right side of any Production is in the form A → aB where 'a' is
a terminal and A and B are non-terminals, then the Production is
replaced by A → XB and X → a. Repeat this step for every Production
which is of the form A → aB.
CFG to CNF
Example: Convert the following CFG to CNF:
S → ASA | aB
A → B | S
B → b | ε
1) Since S appears on the RHS, we add a new start symbol S' and the
production S' → S:
S' → S
S → ASA | aB
A → B | S
B → b | ε
CFG to CNF
2) Remove the Null Productions:
After Removing B → ε :
S’ → S,
S → ASA | aB | a,
A → B| S | ε,
B→b
After Removing A → ε :
S’ → S
S → ASA | aB | a | AS | SA | S
A → B| S
B→b
CFG to CNF
3) Remove the Unit Productions:
After Removing S → S:
S’ → S
S→ASA|aB|a|AS|SA
A→B|S
B→b
After Removing S’ → S:
S’ → ASA|aB|a|AS|SA
S→ASA|aB|a|AS|SA
A→B|S
B→b
CFG to CNF
After Removing A → B:
S’ → ASA|aB|a|AS|SA
S → ASA|aB|a|AS|SA
A → b|S
B→b
After Removing A → S:
S’ → ASA|aB|a|AS|SA
S → ASA|aB|a|AS|SA
A → b|ASA|aB|a|AS|SA
B→b
CFG to CNF
4) Now find the productions that have more than two symbols on the
RHS:
S' → ASA, S → ASA and A → ASA
After replacing these using a new variable X → SA, we get:
S' → AX | aB | a | AS | SA
S → AX | aB | a | AS | SA
A → b | AX | aB | a | AS | SA
B → b
X → SA
CFG to CNF
5) Now change the productions that mix a terminal with a variable,
S' → aB, S → aB and A → aB, by introducing Y → a.
Finally we get:
S' → AX | YB | a | AS | SA
S → AX | YB | a | AS | SA
A → b | AX | YB | a | AS | SA
B → b
X → SA
Y → a
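Whether a grammar satisfies the CNF conditions can be checked mechanically. The Python sketch below tests the three rule shapes from the CNF definition; it assumes right sides are strings of single-character symbols (only the key S' is longer), and `is_cnf` is an illustrative name.

```python
def is_cnf(grammar, start):
    """grammar: dict variable -> list of bodies (strings of one-character symbols)."""
    variables = set(grammar)
    for var, bodies in grammar.items():
        for body in bodies:
            # Binary rule A → BC, with neither B nor C being the start symbol.
            if len(body) == 2 and all(s in variables and s != start for s in body):
                continue
            # Terminal rule A → a.
            if len(body) == 1 and body not in variables:
                continue
            # S → ε is allowed for the start symbol only.
            if len(body) == 0 and var == start:
                continue
            return False
    return True


final = {
    "S'": ["AX", "YB", "a", "AS", "SA"],
    "S": ["AX", "YB", "a", "AS", "SA"],
    "A": ["b", "AX", "YB", "a", "AS", "SA"],
    "B": ["b"],
    "X": ["SA"],
    "Y": ["a"],
}
print(is_cnf(final, "S'"))  # True
```

Applied to the grammar before step 4 (which still contains ASA and aB), the same check returns False.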
Greibach Normal Form
A CFG is in Greibach Normal Form if the productions are in the following
forms:
A → aα
• A is a non-terminal.
• a is a single terminal symbol.
• α is a (possibly empty) sequence of non-terminals.
i.e. Every production begins with exactly one terminal, followed by zero or
more non-terminals.
Example:
S → aAB ∣ bB ∣ a
A → aA ∣ bA
B→b
Greibach Normal Form
Why Convert CFG to GNF?
• In GNF, the first symbol in every production is a terminal, which helps
deterministically know which production to use.
• Makes top-down parsing simpler and avoids left-recursion problems.
Steps to convert a given CFG to GNF:
Step 1: Check if the given CFG has any Unit Productions or Null
Productions and Remove if there are any
Step 2: Check whether the CFG is already in Chomsky Normal Form
(CNF) and convert it to CNF if it is not. (using the CFG to CNF conversion
technique discussed in the previous lecture)
Step 3: Change the names of the Non-Terminal Symbols into some Ai in
ascending order of i.
Greibach Normal Form
Steps to convert a given CFG to GNF:
Step 4: Alter the rules so that the Non-Terminals are in ascending order,
such that, If the Production is of the form Ai → Aj x, then, i < j and should
never be i ≥ j.
Step 5: Remove Left Recursion if there is any.
Greibach Normal Form
Example:
S → CA | BB
B → b | SB
C→b
A→a
Replace:
S with A1
C with A2
A with A3
B with A4
We get:
A1 → A2 A3 | A4 A4
A4 → b | A1 A4
A2 → b
A3 → a
Greibach Normal Form
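Then, A4 → A1 A4 violates i < j, so substitute A1 and then A2:
A4 → b | A2 A3 A4 | A4 A4 A4
A4 → b | b A3 A4 | A4 A4 A4 (Left Recursion)
To remove left recursion, introduce a new variable Z:
Z → A4 A4 Z | A4 A4
A4 → b | b A3 A4 | bZ | b A3 A4 Z
The grammar is now:
A1 → A2 A3 | A4 A4
A4 → b | b A3 A4 | bZ | b A3 A4 Z
Z → A4 A4 Z | A4 A4
A2 → b
A3 → a
The A1 and Z productions still begin with non-terminals; substituting the
A2 and A4 productions for their leading symbols completes the conversion
to GNF.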
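The left-recursion step follows a fixed pattern: for A → Aα | β, the new rules are A → β | βZ and Z → α | αZ. A minimal Python sketch of that pattern, with bodies written as tuples of symbols; `remove_immediate_left_recursion` is an illustrative name and only immediate left recursion is handled.

```python
def remove_immediate_left_recursion(var, bodies, new_var):
    """bodies: list of tuples of symbols. Returns (bodies of var, bodies of new_var)."""
    alphas = [b[1:] for b in bodies if b and b[0] == var]  # α in A → Aα
    betas = [b for b in bodies if not b or b[0] != var]    # β in A → β
    if not alphas:
        return list(bodies), []  # nothing to do
    var_bodies = betas + [b + (new_var,) for b in betas]   # A → β | βZ
    new_bodies = alphas + [a + (new_var,) for a in alphas]  # Z → α | αZ
    return var_bodies, new_bodies


a4 = [("b",), ("b", "A3", "A4"), ("A4", "A4", "A4")]
a4_new, z = remove_immediate_left_recursion("A4", a4, "Z")
print("A4 →", " | ".join(" ".join(b) for b in a4_new))
print("Z →", " | ".join(" ".join(b) for b in z))
```

For the A4 productions of the example this gives A4 → b | b A3 A4 | b Z | b A3 A4 Z and Z → A4 A4 | A4 A4 Z, the same rules as above.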
CYK Algorithm (Cocke–Younger–Kasami)
• The CYK algorithm is a parsing algorithm for context-free grammars.
• It checks whether a given string can be generated by a CFG.
• It works only on grammars in Chomsky Normal Form (CNF).
• Decides membership: Tells if a string belongs to the language of a
CFG.
• Example: Check whether the string “aaba” is a valid member of the
following CFG:
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
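The membership check can be carried out with the usual CYK table, filled from substrings of length 1 upwards. The following is a minimal Python sketch, assuming a CNF grammar with single-character symbols and no S → ε rule; the function name `cyk` and the dictionary encoding of the grammar are illustrative choices.

```python
def cyk(word, grammar, start="S"):
    """grammar: dict variable -> list of bodies; a body is "a" (terminal) or "BC"."""
    n = len(word)
    if n == 0:
        return False  # this sketch ignores S → ε
    # table[i][l] = set of variables that derive word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {v for v, bodies in grammar.items() if ch in bodies}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for v, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length].add(v)
    return start in table[0][n]


grammar = {
    "S": ["AB", "BC"],
    "A": ["BA", "a"],
    "B": ["CC", "b"],
    "C": ["AB", "a"],
}
print(cyk("aaba", grammar))  # True
```

The string aaba is accepted; one derivation is S ⇒ AB ⇒ aB ⇒ aCC ⇒ aABC ⇒ aabC ⇒ aaba. The three nested loops over length, start position and split point give the O(n³) running time.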
End of the Lecture 3