Unit 4
Finite State Automata and Grammar
Compiled By: Bishal Trital
Introduction to Finite Automata
● A Finite Automaton (FA) is the simplest
machine for recognizing patterns
(a language recognizer).
● The finite automaton, or finite state
machine, is an abstract machine
described by a 5-tuple.
● It has a set of states and rules for
moving from one state to another,
depending on the applied input
symbol.
● Basically, it is an abstract model of a
digital computer.
Finite Automata (Contd…)
The processing of FA consists of:
1. Input tape
2. Reading head
3. Finite Control
Input tape:
● Divided into a number of square cells.
● Each cell stores one symbol of the input string.
Reading head:
● Scans the symbols from the input tape, one cell at a time, in one direction only (either L->R or R->L).
Finite Control:
● The processing of FA is controlled by the finite control.
● Holds the current state and determines the next state for each input symbol through the transition function.
Processing:
● Scan each cell of the tape, starting from the leftmost (or rightmost) cell, in one direction.
● When the input string is exhausted, accept the string if the finite control is in one of the final states. Otherwise, reject.
Finite Automata (Contd…)
A Finite Automata consists of the following:
Q : Finite set of states.
Σ : Set of input symbols.
q0 : Initial state.
F : Set of final states.
δ : Transition function, δ: Q × Σ → Q
Formal specification of machine is:
M = (Q, Σ, q0, F, δ)
Alphabet
● Definition − An alphabet is any finite set of symbols.
● Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’ are symbols.
String (w or s)
● Definition − A string is a finite sequence of symbols.
● Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d}
Length of a String (|w| or |s|)
● Definition − It is the number of symbols present in a string.
● Examples −
○ If S = ‘cabcad’, |S| = 6
○ If |S| = 0, it is called an empty string (denoted by λ or ε).
Finite Automata (Contd…)
Kleene Star
● Definition − The Kleene star, ∑*, is a unary operator on a set of symbols or strings, ∑, that gives the infinite set of all possible strings of all possible lengths over ∑ including λ.
● Representation − ∑* = ∑⁰ ∪ ∑¹ ∪ ∑² ∪ … where ∑ᵖ is the set of all possible strings of length p.
● Example − If ∑ = {a, b}, ∑* = {λ, a, b, aa, ab, ba, bb, …}
Kleene Plus (Positive Closure)
● Definition − The set ∑+ is the infinite set of all possible strings of all possible lengths over ∑ excluding λ.
● Representation − ∑+ = ∑¹ ∪ ∑² ∪ ∑³ ∪ …
∑+ = ∑* − { λ }
● Example − If ∑ = { a, b }, ∑+ = { a, b, aa, ab, ba, bb, … }
Language
● Definition − A language is a subset of ∑* for some alphabet ∑. It can be finite or infinite.
● Example − If the language takes all possible strings of length 2 over ∑ = {a, b}, then L = { ab, aa, ba, bb }
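The Kleene star and plus definitions on this slide can be tried out with a short program. The following is only an illustrative sketch: the function names `strings_of_length`, `kleene_star` and `kleene_plus` are made up for this example, and because ∑* is infinite the functions list strings only up to a chosen maximum length.

```python
from itertools import product


def strings_of_length(alphabet, p):
    """Return all strings of length p over the alphabet, in sorted order."""
    return ["".join(t) for t in product(sorted(alphabet), repeat=p)]


def kleene_star(alphabet, max_len):
    """Return the strings of the Kleene star with length <= max_len."""
    result = []
    for p in range(max_len + 1):
        result.extend(strings_of_length(alphabet, p))
    return result


def kleene_plus(alphabet, max_len):
    """Same as kleene_star, but without the empty string."""
    return [w for w in kleene_star(alphabet, max_len) if w != ""]


sigma = {"a", "b"}
print(kleene_star(sigma, 2))        # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
print(kleene_plus(sigma, 2))        # ['a', 'b', 'aa', 'ab', 'ba', 'bb']
print(strings_of_length(sigma, 2))  # ['aa', 'ab', 'ba', 'bb']
```

The last call lists the language of all strings of length 2 over {a, b}, which is a finite subset of ∑*.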
Finite Automata (Contd…)
Finite Automata can be classified into two types:
● Deterministic Finite Automata (DFA)
● Non-deterministic Finite Automata (NDFA / NFA)
Deterministic Finite Automata (Contd…)
In DFA, for each input symbol, one can determine the state to which the machine will move. Hence, it is called Deterministic Automata.
As it has a finite number of states, the machine is called Deterministic Finite Automaton.
Formal Definition of a DFA
A DFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where:
● Q is a finite set of states.
● ∑ is a finite set of symbols called the alphabet.
● δ is the transition function where δ: Q × ∑ → Q
● q0 is the initial state from where any input is processed (q0 ∈ Q).
● F is a set of final state/states of Q (F ⊆ Q).
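The 5-tuple maps directly onto a small program. The sketch below is a minimal illustration and not part of the original slides: the function name `run_dfa` and the example machine (a DFA over {0, 1} that accepts strings containing an odd number of 1s) are assumptions chosen for demonstration.

```python
def run_dfa(delta, start, finals, w):
    """Run a DFA on the string w and report whether it is accepted.

    delta  : dict mapping (state, symbol) -> next state (the function δ)
    start  : the initial state q0
    finals : the set F of final states
    """
    state = start
    for symbol in w:
        # δ is total in a DFA, so every (state, symbol) pair has one entry.
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical example: Q = {"even", "odd"}, Σ = {"0", "1"},
# q0 = "even", F = {"odd"}. The state tracks the parity of 1s read so far.
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}

print(run_dfa(delta, "even", {"odd"}, "1011"))  # True  (three 1s)
print(run_dfa(delta, "even", {"odd"}, "1001"))  # False (two 1s)
print(run_dfa(delta, "even", {"odd"}, ""))      # False (zero 1s)
```

Storing δ as a dictionary keyed by (state, symbol) mirrors the definition δ: Q × ∑ → Q, since each pair has exactly one next state.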
Deterministic Finite Automata (Contd…)
Graphical Representation of a DFA
A DFA is represented by digraphs called
transition diagram / state diagram.
● The vertices represent the states.
● The arcs, each labeled with an input
symbol, show the transitions.
● The initial state is denoted by a
single incoming arc with no source state.
● The final state is indicated by double
circles.
Deterministic Finite Automata (Contd…)
Transition Table:
● It is a tabular representation of
transition function of finite
automata.
● Generally, states are arranged in
rows and input symbols are arranged in
columns.
● The intersection of each row and
column i.e. each cell represents
the next state.
Deterministic Finite Automata (Contd…)
Draw the transition diagram of a DFA that accepts strings that start with 1 and end with 0.
Draw the transition diagram of a DFA that accepts strings that have length at least 3 and whose third symbol is 0.
Deterministic Finite Automata (Contd…)
Draw a DFA for the language accepting strings ending with ‘00’ over input alphabet ∑ = {0, 1}.
Draw a DFA for the language accepting strings ending with ‘011’ over input alphabet ∑ = {0, 1}.
Deterministic Finite Automata (Contd…)
Draw a DFA for the language accepting strings starting with ‘011’ over input alphabet ∑ = {0, 1}.
Draw a DFA for the language accepting strings with ‘011’ as substring over input alphabet ∑ = {0, 1}.
Deterministic Finite Automata (Contd…)
Draw a DFA for the language accepting strings that represent even binary numbers over input alphabet ∑ = {0, 1}.
Draw a DFA for the language accepting strings that represent odd binary numbers over input alphabet ∑ = {0, 1}.
Deterministic Finite Automata (Contd…)
Draw a DFA for the language accepting strings starting and ending with different characters over input alphabet ∑ = {0, 1}.
Construct a DFA accepting the set of all strings containing an even number of 1’s and an even number of 0’s over input alphabet {0, 1}.
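For the last exercise above (even number of 0’s and even number of 1’s), one possible answer can be checked in code. This is a sketch of one construction, not necessarily the diagram intended by the slides: each state is assumed to be a pair (parity of 0s, parity of 1s), and the names `run_dfa`, `DELTA` and `accepts` are illustrative.

```python
def run_dfa(delta, start, finals, w):
    """Return True if the DFA ends in a final state after reading w."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals


# State (p, q): p = parity of 0s read so far, q = parity of 1s read so far.
STATES = [(p, q) for p in (0, 1) for q in (0, 1)]
DELTA = {}
for (p, q) in STATES:
    DELTA[((p, q), "0")] = (1 - p, q)  # reading 0 flips the parity of 0s
    DELTA[((p, q), "1")] = (p, 1 - q)  # reading 1 flips the parity of 1s

START = (0, 0)     # nothing read yet: both counts are even
FINALS = {(0, 0)}  # accept only when both counts are even


def accepts(w):
    return run_dfa(DELTA, START, FINALS, w)


print(accepts(""))      # True  (zero 0s, zero 1s)
print(accepts("0110"))  # True  (two 0s, two 1s)
print(accepts("010"))   # False (two 0s, one 1)
```

The four states correspond to the four combinations even/even, even/odd, odd/even and odd/odd.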
Non-Deterministic Finite Automata
● In NDFA / NFA, for a particular input symbol, the machine can move to any combination of the states in the machine.
● In other words, the exact state to which the machine moves cannot be determined. Hence, it is called Non-deterministic Automata.
● As it has finite number of states, the machine is called Non-deterministic Finite Automata.
Formal Definition of an NDFA
An NDFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where:
● Q is a finite set of states.
● ∑ is a finite set of symbols called the alphabet.
● δ is the transition function, δ: Q × ∑ → 2^Q
(Here the power set of Q (2^Q) has been taken because in case of NDFA, from a state, transition can occur to any combination of Q states)
● q0 is the initial state from where any input is processed (q0 ∈ Q).
● F is a set of final state/states of Q (F ⊆ Q).
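Because δ returns a set of states, an NDFA can be simulated by tracking the set of all states the machine could currently be in. The sketch below is illustrative only: the function name `run_nfa` is made up, ε-moves are not handled, and the example machine is one possible NFA for "strings over {0, 1} ending with 01", chosen as an assumption for demonstration.

```python
def run_nfa(delta, start, finals, w):
    """Simulate an NFA without ε-moves by tracking every reachable state.

    delta maps (state, symbol) -> set of next states; a missing key
    means there is no transition (the empty set).
    """
    current = {start}
    for symbol in w:
        nxt = set()
        for state in current:
            nxt |= delta.get((state, symbol), set())
        current = nxt
    # Accept if at least one possible run ends in a final state.
    return bool(current & finals)


# Assumed example NFA: stay in q0 on any symbol, guess the final "01".
delta = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}

print(run_nfa(delta, "q0", {"q2"}, "1101"))  # True
print(run_nfa(delta, "q0", {"q2"}, "0110"))  # False
print(run_nfa(delta, "q0", {"q2"}, ""))      # False
```

The entry for ("q0", "0") has two targets, which is exactly the situation the power set 2^Q in the definition allows.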
Non-Deterministic Finite Automata (Contd…)
Graphical Representation of an NDFA
An NDFA is represented by digraphs called
transition diagram / state diagram.
● The vertices represent the states.
● The arcs, each labeled with an input symbol,
show the transitions.
● The initial state is denoted by a single
incoming arc with no source state.
● The final state is indicated by double
circles.
Non-Deterministic Finite Automata (Contd…)
Transition Table:
● It is a tabular representation of transition function of finite automata.
● Generally, states are arranged in rows and input symbols are arranged in columns.
● The intersection of each row and column i.e. each cell represents the next state (for an NDFA, the set of possible next states).
Transition function δ is defined as-
● δ (A, 1) = B
● δ (A, ε) = C
● δ (B, 0) = A
● δ (B, 0) = C
● δ (B, 1) = C
(The two entries for (B, 0) together mean δ (B, 0) = {A, C}.)
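The transition function listed above includes an ε-move, so following it requires the ε-closure of a state set. The sketch below encodes exactly those five transitions. The helper names `epsilon_closure` and `step`, and the use of the empty string as the ε marker, are assumptions made for this example. The start and final states are not stated in the text, so none are assumed here.

```python
EPS = ""  # assumed marker for an ε-move

# The five transitions from the list above, grouped into sets.
delta = {
    ("A", "1"): {"B"},
    ("A", EPS): {"C"},
    ("B", "0"): {"A", "C"},
    ("B", "1"): {"C"},
}


def epsilon_closure(states, delta):
    """All states reachable from `states` using only ε-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def step(states, symbol, delta):
    """States reachable by reading one symbol, with ε-moves before and after."""
    reached = set()
    for q in epsilon_closure(states, delta):
        reached |= delta.get((q, symbol), set())
    return epsilon_closure(reached, delta)


print(sorted(epsilon_closure({"A"}, delta)))  # ['A', 'C']
print(sorted(step({"A"}, "1", delta)))        # ['B']
print(sorted(step({"B"}, "0", delta)))        # ['A', 'C']
print(sorted(step({"B"}, "1", delta)))        # ['C']
```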
Non-Deterministic Finite Automata (Contd…)
Design a NFA for the transition table as given
below:
Q\Σ     0          1
→q0     q0, q1     q0, q2
q1      q3         ε
q2      q2, q3     q3
*q3     q3         q3
Non-Deterministic Finite Automata (Contd…)
Design an NFA with ∑ = {0, 1} that accepts all strings ending with 01.
Design an NFA with ∑ = {0, 1} that accepts all strings in which the third symbol from the right end is always 0.
Design an NFA with ∑ = {0, 1} that accepts all strings in which double '1' is followed by double '0'.
Design an NFA with ∑ = {a, b} that accepts all strings of length two.
Non-Deterministic Finite Automata (Contd…)
Design an NFA with ∑ = {0, 1} that accepts strings having at least two consecutive 0s or 1s.
Draw an NFA which accepts a string containing “the” anywhere in a string of {a-z}, e.g., “there” but not “those”.
Deterministic / Non-Deterministic Finite Automata
DFA | NDFA
The transition from a state is to a single particular next state for each input symbol. Hence it is called deterministic. | The transition from a state can be to multiple next states for each input symbol. Hence it is called non-deterministic.
Empty string transitions are not seen in DFA. | NDFA permits empty string transitions.
Backtracking is allowed in DFA. | In NDFA, backtracking is not always possible.
Requires more space. | Requires less space.
A string is accepted by a DFA, if it transits to a final state. | A string is accepted by a NDFA, if at least one of all possible transitions ends in a final state.
Acceptance of string by FSA
Acceptance of string by FSA (Contd…)
● A string w is accepted by a FSA (Q, Σ, δ, q0, F), if and only if δ*(q0, w) ∈ F i.e. [δ(q0, w) -> … -> (qf, ε)], where qf ∈ F.
● That is, a string is accepted by a DFA if and only if the DFA starting at the initial state ends in an accepting state after reading the string.
Check whether the string w = a(ab)*aa is accepted or not by FSA.
● First from the initial state go to state 1 by reading one a.
● Then from state 1 go through the cycle 1-2-1 any number of times by reading substring ab any number of times to come back to state 1. This is represented by (ab)*.
● Then from state 1 go to state 2 and then to state 3 (final state) by reading aa.
● Thus a string that is accepted by this DFA can be represented by a(ab)*aa.
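The walk described above can be replayed in code. The sketch below rebuilds the machine from the textual description only, since the diagram itself is not reproduced here: the initial state is assumed to be numbered 0, and transitions that the description does not mention are treated as rejecting. The check against Python's `re` module is an added cross-check, not part of the slides.

```python
import re
from itertools import product

# Transitions read off the description: 0 -a-> 1, 1 -a-> 2, 2 -b-> 1, 2 -a-> 3.
DELTA = {(0, "a"): 1, (1, "a"): 2, (2, "b"): 1, (2, "a"): 3}
START, FINALS = 0, {3}  # state 0 as the initial state is an assumption


def accepts(w):
    state = START
    for symbol in w:
        if (state, symbol) not in DELTA:
            return False  # no transition defined: the string is rejected
        state = DELTA[(state, symbol)]
    return state in FINALS


print(accepts("aaa"))    # True  (a, zero copies of ab, then aa)
print(accepts("aabaa"))  # True  (a, one copy of ab, then aa)
print(accepts("aab"))    # False (run ends in state 1)

# Cross-check against the regular expression on every string of length <= 8.
agree = all(
    accepts("".join(t)) == bool(re.fullmatch(r"a(ab)*aa", "".join(t)))
    for n in range(9)
    for t in product("ab", repeat=n)
)
print(agree)  # True
```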
Regular Expression
A regular expression is an expression that describes the set of strings accepted by a FSA. Hence, a regular expression is also called a language generator.
Properties:
● Symbols of input alphabet, empty string (ε) and empty set (Φ) are regular expressions.
● If R1 and R2 be two regular expressions, then their union denoted by R1+R2 is also regular.
● If R1 and R2 be two regular expressions, then their concatenation denoted by R1.R2 is also regular.
● If R be a regular expression, then closure of R denoted by R* is also regular (Kleene star).
● If R is regular expression, then (R) is also regular.
● Homomorphism:
○ Is a substitution in which a single letter is replaced with a string.
○ Eg: if h(a) = ab, then h(aa) = abab. The homomorphic image of a regular language is also regular.
Example:
● (0+1)* = {ε, 0, 1, 00, 01, 10, 11, … }
● (0.1)* = {ε, 01, 0101, 010101, … }
● 0* + 1* = {ε, 0, 1, 00, 11, 000, 111, … }
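The three example expressions can be tried with Python's standard `re` module. Note the notation change, which is specific to that library and not to the slides: union is written `|` instead of `+`, concatenation is written by juxtaposition instead of `.`, and `re.fullmatch` tests whether the whole string belongs to the language. The helper name `matches` is illustrative.

```python
import re


def matches(pattern, w):
    """True if the whole string w is in the language of the pattern."""
    return re.fullmatch(pattern, w) is not None


# (0+1)* in the slide notation
print(matches(r"(0|1)*", "0110"))  # True

# (0.1)* in the slide notation
print(matches(r"(01)*", "0101"))   # True
print(matches(r"(01)*", "010"))    # False

# 0* + 1* in the slide notation
print(matches(r"0*|1*", "000"))    # True
print(matches(r"0*|1*", "01"))     # False
print(matches(r"0*|1*", ""))       # True (the empty string ε)
```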
Regular Expression (Contd…)
● Let L = {set of all strings over ∑ = {a, b}, starting with single a and ending with two bs}. Write its regular expression.
RE = a(a+b)*bb
● String with ‘ab’ as substring.
RE = (a+b)*ab(a+b)*
● Starts with ‘a’.
RE = a(a+b)*
● Containing ‘a’.
RE = b*a(a+b)*
● Ends with ‘a’.
RE = (a+b)*a
● Starts with ‘a’ and ends with ‘b’.
RE = a(a+b)*b
● Starts and ends with different symbols.
RE = (a(a+b)*b) + (b(a+b)*a)
● Doesn’t end with ‘aa’ (means ends with ab, ba or bb, or has length less than 2).
RE = (a+b)*(ab+ba+bb)+(ε+a+b)
Regular Expression (Contd…)
● At least two ‘a’s.
RE = b*ab*a(a+b)*
● Number of ‘a’ is even.
RE = b*(ab*ab*)*
● Number of ‘a’ is odd.
RE = b*ab*(ab*ab*)*
● Ending in either aba or aaba
RE = (a+b)*(aba+aaba)
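A regular expression can be sanity-checked by comparing it with a plain description of the language on all short strings. The sketch below does this for the four expressions on this slide. It is a brute-force check up to a length bound, not a proof, and the names `CASES`, `all_strings` and `mismatches` are illustrative. The slide's `+` is written `|` in Python's `re` syntax.

```python
import re
from itertools import product


def all_strings(alphabet, max_len):
    """Yield every string over the alphabet with length <= max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)


# (description, pattern in Python syntax, predicate describing the language)
CASES = [
    ("at least two a's", r"b*ab*a(a|b)*", lambda w: w.count("a") >= 2),
    ("even number of a's", r"b*(ab*ab*)*", lambda w: w.count("a") % 2 == 0),
    ("odd number of a's", r"b*ab*(ab*ab*)*", lambda w: w.count("a") % 2 == 1),
    ("ends in aba or aaba", r"(a|b)*(aba|aaba)",
     lambda w: w.endswith("aba") or w.endswith("aaba")),
]


def mismatches(pattern, predicate, max_len=8):
    """Strings on which the pattern and the predicate disagree."""
    return [
        w for w in all_strings("ab", max_len)
        if bool(re.fullmatch(pattern, w)) != predicate(w)
    ]


for name, pattern, predicate in CASES:
    # An empty list means no disagreement was found up to length 8.
    print(name, mismatches(pattern, predicate))
```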
Regular Expression to FSA
RE = a(bc)*
RE = ba*b
RE = (a|b)*(abb|a+b)
RE = (a+b)c
Regular Expression to FSA (Contd…)
RE = a(a+b)*bb
RE = ba+(a+bb)a*b
Chomsky hierarchy of Grammar (Types of Grammar)
Chomsky Hierarchy of Grammar (Contd…)
What is context?
aAb → bb
Here,
a is left context of A.
b is right context of A.
But, in the CFG, we write production as:
V →(V+T)*, means left and right context of variable ‘V’ is ε.
That is why these are known as Context Free Grammars(CFGs).
Chomsky Hierarchy of Grammar (Contd…)
Type 3 ⊂ Type 2, 1, 0.
Type 2 ⊂ Type 1, 0.
Type 1 ⊂ Type 0.
Chomsky Hierarchy of Grammar (Contd…)
Identify the types of grammar:
● S→aS | a
Right Linear ∴ Type 3, 2, 1, 0
● S→AB, A→a, B→b
No Type 3
Yes Type 2 (CFG)
∴ Type 2, 1, 0
● S→aSb | bSb | a | b
No Type 3 (Middle Linear)
Yes Type 2 (CFG)
∴ Type 2, 1, 0
● S→aS | bS | ε
Only Right Linear ∴ Type 3, 2, 1, 0
Context Free Grammar (Contd…)
Q. Let G = (V, Σ, R, S) where
V = {S, A}
Σ = {a, b}
R = {S→aA | ε, A→bS};
Find L(G).
Solution:
I. S→ε
II. S→aA
→abS
→abε
→ab
III. S→aA
→abS
→abaA
→ababS
→ababε
→(ab)²
IV. S→aA
→abS
→abaA
→ababS
.
.
→(ab)ⁿ
Hence, language of given grammar is L(G) = {(ab)ⁿ : n≥0}
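The derivations above can also be enumerated mechanically. The sketch below performs a breadth-first search over sentential forms, always expanding the leftmost variable. The function name `generate` and the encoding of rules as a dictionary from a one-letter variable to its right-hand sides are assumptions of this example. The search is pruned by counting terminals, which is enough to make it terminate for grammars like the ones in these slides but is not a general-purpose method.

```python
from collections import deque


def generate(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    rules maps each variable (a single character) to a list of
    right-hand sides; "" stands for ε.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is all terminals.
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            words.add(form)
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(words, key=lambda w: (len(w), w))


# R = {S→aA | ε, A→bS}
print(generate({"S": ["aA", ""], "A": ["bS"]}, "S", 6))
# ['', 'ab', 'abab', 'ababab']
```

The output lists (ab)ⁿ for n = 0, 1, 2, 3, in line with L(G) = {(ab)ⁿ : n≥0}.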
Context Free Grammar (Contd…)
A context-free grammar G is defined by a 4-tuple, G = (V, Σ, R, S), where:
V = finite set called variables.
Σ = finite set, disjoint from V, called terminals.
R = set of rules.
S = start variable, S ∈ V.
Q. Let G = (V, Σ, R, S) where
V = {S}
Σ = {a, b}
R = {S→aSb, S→ab};
Find L(G).
Solution:
I. S→ab
II. S→aSb
→aabb
III. S→aSb
→aaSbb
→aaaSbbb
.
.
→aⁿ⁻¹Sbⁿ⁻¹
→aⁿ⁻¹abbⁿ⁻¹
→aⁿbⁿ
Hence, language of given grammar is L(G) = {aⁿbⁿ : n≥1}
Derivation
The process of generating a string by using sequences of production rules is called derivation. Also known as parsing.
The geometrical representation (or hierarchical representation) of a derivation is called a parse tree or derivation tree.
● The process of deriving a string by expanding the leftmost non-terminal at each step is called leftmost derivation and its geometrical representation is called leftmost derivation tree.
● The process of deriving a string by expanding the rightmost non-terminal at each step is called rightmost derivation and its geometrical representation is called rightmost derivation tree.
● Consider the grammar, G=(V,Σ,R,S):
Σ={x, +, *, (, )}
V={E, T, F}
S=E
R= { E → E + T
E → T
T → T * F
T → F
F → (E)
F → x }
Derivation (Contd…)
Find LMD, RMD and parse tree for w = x+x*x.
Leftmost derivation
E → E+T
→ T+T (Using E → T)
→ F+T (Using T → F)
→ x+T (Using F → x)
→ x+T*F (Using T → T * F)
→ x+F*F (Using T → F)
→ x+x*F (Using F → x)
→ x+x*x (Using F → x)
Rightmost derivation
E → E+T
→ E+T*F (Using T → T * F)
→ E+T*x (Using F → x)
→ E+F*x (Using T → F)
→ E+x*x (Using F → x)
→ T+x*x (Using E → T)
→ F+x*x (Using T → F)
→ x+x*x (Using F → x)
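Both derivations can be replayed step by step to confirm that every step really rewrites the leftmost (or rightmost) variable. This is a small checking sketch: the function name `derive` and the encoding of each step as a (variable, right-hand side) pair are choices made for this example.

```python
RULES = {"E": ["E+T", "T"], "T": ["T*F", "F"], "F": ["(E)", "x"]}


def derive(start, steps, leftmost=True):
    """Apply each (variable, rhs) step to the leftmost or rightmost variable.

    Returns the list of sentential forms; raises ValueError if a step
    does not rewrite the expected variable or uses an unknown rule.
    """
    form = start
    forms = [form]
    for var, rhs in steps:
        positions = [i for i, s in enumerate(form) if s in RULES]
        i = positions[0] if leftmost else positions[-1]
        if form[i] != var or rhs not in RULES[var]:
            raise ValueError(f"cannot apply {var} -> {rhs} to {form}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


LMD = [("E", "E+T"), ("E", "T"), ("T", "F"), ("F", "x"),
       ("T", "T*F"), ("T", "F"), ("F", "x"), ("F", "x")]
RMD = [("E", "E+T"), ("T", "T*F"), ("F", "x"), ("T", "F"),
       ("F", "x"), ("E", "T"), ("T", "F"), ("F", "x")]

print(" => ".join(derive("E", LMD, leftmost=True)))
print(" => ".join(derive("E", RMD, leftmost=False)))
```

Both calls end in the string x+x*x, using the same eight rule applications in a different order.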
Derivation (Contd…)
Whether we consider the leftmost
derivation or rightmost derivation, we
get the same parse tree. This is an
unambiguous grammar.
Derivation (Contd…)
● Consider the grammar, G=(V,Σ,R,S):
Σ={n, +, *, (, )}
V={E}
R= { E → E + E
E → E * E
E → (E)
E → n }
Find LMD, RMD and parse tree for w
= n+n*n.
One parse tree groups the string as (n+n)*n and the other as n+(n*n).
Since there are two parse trees for a
single string "n+n*n", the grammar G is
ambiguous.
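The number of parse trees for a string in this grammar can be counted directly, which makes the ambiguity concrete. The sketch below is written specifically for the rules E → E + E, E → E * E, E → (E) and a single terminal operand written `n`. The function name `count_parse_trees` is illustrative, and the approach (memoised counting over substrings) is one of several possible ways to do this.

```python
from functools import lru_cache


def count_parse_trees(w):
    """Count parse trees of w for E -> E+E | E*E | (E) | n."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees that derive the substring w[i:j] from E.
        total = 0
        if j - i == 1 and w[i] == "n":
            total += 1                       # E -> n
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            total += count(i + 1, j - 1)     # E -> (E)
        for k in range(i + 1, j - 1):
            if w[k] in "+*":                 # E -> E+E or E -> E*E, split at k
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(w))


print(count_parse_trees("n+n*n"))    # 2
print(count_parse_trees("n"))        # 1
print(count_parse_trees("(n+n)*n"))  # 1
```

A count greater than 1 for any string shows the grammar is ambiguous. Adding parentheses, as in the last call, leaves only one tree for that particular string.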
Derivation (Contd…)
If for all the strings of a grammar, leftmost derivation
is exactly same as rightmost derivation, then that
grammar may be ambiguous or unambiguous.
Consider the grammar, S → aS / ε
● This is an example of an unambiguous
grammar.
● Here, each string has its leftmost
derivation and rightmost derivation exactly
same.
Now, consider the grammar, S → aS / a / ε
● This is an example of ambiguous grammar.
● Here also, each string has its leftmost
derivation and rightmost derivation exactly
same.
For the string, w = a, since two different parse
trees exist (S → a, and S → aS → aε = a), the grammar is ambiguous.
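The difference between the two grammars can be checked by counting parse trees for the strings a, aa, aaa and so on. The sketch below is specific to grammars whose only possible rules are S → aS, S → a and S → ε. The function name `count_trees` and the encoding of a grammar as a set of right-hand sides are assumptions of this example.

```python
def count_trees(k, rules):
    """Number of parse trees for the string of k a's.

    rules is a subset of {"aS", "a", ""}, the right-hand sides for S
    ("" stands for ε).
    """
    total = 0
    if "" in rules and k == 0:
        total += 1                           # S -> ε
    if "a" in rules and k == 1:
        total += 1                           # S -> a
    if "aS" in rules and k >= 1:
        total += count_trees(k - 1, rules)   # S -> aS
    return total


G1 = {"aS", ""}        # S → aS / ε
G2 = {"aS", "a", ""}   # S → aS / a / ε

print([count_trees(k, G1) for k in range(4)])  # [1, 1, 1, 1]
print([count_trees(k, G2) for k in range(4)])  # [1, 2, 2, 2]
```

Every string has exactly one parse tree in the first grammar, while every non-empty string has two in the second, even though leftmost and rightmost derivations coincide in both (each sentential form has at most one variable).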