CS 373: Theory of Computation
Manoj Prabhakaran Mahesh Viswanathan
Fall 2008
1 Introduction
1.1 Problems and Computation
Decision Problems
Given input, decide “yes” or “no”
• Examples: Is x an even number? Is x prime? Is there a path from s to t in graph G?
• i.e., Compute a boolean function of input
General Computational Problem
In contrast, typically a problem requires computing some non-boolean function, or carrying out
interactive/reactive computation in a distributed environment
• Examples: Find the factors of x. Find the balance in account number x.
• In this course, we will study decision problems because aspects of computability are captured
by this special class of problems
What Does a Computation Look Like?
• Some code (a.k.a. control): the same for all instances
• The input (a.k.a. problem instance)
• As the program starts executing, some memory (a.k.a. state)
– Includes the values of variables (and the “program counter”)
– State evolves throughout the computation
– Often, takes more memory for larger problem instances
• But some programs do not need larger state for larger instances!
1.2 Finite Automata: Informal Overview
Finite State Computation
• Finite state: A fixed upper bound on the size of the state, independent of the size of the input
– With a t-bit state, there are at most 2^t possible states
• Not enough memory to hold the entire input
– “Streaming input”: automaton runs (i.e., changes state) on seeing each bit of input
An Automatic Door
[Diagram labels: front pad, door, rear pad]
Figure 1: Top view of Door
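[State diagram: two states, closed and open. In closed, the events rear, both and neither keep the door closed, and front moves to open. In open, the events front, rear and both keep the door open, and neither moves to closed.]
Figure 2: State diagram of controller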
• Input: A stream of events <front>, <rear>, <both>, <neither> . . .
• Controller has a single bit of state.
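The controller can be sketched as a lookup table in Python. This is only an illustration: the transitions are as read off the state diagram of Figure 2, and the names `DOOR_TRANSITIONS` and `run_controller` are hypothetical, not part of the notes.

```python
# Door controller sketch: a single bit of state ("closed" or "open").
# Transitions are read off Figure 2; all names here are illustrative.
DOOR_TRANSITIONS = {
    ("closed", "front"): "open",
    ("closed", "rear"): "closed",
    ("closed", "both"): "closed",
    ("closed", "neither"): "closed",
    ("open", "front"): "open",
    ("open", "rear"): "open",
    ("open", "both"): "open",
    ("open", "neither"): "closed",
}


def run_controller(events, state="closed"):
    """Feed a stream of pad events to the controller; return the final state."""
    for event in events:
        state = DOOR_TRANSITIONS[(state, event)]
    return state
```

Note that the controller never stores the event stream itself: each event is consumed once and only the current state is kept.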
Finite Automata
Details
Automaton
A finite automaton has: Finite set of states, with start/initial and accepting/final states; Transitions
from one state to another on reading a symbol from the input.
Computation
Start at the initial state; in each step, read the next symbol of the input, take the transition (edge)
labeled by that symbol to a new state.
Acceptance/Rejection: If after reading the input w, the machine is in a final state then w is
accepted; otherwise w is rejected.
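[Transition diagram: states q0 (initial) and q1 (final). Each state has a self-loop on 0; reading 1 moves from q0 to q1 and from q1 to q0.]
Figure 3: Transition Diagram of automaton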
Example: Computation
• On input 1001, the computation is
1. Start in state q0. Read 1 and go to q1.
2. Read 0 and go to q1.
3. Read 0 and go to q1.
4. Read 1 and go to q0. Since q0 is not a final state, 1001 is rejected.
• On input 010, the computation is
1. Start in state q0. Read 0 and go to q0.
2. Read 1 and go to q1.
3. Read 0 and go to q1. Since q1 is a final state, 010 is accepted.
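The two computations above can be replayed mechanically. The following Python sketch assumes the automaton of Figure 3 (q0 initial, q1 final, 1 switches state, 0 stays put); the names `DELTA`, `trace` and `accepted` are illustrative.

```python
# Automaton of Figure 3: reading 1 switches state, reading 0 stays put.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}
START, FINAL = "q0", {"q1"}


def trace(word):
    """Return the list of states visited while reading word from START."""
    states = [START]
    for symbol in word:
        states.append(DELTA[(states[-1], symbol)])
    return states


def accepted(word):
    """A word is accepted iff the last state of its trace is final."""
    return trace(word)[-1] in FINAL
```

Here `trace("1001")` visits q0, q1, q1, q1, q0 and `trace("010")` visits q0, q0, q1, q1, matching the step-by-step computations.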
1.3 Examples
Example I
[Transition diagram: a single state q0, both initial and accepting, with a self-loop on 0, 1.]
Figure 4: Automaton accepts all strings of 0s and 1s
Example II
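[Transition diagram: states q0 (initial) and q1 (accepting). q0 loops on 0 and moves to q1 on 1; q1 loops on 1 and moves to q0 on 0.]
Figure 5: Automaton accepts strings ending in 1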
Example III
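[Transition diagram: states q0 (initial) and q1 (accepting). Each state loops on 0; reading 1 moves from q0 to q1 and from q1 to q0.]
Figure 6: Automaton accepts strings having an odd number of 1s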
Example IV
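[Transition diagram: states q0, q1 in the top row and q3, q2 in the bottom row. Reading 1 moves horizontally, between q0 and q1 and between q3 and q2; reading 0 moves vertically, between q0 and q3 and between q1 and q2. Starting from q0, the state reached after an odd number of 1s and an odd number of 0s is q2.]
Figure 7: Automaton accepts strings having an odd number of 1s and odd number of 0s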
1.4 Applications
Finite Automata in Practice
• grep
• Thermostats
• Coke Machines
• Elevators
• Train Track Switches
• Security Properties
• Lexical Analyzers for Parsers
2 Formal Definitions
2.1 Alphabets, Strings and Languages
Alphabet
Definition 1. An alphabet is any finite, non-empty set of symbols. We will usually denote it by Σ.
Example 2. Examples of alphabets include {0, 1} (binary alphabet); {a, b, . . . , z} (English alphabet); the set of all ASCII characters; {moveforward, moveback, rotate90}.
Strings
Definition 3. A string or word over alphabet Σ is a (finite) sequence of symbols in Σ. Examples
are ‘0101001’, ‘string’.
• ε is the empty string.
• The length of string u (denoted by |u|) is the number of symbols in u. For example, |ε| = 0 and |011010| = 6.
• Concatenation: uv is the string that has a copy of u followed by a copy of v. For example, if u = ‘cat’ and v = ‘nap’ then uv = ‘catnap’. If v = ε then uv = vu = u.
• u is a prefix of v if there is a string w such that v = uw. For example, ‘cat’ is a prefix of ‘catnap’.
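These string operations map directly onto Python strings. The sketch below is only an illustration; `EPSILON` and `is_prefix` are hypothetical names, and `is_prefix` deliberately follows the definition (v = uw for some w) rather than calling a library routine.

```python
EPSILON = ""  # the empty string


def is_prefix(u, v):
    """u is a prefix of v iff v = u + w for some string w.

    If such a w exists it must be v[len(u):], so it suffices to test that one.
    """
    return len(u) <= len(v) and u + v[len(u):] == v


u, v = "cat", "nap"
uv = u + v  # concatenation
```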
Languages
Definition 4. • For alphabet Σ, Σ∗ is the set of all strings over Σ. Σ^n is the set of all strings over Σ of length n.
• A language over Σ is a set L ⊆ Σ∗ . For example L = {1, 01, 11, 001} is a language over {0, 1}.
Set Notation
We will often define languages using the set builder notation. Thus, L = {w ∈ Σ∗ | p(w)} is the
collection of all strings w over Σ that satisfy the property p.
Example 5. • L = {w ∈ {0, 1}∗ | |w| is even} is the set of all even length strings over {0, 1}.
• L = {w ∈ {0, 1}∗ | there is a u such that wu = 10001} is the set of all prefixes of 10001.
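Set-builder definitions correspond closely to filtered comprehensions. Since Σ∗ is infinite, the Python sketch below only enumerates strings up to a chosen length; the helper names `strings_of_length` and `strings_up_to` are illustrative assumptions.

```python
from itertools import product


def strings_of_length(sigma, n):
    """All strings of length n over the alphabet sigma."""
    return ["".join(p) for p in product(sigma, repeat=n)]


def strings_up_to(sigma, max_len):
    """The finite slice of Sigma* containing strings of length <= max_len."""
    return [w for n in range(max_len + 1) for w in strings_of_length(sigma, n)]


SIGMA = ["0", "1"]

# {w in {0,1}* | |w| is even}, restricted to |w| <= 3
even_length = [w for w in strings_up_to(SIGMA, 3) if len(w) % 2 == 0]

# {w in {0,1}* | there is a u such that wu = 10001}
prefixes_of_10001 = [w for w in strings_up_to(SIGMA, 5) if "10001".startswith(w)]
```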
2.2 Deterministic Finite Automaton
Defining an Automaton
To describe an automaton, we need to specify
• What the alphabet is,
• What the states are,
• What the initial state is,
• What states are accepting/final, and
• What the transition from each state and input symbol is.
Thus, the above 5 things are part of the formal definition.
Deterministic Finite Automata
Formal Definition
Definition 6. A deterministic finite automaton (DFA) is M = (Q, Σ, δ, q0 , F ), where
• Q is the finite set of states
• Σ is the finite alphabet
• δ : Q × Σ → Q “Next-state” transition function
• q0 ∈ Q initial state
• F ⊆ Q final/accepting states
Given a state and a symbol, the next state is “determined”.
Computation
Definition 7. For a DFA $M = (Q, \Sigma, \delta, q_0, F)$, string $w = w_1 w_2 \cdots w_k$, where $w_i \in \Sigma$ for each i, and states $q_1, q_2 \in Q$, we say $q_1 \xrightarrow{w}_M q_2$ if there is a sequence of states $r_0, r_1, \ldots, r_k$ such that
• $r_0 = q_1$,
• for each i with $0 \le i < k$, $\delta(r_i, w_{i+1}) = r_{i+1}$, and
• $r_k = q_2$.
Definition 8. For a DFA $M = (Q, \Sigma, \delta, q_0, F)$ and string $w \in \Sigma^*$, we say M accepts w iff $q_0 \xrightarrow{w}_M q$ for some $q \in F$.
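The 5-tuple and the two definitions above translate almost literally into code. The following Python sketch is one possible encoding, not the only one: the class name `DFA`, its field names, and the choice of a dictionary for δ are assumptions made for illustration. The instance `M_ODD` is the two-state automaton from the informal overview (1 switches state, 0 stays put, q1 final).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    states: frozenset    # Q
    alphabet: frozenset  # Sigma
    delta: dict          # maps (state, symbol) to the next state
    start: str           # q0
    finals: frozenset    # F

    def reach(self, q, w):
        """Return the state q2 such that q --w--> q2 (Definition 7)."""
        for symbol in w:
            q = self.delta[(q, symbol)]
        return q

    def accepts(self, w):
        """M accepts w iff q0 --w--> q for some q in F (Definition 8)."""
        return self.reach(self.start, w) in self.finals


M_ODD = DFA(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"0", "1"}),
    delta={("q0", "0"): "q0", ("q0", "1"): "q1",
           ("q1", "0"): "q1", ("q1", "1"): "q0"},
    start="q0",
    finals=frozenset({"q1"}),
)
```

Because δ is a total function on Q × Σ, `reach` never has a choice to make and never gets stuck, which is exactly what "deterministic" means here.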
Acceptance/Recognition and Regular Languages
Definition 9. The language accepted or recognized by a DFA M over alphabet Σ is L(M ) = {w ∈
Σ∗ | M accepts w}. A language L is said to be accepted/recognized by M if L = L(M ).
Definition 10. A language L is regular if there is some DFA M such that L = L(M ).
Simple Observations about DFAs
Proposition 11. For a DFA M, string w, and state $q_1$, there is exactly one state $q_2$ such that $q_1 \xrightarrow{w}_M q_2$.
Proof. By induction on |w|.
Proposition 12. For DFA M, strings u and v, and states $q_1$ and $q_3$, $q_1 \xrightarrow{uv}_M q_3$ if and only if there is a state $q_2$ such that $q_1 \xrightarrow{u}_M q_2$ and $q_2 \xrightarrow{v}_M q_3$.
2.3 Formal Example of DFA
Formal Example of DFA
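Example 13.
[Transition diagram: states q0 (initial) and q1 (final). Each state loops on 0; reading 1 moves from q0 to q1 and from q1 to q0.]
Figure 8: Transition Diagram of DFA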
Formally the automaton is Modd = ({q0 , q1 }, {0, 1}, δ, q0 , {q1 }) where
δ(q0 , 0) = q0 δ(q0 , 1) = q1
δ(q1 , 0) = q1 δ(q1 , 1) = q0
state   0    1
q0      q0   q1
q1      q1   q0
Figure 9: Transition Table representation
Language of Modd
Proposition 14. L(Modd ) = {w ∈ {0, 1}∗ | w has an odd number of 1s}, where Modd is as defined
before.
Proof about the language of Modd
It fails!
Proof. We will prove by induction on |w| that $q_0 \xrightarrow{w}_M q_1$ if w has an odd number of 1s.
• Base Case: When w = ε, w has an even number of 1s and so the observation holds vacuously.
• Induction Step w = u0: If w has an odd number of 1s then u has an odd number of 1s, and so (by induction hypothesis) $q_0 \xrightarrow{u}_M q_1$. Since $q_1 \xrightarrow{0}_M q_1$, $q_0 \xrightarrow{w}_M q_1$.
• Induction Step w = u1: If w has an odd number of 1s then to show that M is in q1 after w,
we need to argue that M is in q0 after u.
Need to prove a stronger statement.
Analyzing the problem
• For the induction step w = u1, we need to argue that M is in q0 after u.
• Proving that if w has an odd number of 1s then M is in state q1 is not sufficient to show that L(Modd) is the set of strings that have an odd number of 1s, because it does not show that only strings with an odd number of 1s are accepted!
Corrected Proof
Proof. We will prove by induction on |w| that $q_0 \xrightarrow{w}_M q_1$ if and only if w has an odd number of 1s.
• Base Case: When w = ε, w has an even number of 1s and M is in state q0 after w.
• Induction Step w = u0: w has an odd number of 1s iff u has an odd number of 1s, iff (by ind. hyp.) $q_0 \xrightarrow{u}_M q_1$ iff $q_0 \xrightarrow{w}_M q_1$ (since $q_0 \xrightarrow{0}_M q_0$ and $q_1 \xrightarrow{0}_M q_1$).
• Induction Step w = u1: w has an odd number of 1s iff u has an even number of 1s iff $q_0 \xrightarrow{u}_M q_0$ (by ind. hyp., since q0 and q1 are the only states) iff $q_0 \xrightarrow{w}_M q_1$ (since $q_0 \xrightarrow{1}_M q_1$ and $q_1 \xrightarrow{1}_M q_0$).
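The statement being proved can also be sanity-checked by brute force on short strings. This is a test, not a substitute for the induction: it covers only finitely many strings. The Python sketch below assumes the transition function of Modd given earlier; `state_after` and `check_invariant` are illustrative names.

```python
from itertools import product

# Transition function of M_odd: 1 switches state, 0 stays put.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}


def state_after(word, start="q0"):
    """Return the state M_odd is in after reading word from start."""
    q = start
    for symbol in word:
        q = DELTA[(q, symbol)]
    return q


def check_invariant(max_len):
    """True iff for every w with |w| <= max_len:
    q0 --w--> q1 exactly when w has an odd number of 1s."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            w = "".join(letters)
            if (state_after(w) == "q1") != (w.count("1") % 2 == 1):
                return False
    return True
```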
Proving Correctness of a DFA
Proof Template
Given a DFA M having n states {q0 , q1 , . . . qn−1 } with initial state q0 , to prove that L(M ) = L, we
do the following.
1. Come up with languages $L_0, L_1, \ldots, L_{n-1}$ such that $\bigcup_{i=0}^{n-1} L_i = \Sigma^*$ and the languages are pairwise disjoint, i.e., $L_i \cap L_j = \emptyset$ for any $i \neq j$.
2. Prove by induction on |w| that $q_0 \xrightarrow{w}_M q_i$ if and only if $w \in L_i$.
3. Show that $\bigcup_{q_i \in F} L_i = L$; the collection of all strings that reach a final state is exactly L.
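As an illustration of the template, consider the four-state automaton for "odd number of 1s and odd number of 0s" (Example IV). The Python sketch below assumes its transitions are: 1 swaps q0 with q1 and q3 with q2, 0 swaps q0 with q3 and q1 with q2, and q2 is the only final state. Each language L_i is described by a pair of parities, and steps 2 and 3 are tested (not proved) on all short strings. All names are hypothetical.

```python
from itertools import product

# Assumed transitions of the four-state automaton (Example IV).
DELTA = {
    ("q0", "1"): "q1", ("q1", "1"): "q0",
    ("q3", "1"): "q2", ("q2", "1"): "q3",
    ("q0", "0"): "q3", ("q3", "0"): "q0",
    ("q1", "0"): "q2", ("q2", "0"): "q1",
}
FINAL = {"q2"}

# Step 1: L_i is the set of strings whose pair
# (number of 1s mod 2, number of 0s mod 2) equals the pair attached to q_i.
# The four pairs are distinct and exhaustive, so the L_i partition Sigma*.
PARITY_OF = {"q0": (0, 0), "q1": (1, 0), "q2": (1, 1), "q3": (0, 1)}


def parities(w):
    return (w.count("1") % 2, w.count("0") % 2)


def state_after(w, start="q0"):
    q = start
    for symbol in w:
        q = DELTA[(q, symbol)]
    return q


def check_template(max_len):
    """Test steps 2 and 3 on every string of length <= max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            w = "".join(letters)
            # Step 2: q0 --w--> q_i if and only if w is in L_i.
            if PARITY_OF[state_after(w)] != parities(w):
                return False
            # Step 3: w reaches a final state iff w is in the target language.
            if (state_after(w) in FINAL) != (parities(w) == (1, 1)):
                return False
    return True
```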
3 Designing DFAs
3.1 General Method
Typical Problem
Problem
Given a language L, design a DFA M that accepts L, i.e., L(M ) = L.
How does one go about it?
Methodology
• Imagine yourself in the place of the machine, reading symbols of the input, and trying to
determine if it should be accepted.
• Remember at any point you have only seen a part of the input, and you don’t know when it
ends.
• Figure out what to keep in memory. It cannot be all the symbols seen so far: it must fit into
a finite number of bits.
3.2 Examples
Strings containing 0
Problem
Design an automaton that accepts all strings over {0, 1} that contain at least one 0.
Solution
What do you need to remember? Whether you have seen a 0 so far or not!
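[Transition diagram: qnoz is the initial state and loops on 1; reading 0 moves from qnoz to qzer; qzer is accepting and loops on 0, 1.]
Figure 10: Automaton accepting strings with at least one 0.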
Even length strings
Problem
Design an automaton that accepts all strings over {0, 1} that have an even length.
Solution
What do you need to remember? Whether you have seen an odd or an even number of symbols.
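[Transition diagram: qe is the initial and accepting state; reading 0 or 1 moves from qe to qo and from qo back to qe.]
Figure 11: Automaton accepting strings of even length.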
Pattern Recognition
Problem
Design an automaton that accepts all strings over {0, 1} that have 001 as a substring, where u is a
substring of w if there are w1 and w2 such that w = w1 uw2 .
Solution
What do you need to remember? Whether you
• haven’t seen any symbols of the pattern
• have just seen 0
• have just seen 00
• have seen the entire pattern 001
Pattern Recognition Automaton
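[Transition diagram: states q (initial), q0, q00 and qp (accepting). q loops on 1 and moves to q0 on 0; q0 moves to q00 on 0 and back to q on 1; q00 loops on 0 and moves to qp on 1; qp loops on 0, 1.]
Figure 12: Automaton accepting strings having 001 as substring.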
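The four things to remember become four states. The Python sketch below encodes the automaton of Figure 12 as a table, assuming the transitions described in the solution; the function name `contains_001` is illustrative.

```python
# States record how much of the pattern 001 has just been seen:
#   "q"   : nothing useful yet      "q0" : just seen 0
#   "q00" : just seen 00            "qp" : pattern 001 seen (accepting)
DELTA = {
    ("q", "0"): "q0",    ("q", "1"): "q",
    ("q0", "0"): "q00",  ("q0", "1"): "q",
    ("q00", "0"): "q00", ("q00", "1"): "qp",
    ("qp", "0"): "qp",   ("qp", "1"): "qp",
}


def contains_001(word):
    """Run the automaton on word; accept iff it ends in state qp."""
    state = "q"
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state == "qp"
```

The self-loop on 0 at q00 matters: after reading 000 the last two symbols are still 00, so the machine must not fall back to q0.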
grep Problem
Problem
Given text T and string s, does s appear in T ?
Naïve Solution
[Figure: a window of length n slides along T = T1 T2 T3 . . . Tn Tn+1 . . . Tt; at each position the window is compared with s ("= s?").]
Running time = O(nt), where |T | = t and |s| = n.
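A minimal Python sketch of the naïve solution, written to make the O(nt) cost visible: there are at most t window positions, and each comparison inspects up to n symbols. The function name `naive_search` is illustrative.

```python
def naive_search(text, s):
    """Return True iff s occurs in text, by testing every window of text."""
    n, t = len(s), len(text)
    for i in range(t - n + 1):
        # Compare s with the window text[i : i + n], one symbol at a time.
        if all(text[i + j] == s[j] for j in range(n)):
            return True
    return False
```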
grep Problem
Smarter Solution
Solution
• Build DFA M for L = {w | there are u, v such that w = usv}
• Run M on text T
Time = time to build M + O(t)!
Questions
• Is L regular no matter what s is?
• If yes, can M be built “efficiently”?
Knuth-Morris-Pratt (1977): Yes to both the above questions.
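To make the idea concrete, here is a Python sketch that builds such a DFA for an arbitrary s. It is deliberately simple and is not the efficient construction of Knuth, Morris and Pratt: it recomputes each transition from scratch, so building the table takes time polynomial in |s| rather than linear. State j means "the longest prefix of s that is a suffix of the input read so far has length j". The names `build_matcher` and `dfa_search` are hypothetical, and the text is assumed to use only symbols from the given alphabet.

```python
def build_matcher(s, alphabet):
    """Transition table of a DFA accepting the strings that contain s.

    States are 0..len(s); state len(s) is accepting and absorbing.
    """
    n = len(s)
    delta = {}
    for j in range(n):
        for c in alphabet:
            seen = s[:j] + c
            # Find the longest prefix of s that is a suffix of seen.
            k = j + 1
            while not seen.endswith(s[:k]):
                k -= 1
            delta[(j, c)] = k
    for c in alphabet:
        delta[(n, c)] = n  # once s has been seen, stay accepting
    return delta


def dfa_search(text, s, alphabet):
    """Run the DFA on text: one table lookup per symbol, O(t) after the build."""
    delta = build_matcher(s, alphabet)
    state = 0
    for c in text:
        state = delta[(state, c)]
    return state == len(s)
```

For s = 001 over {0, 1} this produces the same four-state machine as the pattern recognition example, with states numbered 0 to 3.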
Multiples
Problem
Design an automaton that accepts all strings w over {0, 1} such that w is the binary representation
of a number that is a multiple of 5.
Solution
What do you need to remember? The remainder when divided by 5.
How do you compute remainders?
• If w is the number n then w0 is 2n and w1 is 2n + 1.
• (a.b + c) mod 5 = (a.(b mod 5) + c) mod 5
• e.g.
– 1011 = 11 (decimal) ≡ 1 mod 5
– 10110 = 22 (decimal) ≡ 2 mod 5, since 2 · 1 + 0 = 2
– 10111 = 23 (decimal) ≡ 3 mod 5, since 2 · 1 + 1 = 3
Automaton for recognizing Multiples
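[Transition diagram: states q0, q1, q2, q3, q4, where qi records that the number read so far has remainder i when divided by 5; q0 is the initial and accepting state. Following the remainder rule, reading bit b in state qi leads to state q((2i + b) mod 5).]
Figure 13: Automaton recognizing strings encoding binary numbers that are multiples of 5.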
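The remainder rule gives the transition function directly. The Python sketch below represents state qi by the integer i; the names `step` and `is_multiple_of_5` are illustrative, and treating the empty string as the number 0 (so that it is accepted) is a convention assumed here.

```python
def step(remainder, bit):
    """If w encodes n, then w followed by bit b encodes 2n + b."""
    return (2 * remainder + int(bit)) % 5


def is_multiple_of_5(word):
    """Run the five-state automaton; accept iff the final remainder is 0."""
    remainder = 0  # initial state q0
    for bit in word:
        remainder = step(remainder, bit)
    return remainder == 0
```

Only the current remainder is stored, never the number itself, so five states suffice no matter how long the input is.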
A One k-positions from end
Problem
Design an automaton for the language Lk = {w | kth character from end of w is 1}
Solution
What do you need to remember? The last k characters seen so far!
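Formally, $M_k = (Q, \{0, 1\}, \delta, q_0, F)$
• States: $Q = \{\langle w \rangle \mid w \in \{0, 1\}^* \text{ and } |w| \le k\}$
• $\delta(\langle w \rangle, b) = \begin{cases} \langle wb \rangle & \text{if } |w| < k \\ \langle w_2 w_3 \cdots w_k b \rangle & \text{if } w = w_1 w_2 \cdots w_k \end{cases}$
• $q_0 = \langle \epsilon \rangle$
• $F = \{\langle 1 w_2 w_3 \cdots w_k \rangle \mid w_i \in \{0, 1\}\}$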
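The construction can be sketched in Python by using the remembered string itself as the state. This is an illustration under the assumption k ≥ 1; the function names `delta_k`, `accepts_k` and `num_states` are hypothetical.

```python
def delta_k(k, state, bit):
    """Transition of M_k: the state is the string of the last (at most k) symbols."""
    if len(state) < k:
        return state + bit   # fewer than k symbols seen: just append
    return state[1:] + bit   # drop the oldest symbol, append the newest


def accepts_k(k, word):
    state = ""  # q0 is the state for the empty string
    for bit in word:
        state = delta_k(k, state, bit)
    # F: states of length exactly k whose first symbol is 1
    return len(state) == k and state[0] == "1"


def num_states(k):
    """Number of strings over {0, 1} of length at most k."""
    return sum(2 ** n for n in range(k + 1))
```

This machine has one state per string of length at most k, that is, 2^(k+1) − 1 states in total.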
Lower Bound on DFA size
Proposition 15. Any DFA recognizing $L_k$ has at least $2^k$ states.
Proof
Let M, with initial state $q_0$, recognize $L_k$ and assume (for contradiction) that M has fewer than $2^k$ states.
• Number of strings of length k = $2^k$
• By the pigeonhole principle, there must be two distinct strings $w_0$ and $w_1$ of length k such that for some state q, $q_0 \xrightarrow{w_0}_M q$ and $q_0 \xrightarrow{w_1}_M q$.
Proof (contd)
Proof
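Let i be the first position where $w_0$ and $w_1$ differ. Without loss of generality assume that $w_0$ has 0 in the ith position and $w_1$ has 1.
Append $0^{i-1}$ to both strings. Each resulting string has length $k + i - 1$, so its kth character from the end is the ith character of the original string:
$w_0 0^{i-1} = \underbrace{\cdots}_{i-1} \overbrace{0 \underbrace{\cdots}_{k-i} 0^{i-1}}^{k}$
$w_1 0^{i-1} = \underbrace{\cdots}_{i-1} \overbrace{1 \underbrace{\cdots}_{k-i} 0^{i-1}}^{k}$
Hence $w_0 0^{i-1} \notin L_k$ and $w_1 0^{i-1} \in L_k$. Thus, M must reject $w_0 0^{i-1}$ and accept $w_1 0^{i-1}$.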
Proof (contd)
. . . Almost there
Proof
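So far, $w_0 0^{i-1} \notin L_k$, $w_1 0^{i-1} \in L_k$, $q_0 \xrightarrow{w_0}_M q$, and $q_0 \xrightarrow{w_1}_M q$.
For every state p (by Propositions 11 and 12),
$q_0 \xrightarrow{w_0 0^{i-1}}_M p$ iff $q \xrightarrow{0^{i-1}}_M p$ iff $q_0 \xrightarrow{w_1 0^{i-1}}_M p$.
Thus, M either accepts both $w_0 0^{i-1}$ and $w_1 0^{i-1}$ or rejects both. Contradiction!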
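The key step of the proof, that any two distinct strings of length k can be told apart by appending a suitable block of 0s, can be checked exhaustively for small k. The Python sketch below is only such a check; the names `in_Lk`, `distinguishing_suffix` and `all_pairs_distinguished` are illustrative.

```python
from itertools import product


def in_Lk(k, w):
    """Membership in L_k: the kth character from the end of w is 1."""
    return len(w) >= k and w[-k] == "1"


def distinguishing_suffix(w0, w1):
    """For distinct strings of equal length, return 0^(i-1), where i is the
    first (1-based) position at which they differ."""
    i = next(j for j in range(len(w0)) if w0[j] != w1[j]) + 1
    return "0" * (i - 1)


def all_pairs_distinguished(k):
    """True iff for every pair of distinct strings of length k, exactly one
    of them lands in L_k after the distinguishing suffix is appended."""
    words = ["".join(p) for p in product("01", repeat=k)]
    for a in words:
        for b in words:
            if a != b:
                z = distinguishing_suffix(a, b)
                if in_Lk(k, a + z) == in_Lk(k, b + z):
                    return False
    return True
```

Since every pair is distinguished, no two of the 2^k strings of length k may lead to the same state, which is the content of the lower bound.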