
Automata and Computability

Fall 2014

Alexis Maciel
Department of Computer Science
Clarkson University

Copyright © 2014 Alexis Maciel

Contents

Preface

1 Introduction

2 Finite Automata
   2.1 Turing Machines
   2.2 Introduction to Finite Automata
   2.3 Formal Definition
   2.4 More Examples
   2.5 Closure Properties

3 Nondeterministic Finite Automata
   3.1 Introduction
   3.2 Formal Definition
   3.3 Equivalence with DFAs
   3.4 Closure Properties

4 Regular Expressions
   4.1 Introduction
   4.2 Formal Definition
   4.3 More Examples
   4.4 Converting Regular Expressions into DFAs
   4.5 Converting DFAs into Regular Expressions
   4.6 Precise Description of the Algorithm

5 Nonregular Languages
   5.1 Some Examples
   5.2 The Pumping Lemma

6 Context-Free Languages
   6.1 Introduction
   6.2 Formal Definition of CFGs
   6.3 More Examples
   6.4 Ambiguity and Parse Trees

7 Non-Context-Free Languages
   7.1 The Basic Idea
   7.2 A Pumping Lemma
   7.3 A Stronger Pumping Lemma

8 More on Context-Free Languages
   8.1 Closure Properties
   8.2 Pushdown Automata
   8.3 Deterministic Algorithms for CFLs

9 Turing Machines
   9.1 Introduction
   9.2 Formal Definition
   9.3 Variations on the Basic Turing Machine
   9.4 Equivalence with Programs

10 Problems Concerning Formal Languages
   10.1 Regular Languages
   10.2 CFLs

11 Undecidability
   11.1 An Unrecognizable Language
   11.2 Natural Undecidable Languages
   11.3 Reducibility and Additional Examples
   11.4 Rice's Theorem
   11.5 Natural Unrecognizable Languages

Index

Preface
After a few computer science courses, students may start to get the feeling that
programs can always be written to solve any computational problem. Writing
the program may be hard work: it may involve learning a difficult technique
and many hours of debugging. But with enough time and effort, the
program can be written.
So it may come as a surprise that this is not the case: there are computational
problems for which no program exists. And these are not ill-defined problems
("Can a computer fall in love?") or uninteresting toy problems. These are
precisely defined problems with important practical applications.
Theoretical computer science can be briefly described as the mathematical
study of computation. These notes will introduce you to this branch of computer science by focusing on computability theory and automata theory. You
will learn how to precisely define what computation is and why certain computational problems cannot be solved. You will also learn several concepts and
techniques that have important applications. Chapter 1 provides a more detailed
introduction to this rich and beautiful area of computer science.
These notes were written for the course CS345 Automata Theory and Formal
Languages taught at Clarkson University. The course is also listed as MA345 and
CS541. The prerequisites are CS142 (a second course in programming) and
MA211 (a course on discrete mathematics in which students gain experience
writing mathematical proofs).
These notes were typeset using LaTeX (MiKTeX implementation with the TeXworks
environment). The paper size and margins are set small to make it easier
to read the notes on a small screen. If the notes are to be printed, it is
recommended that they be printed double-sided and at "Actual size", not resized
to "Fit" the paper.
Feedback on these notes is welcome.
Please send comments to
[email protected].

Chapter 1
Introduction
In this chapter, we introduce the subject of these notes, automata theory and
computability theory. We explain what this is and why it is worth studying.
Computer science can be divided into two main branches. One branch is concerned with the design and implementation of computer systems, with a special
focus on software. (Computer hardware is the main focus of computer engineering.) This includes not only software development but also research into
subjects like operating systems and computer security. This branch of computer
science is called applied computer science. It can be viewed as the engineering
component of computer science.
The other branch of computer science is the mathematical study of computation. One of its main goals is to determine which computational problems can
and cannot be solved, and which ones can be solved efficiently. This involves
discovering algorithms for computational problems, but also finding mathematical proofs that such algorithms do not exist. This branch of computer science is
called theoretical computer science or the theory of computation.[1] It can be viewed
[1] The two main US-based theory conferences are the ACM's Symposium on Theory of Computing (STOC) and the IEEE's Symposium on Foundations of Computer Science (FOCS). One of the main European theory conferences is the Symposium on Theoretical Aspects of Computer Science (STACS).


as the science component of computer science.[2]


To better illustrate what theoretical computer science is, consider the Halting
Problem. The input to this computational problem is a program P written in
some fixed programming language. To keep things simple and concrete, let's
limit the problem to C++ programs whose only input is a single text file. The
output of the Halting Problem is the answer to the following question: Does the
program P halt on every possible input? In other words, does P always halt no
matter how it is used?
It should be clear that the Halting Problem is both natural and relevant.
Software developers already have a tool that determines if the programs they
write are correctly written according to the rules of the language they are using.
This is part of what the compiler does. It would clearly be useful if software
developers also had a tool that could determine if their programs are guaranteed
to halt on every possible input.
Unfortunately, it turns out such a tool does not exist. Not that it currently
does not exist and that maybe one day someone will invent one. No, it turns out
that there is no algorithm that can solve the Halting Problem.
How can something like that be known? How can we know for sure that a
computational problem has no algorithms, none, not now, not ever? Perhaps
the only way of being absolutely sure of anything is to use mathematics... What
this means is that we will define the Halting Problem precisely and then prove
a theorem that says that no algorithm can decide the Halting Problem.
Note that this also requires that we define precisely what we mean by an
algorithm or, more generally, what it means to compute something. Such a definition is called a model of computation. We want a model that's simple enough
so we can prove theorems about it. But we also want this model to be close
enough to real-life computation so we can claim that the theorems we prove say
something that's relevant about real-life computation.
[2] Note that many areas of computer science have both applied and theoretical aspects. Artificial intelligence is one example. Some people working in AI produce actual systems that can be used in practice. Others try to discover better techniques by using theoretical models.

The general model of computation we will study is the Turing machine. We
won't go into the details right now, but the Turing machine is easy to define
and, despite the fact that it is very simple, it captures the essential operations of
real-life digital computers.[3]
Once we have defined Turing machines, we will be able to prove that no
Turing machine can solve the Halting Problem. And we will take this as clear
evidence that no real-life computer program can solve the Halting Problem.
The above discussion was meant to give you a better idea of what theoretical
computer science is. The discussion focused on computability theory, the study
of which computational problems can and cannot be solved. It's useful at this
point to say a bit more about why theoretical computer science is worth studying
(either as a student or as a researcher).
First, theoretical computer science provides critical guidance to applied computer scientists. For example, because the Halting Problem cannot be solved by
any Turing machine, applied computer scientists do not waste their time trying
to write programs that solve this problem, at least not in full generality. There
are many other examples, many of which are related to program verification.
Second, theoretical computer science involves concepts and techniques that
have found important applications. For example, regular expressions, and algorithms that manipulate them, are used in compiler design and in the design of
many programs that process input.
Third, theoretical computer science is intellectually interesting, which leads
some people to study it just out of curiosity. This should not be underestimated:
many important scientific and mathematical discoveries have been made by people who were mainly trying to satisfy their intellectual curiosity. A famous example is number theory, which plays a key role in the design of modern cryptographic systems. The mathematicians who investigated number theory many
years ago had no idea that their work would make possible electronic commerce
[3] In fact, the Turing machine was used by its inventor, Alan Turing, as a basic mathematical blueprint for the first digital computers.


as we know it today.
The plan for the rest of these notes is as follows. The first part of the notes will
focus on finite automata, which are essentially Turing machines without memory.
The study of finite automata is good practice for the study of Turing machines.
But we will also learn about the regular expressions we mentioned earlier. Regular expressions are essentially a way of describing patterns in strings. We will
learn that regular expressions and finite automata are equivalent, in the sense
that the patterns that can be described by regular expressions are also the patterns that can be recognized by finite automata. And we will learn algorithms
that can convert regular expressions into finite automata, and vice-versa. These
are the algorithms that are used in the design of programs, such as compilers,
that process input. We will also learn how to prove that certain computational
problems cannot be solved by finite automata, which also means that certain
patterns cannot be described by regular expressions.
Next, we will study context-free grammars, which are essentially an extension of regular expressions that allows the description of more complex patterns.
Context-free grammars are needed for the precise definition of modern programming languages and therefore play a critical role in the design of compilers.
The final part of the course will focus on general Turing machines and will
culminate in the proof that certain problems, such as the Halting Problem, cannot be solved by Turing machines.

Chapter 2
Finite Automata
In this chapter, we study a very simple model of computation called a finite automaton. Finite automata are useful for solving certain problems but studying
finite automata is also good practice for the study of Turing machines, the general model of computation we will study later in these notes.

2.1 Turing Machines

As explained in the previous chapter, we need to define precisely what we mean
by an algorithm, that is, what we mean when we say that something is computable. Such a definition is called a model of computation. As we said, we want
a model that's simple enough so we can prove theorems about it, but a model
that's also close enough to real-life computation so we can claim that the theorems we prove about our model say something that's relevant about real-life
computation.
A first idea for a model of computation is any of the currently popular high-level programming languages. C++, for example. A C++ program consists
of instructions and variables. We could define C++ precisely but C++ is not
simple. It has complex instructions such as loops and conditional statements,
which can be nested within each other, as well as various other features such as
type conversions, parameter passing and inheritance.
A simpler model would be a low-level assembler or machine language. These
languages are much simpler. They have no variables, no functions, no types and
only very simple instructions. An assembler program is a linear sequence of
instructions (no nesting). These instructions directly access data that is stored
in memory or in a small set of registers. One of these registers is the program
counter, or instruction counter, that keeps track of which instruction is currently
being executed. Typical instructions allow you to set the contents of a memory
location to a given value, or copy the contents of a memory location to another
one. Despite their simplicity, it is widely accepted that anything that can be
computed by a C++ program can be computed by an assembler program. The
evidence is that we have compilers that translate C++ programs into assembler.
But it is possible to define an even simpler model of computation. In this
model, instructions can no longer access memory locations directly by specifying
their address. Instead, we have a memory head that points to a single memory
location. Instructions can access the data under the memory head and then
move the head one location to the right or one location to the left. In addition,
there is only one type of instruction in this model. Each instruction specifies,
for each possible value that could be stored in the current memory location (the
one under the memory head), a new value to be written at that location as well
as the direction in which the memory head should be moved. For example, if
a, b, c are the possible values that could be stored in each memory location, the
table in Figure 2.1 describes one possible instruction. Such a table is called a
transition table. A program then consists of a simple loop that executes one of
these instructions.
This is a very simple model of computation but we may have gone too far:
since the value we write in each memory location depends only on the contents
of that memory location, there is no way to copy a value from one memory location to another. To fix this without reintroducing more complicated instructions,


a    b, R
b    b, L
c    a, R

Figure 2.1: A transition table


      a         b         c
q0    q1, b, R  q0, a, L  q2, b, R
q1    q1, a, L  q1, c, R  q0, b, R
q2    q0, c, R  q2, b, R  q1, c, L
Figure 2.2: A transition table
we simply add to our model a special register we call the state of the program.
Each instruction will now have to consider the current state in addition to the
current memory location. A sample transition table is shown in Figure 2.2, assuming that q0, q1, q2 are the possible states.
The model we have just described is called the Turing machine. (Each program in this model is considered a machine.) To complete the description of
the model, we need to specify a few more details. First, we restrict our attention to decision problems, which are computational problems in which the input
is a string and the output is the answer to some yes/no question about the input. The Halting Problem mentioned in the previous chapter is an example of a
decision problem.
Second, we need to describe how the input is given to a Turing machine,



[Diagram: a control unit that reads and writes a memory and produces a yes/no answer.]

Figure 2.3: A Turing machine

how the output is produced by the machine, and how the machine terminates
its computation. For the input, we assume that initially, the memory of the
Turing machine contains the input and nothing else. For the output, each Turing
machine will have special yes and no states. Whenever one of these states is
entered, the machine halts and produces the corresponding output. This takes
care of termination too.
Figure 2.3 shows a Turing machine. The control unit consists of the transition
table and the state.
The Turing machine is the standard model that is used to study computation mathematically. (As mentioned earlier, the Turing machine was used by
its inventor, Alan Turing, as a basic mathematical blueprint for the first digital
computers.) The Turing machine is clearly a simple model but it is also relevant
to real-life computations because it is fairly easy to write a C++ program that
can simulate any Turing machine, and because Turing machines are powerful
enough to simulate typical assembler instructions. This last point will become
clearer later in these notes when we take a closer look at Turing machines.
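To make the single-instruction loop described above concrete, here is a small C++ sketch (our illustration, not code from the notes) that simulates a Turing machine driven by the transition table of Figure 2.2. The character encodings of states and symbols, the use of a as the blank symbol, and the explicit step bound are all simplifying assumptions of this sketch.

```cpp
#include <cassert>
#include <map>
#include <utility>

// One Turing machine instruction: given (state, symbol), write a symbol,
// move the head left (-1) or right (+1), and enter a new state.
struct Action {
    char nextState;
    char write;
    int dir;   // -1 = L, +1 = R
};

using Table = std::map<std::pair<char, char>, Action>;

// Run the machine for at most maxSteps steps and return the final state.
// The tape is a map from head position to symbol; unset cells read as 'a'.
char run(const Table& delta, std::map<int, char>& tape, char start, int maxSteps) {
    char state = start;
    int head = 0;
    for (int i = 0; i < maxSteps; ++i) {
        char sym = tape.count(head) ? tape[head] : 'a';
        auto it = delta.find({state, sym});
        if (it == delta.end()) break;   // no instruction: halt
        tape[head] = it->second.write;  // write, then move, then change state
        head += it->second.dir;
        state = it->second.nextState;
    }
    return state;
}

// The table of Figure 2.2, with states q0, q1, q2 encoded as '0', '1', '2'.
Table figure22() {
    return {
        {{'0', 'a'}, {'1', 'b', +1}}, {{'0', 'b'}, {'0', 'a', -1}}, {{'0', 'c'}, {'2', 'b', +1}},
        {{'1', 'a'}, {'1', 'a', -1}}, {{'1', 'b'}, {'1', 'c', +1}}, {{'1', 'c'}, {'0', 'b', +1}},
        {{'2', 'a'}, {'0', 'c', +1}}, {{'2', 'b'}, {'2', 'b', +1}}, {{'2', 'c'}, {'1', 'c', -1}},
    };
}
```

Note that Figure 2.2's machine has no yes or no state, so this sketch simply stops after a bounded number of steps; a machine built for a decision problem would halt by entering one of the special output states described above.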


[Diagram: a control unit that reads a read-only input and produces a yes/no answer.]

Figure 2.4: A finite automaton

2.2 Introduction to Finite Automata

Before studying general Turing machines, we will study a restriction of Turing
machines called finite automata. A finite automaton is essentially a Turing machine without memory. A finite automaton still has a state and still accesses its
input one symbol at a time but the input can only be read, not written to, as
illustrated in Figure 2.4. We will study finite automata for at least three reasons.
First, this will help us understand what can be done with the control unit of a
Turing machine. Second, this will allow us to learn how to study computation
mathematically. Finally, finite automata are actually useful for solving certain
types of problems.
Here are a few more details about finite automata. First, in addition to being
read-only, the input must be read from left to right. In other words, the input
head can only move to the right. Second, the computation of a finite automaton
starts at the beginning of the input and automatically ends when the end of
the input is reached, that is, after the last symbol of the input has been read.
Finally, some of the states of the automaton are designated as accepting states.
If the automaton ends its computation in one of those states, then we say that
the input is accepted. In other words, the output is yes. On the other hand, if
the automaton ends its computation in a non-accepting state, then the input is
rejected; the output is no.

      underscore  letter  digit  other
q0    q1          q1      q2     q2
q1    q1          q1      q1     q2
q2    q2          q2      q2     q2

Figure 2.5: The transition table of a finite automaton that determines if the input
is a valid C++ identifier
Now that we have a pretty good idea of what a finite automaton is, let's look
at a couple of examples. A valid C++ identifier consists of an underscore or a
letter followed by any number of underscores, letters and digits. Consider the
problem of determining if an input string is a valid C++ identifier. This is a
decision problem.
Figure 2.5 shows the transition table of a finite automaton that solves this
problem. As implied by the table, the automaton has three states. State q0 is the
start state of the automaton. From that state, the automaton enters state q1 if
the first symbol of the input is an underscore or a letter. The automaton will
remain in state q1 as long as the input consists of underscores, letters and digits.
In other words, the automaton finds itself in state q1 if the portion of the string
it has read so far is a valid C++ identifier.
On the other hand, the automaton enters state q2 if it has decided that the
input cannot be a valid C++ identifier. If the automaton enters state q2, it will
never be able to leave. This corresponds to the fact that once we see a character
that causes the string to be an invalid C++ identifier, there is nothing we can see
later that could fix that. A non-accepting state from which you cannot escape is
sometimes called a garbage state.
State q1 is accepting because if the computation ends in that state, then this
implies that the entire string is a valid C++ identifier. The start state is not an
accepting state because the empty string, the one with no characters, is not a
valid identifier.

[Transition graph: start state q0; q0 → q1 on underscore or letter; q0 → q2 on digit or other; q1 → q1 on underscore, letter or digit; q1 → q2 on other; q2 → q2 on any symbol; q1 is the accepting state.]

Figure 2.6: The transition graph of a finite automaton that determines if the
input is a valid C++ identifier
Let's look at some sample strings. When reading the string input_file,
the automaton enters state q1 and remains there until the end of the string.
Therefore, this string is accepted, which is correct. On the other hand, when
reading the string input-file, the automaton enters state q1 when it sees the
first i but then leaves for state q2 when it encounters the dash. This will cause
the string to be rejected, which is correct since dashes are not allowed in C++
identifiers.
A finite automaton can also be described by using a graph. Figure 2.6 shows
the transition graph of the automaton for valid C++ identifiers. Each state is



[Transition graph: states q0 through q8, with transitions labeled d (for digit) and a dash transition, accepting the two phone number formats.]

Figure 2.7: A finite automaton that determines if the input is a correctly formatted phone number
represented by a node in this graph. Each edge connects two states and is labeled
by an input symbol. If an edge goes from q to q′ and is labeled a, this indicates
that when in state q and reading an a, the automaton enters state q′. Each
such step is called a move or a transition (hence the terms transition table and
transition graph). The start state is indicated with an arrow and the accepting
states have a double circle.
Transition graphs and transition tables provide exactly the same information.
But the graphs make it easier to visualize the computation of the automaton,
which is why we will draw transition graphs whenever possible. There are some
circumstances, however, where it is impractical to draw a graph. We will see
examples later.
Let's consider another decision problem, the problem of determining if an
input string is a phone number in one of the following two formats: 7 digits, or
3 digits followed by a dash and 4 digits. Figure 2.7 shows a finite automaton
that solves this problem. The transition label d stands for digit.
In a finite automaton, from every state, there should be a transition for every
possible input symbol. This means that the graph of Figure 2.7 is missing many
transitions. For example, from state q0, there is no transition labeled with a dash and
there is no transition labeled by a letter. All those missing transitions correspond
to cases in which the input string should be rejected. Therefore, we can have
all of those missing transitions go to a garbage state. We chose not to draw
those transitions and the garbage state simply to avoid cluttering the diagram
unnecessarily.

if (end of input) return no
char c
read c
if (c is not underscore or letter) return no
while (not end of input)
    read c
    if (c is not underscore, letter or digit) return no
return yes

Figure 2.8: A simple algorithm that determines if the input is a valid C++ identifier
It is interesting to compare the finite automata we designed in this section
with C++ programs that solve the same problems. For example, Figure 2.8
shows an algorithm that solves the C++ identifier problem. The algorithm is
described in pseudocode, which is simpler than C++. But even by looking at
pseudocode, the simplicity of finite automata is evident. The pseudocode algorithm uses variables and different types of instructions. The automaton, on the
other hand, consists of only states and transitions. This is consistent with what
we said earlier, that Turing machines, and therefore finite automata, are a much
simpler model of computation than a typical high-level programming language.
At the beginning of this section, we said that finite automata can be useful
for solving certain problems. This relies on the fact that finite automata can be
easily converted into programs, as shown in Figure 2.9. All that we need to add
to this algorithm is a function start_state that returns the start state of the
finite automaton, a function next_state(s, c) that, for every pair (s, c)
where s is a state and c is an input character, returns the next state according to the transition table (or graph) of the finite automaton, and a function
is_accepting(s) that returns true if state s is an accepting state.
Therefore, a possible strategy for solving a decision problem is to design a


char c
state s = start_state()
while (not end of input)
    read c
    s = next_state(s, c)
if (is_accepting(s))
    return yes
else
    return no
Figure 2.9: An algorithm that simulates a finite automaton
finite automaton and then convert it to a pseudocode algorithm as explained
above. In some cases, this can be easier than designing a pseudocode algorithm
directly, for several reasons. One is that the simplicity of the finite automaton model can help us focus more clearly on the problem itself, without being
distracted by the various features that can be used in a high-level pseudocode
algorithm. Another reason is that the computation of a finite automaton can be
visualized by its transition graph, making it easier to understand what is going
on. Finally, when designing a finite automaton we have to include transitions
for every state and every input symbol. This helps to ensure that we consider all
the necessary cases.
The above strategy is used in the design of compilers and other programs
that perform input processing.
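As a concrete illustration of this strategy (our sketch, not code from the notes), the automaton of Figures 2.5 and 2.6 can be turned into a C++ program by supplying the three functions that the algorithm of Figure 2.9 assumes. The enum encoding of states and the use of the standard character-classification functions are choices made for this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// States of the automaton of Figures 2.5 and 2.6.
enum State { Q0, Q1, Q2 };

State start_state() { return Q0; }

// The transition table of Figure 2.5.
State next_state(State s, char c) {
    bool underscoreOrLetter = (c == '_' || std::isalpha(static_cast<unsigned char>(c)));
    bool alsoDigit = underscoreOrLetter || std::isdigit(static_cast<unsigned char>(c));
    switch (s) {
        case Q0: return underscoreOrLetter ? Q1 : Q2;
        case Q1: return alsoDigit ? Q1 : Q2;
        default: return Q2;   // q2 is the garbage state: no escape
    }
}

bool is_accepting(State s) { return s == Q1; }

// The simulation loop of Figure 2.9.
bool is_identifier(const std::string& input) {
    State s = start_state();
    for (char c : input) s = next_state(s, c);
    return is_accepting(s);
}
```

For example, is_identifier("input_file") returns true and is_identifier("input-file") returns false, matching the traces worked out earlier in this section.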

Study Questions
2.2.1. Can finite automata solve problems that high-level pseudocode algorithms cannot?
2.2.2. What are three advantages of finite automata over high-level pseudocode algorithms?

Exercises
2.2.3. Consider the problem of determining if a string is an integer in the following format: an optional minus sign followed by at least one digit. Design
a finite automaton for this problem.
2.2.4. Consider the problem of determining if a string is a number in the following format: an optional minus sign followed by at least one digit, or
an optional minus sign followed by any number of digits, a decimal point
and at least one digit. Design a finite automaton for this problem.
2.2.5. Suppose that a valid C++ identifier is no longer allowed to consist of only
underscores. Modify the finite automaton of Figure 2.6 accordingly.
2.2.6. Add optional area codes to the phone number problem we saw in this
section. That is, consider the problem of determining if a string is a phone
number in the following format: 7 digits, or 10 digits, or 3 digits followed
by a dash and 4 digits, or 3 digits followed by a dash, 3 digits, another
dash and 4 digits. Design a finite automaton for this problem.
2.2.7. Convert the finite automaton of Figure 2.6 into a high-level pseudocode
algorithm by using the technique explained in this section. That is, write
pseudocode for the functions start_state(), next_state(s, c) and is_accepting(s) of Figure 2.9.
2.2.8. Repeat the previous exercise for the finite automaton of Figure 2.7. (Don't
forget the garbage state.)


2.3 Formal Definition

In this section, we define precisely what we mean by a finite automaton. This
is necessary for proving mathematical statements about finite automata and for
writing programs that manipulate finite automata.
From the discussion of the previous section, it should be clear that a finite
automaton consists of four things:

- A finite set of states.
- A special state called the starting state.
- A subset of states called accepting states.
- A transition table or graph that specifies a next state for every possible pair (state, input character).

Actually, to be able to specify all the transitions, we also need to know what the
possible input symbols are. This information should also be considered part of
the finite automaton:

- A set of possible input symbols.
We can also define what it means for a finite automaton to accept a string:
run the algorithm of Figure 2.9 and accept if the algorithm returns yes.
The above definition of a finite automaton and its operation should be pretty
clear. But it has a couple of problems. The first one is that it doesn't say exactly
what a transition table or graph is. That wouldn't be too hard to fix, but the
second problem with the above definition is more serious: the operation of the
finite automaton is defined in terms of a high-level pseudocode algorithm. For
this definition to be complete, we would need to also define what those algorithms are. But recall that we are interested in finite automata mainly because
they are supposed to be easy to define. Finite automata are not going to be


simpler than another model if their definition includes a definition of that other
model.
In the rest of this section, we will see that it is possible to define a finite automaton and its operation without referring to either graphs or algorithms. This
will be our formal definition of a finite automaton, in the sense that it is precise
and complete. The above definition, in terms of a graph and an algorithm, will
be considered an informal definition.
The formal definition of a finite automaton will be done as a sequence of
definitions. The purpose of most of these is only to define useful terms.
Definition 2.1 An alphabet is a finite set whose elements are called symbols.
This definition allows us to talk about "the input alphabet of a finite automaton" instead of having to say "the set of possible input symbols of a finite
automaton". In this context, symbols are sometimes also called letters or characters, as we did in the previous section.
Definition 2.2 A deterministic finite automaton (DFA) is a 5-tuple (Q, Σ, δ, q0, F) where

1. Q is a finite set of states.¹

2. Σ is an alphabet called the input alphabet.

3. δ : Q × Σ → Q is the transition function.

4. q0 ∈ Q is the starting state.

5. F ⊆ Q is the set of accepting states.
Note that this definition is for deterministic finite automata. We will encounter another type of finite automata later in these notes.
¹It would be more correct to say that Q is a finite set whose elements are called states, as we
did in the definition of an alphabet. But this shorter style is easier to read.


Example 2.3 Consider the DFA shown in Figure 2.6. Here is what this DFA looks
like according to the formal definition. The DFA is ({q0, q1, q2}, Σ, δ, q0, {q1})
where Σ is the set of all characters that appear on a standard keyboard and δ is
defined as follows:

    δ(q0, c) = q1 if c is an underscore or a letter, and q2 otherwise
    δ(q1, c) = q1 if c is an underscore, a letter or a digit, and q2 otherwise
    δ(q2, c) = q2,

for every c ∈ Σ.

This is called the formal description of the DFA. □

Another way of presenting a transition function is with a table, as shown
in Figure 2.1. Equations can be more concise and easier to understand. The
transition function can also be described by a transition graph, as shown in Figure 2.6. As mentioned earlier, transition graphs have the advantage of helping
us visualize the computation of the DFA.
We now define what it means for a DFA to accept its input string, without
referring to an algorithm. Instead, we will only talk about the sequence of states
that the DFA goes through while processing its input string.
Definition 2.4 A string over an alphabet A is a finite sequence of symbols from A.
Definition 2.5 Let M = (Q, Σ, δ, q0, F) be a DFA and let w = w1 · · · wn be a string
of length n over Σ.² Let r0, r1, . . . , rn be the sequence of states defined by

    r0 = q0
    ri = δ(ri−1, wi),  for i = 1, . . . , n.

Then M accepts w if and only if rn ∈ F.

²It is understood here that w1, . . . , wn are the individual symbols of w.


Note how the sequence of states is defined by only referring to the transition
function, without referring to an algorithm that computes that sequence of states.
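Although the definition deliberately avoids algorithms, the sequence of states is of course easy to compute. Here is a short Python sketch (our illustration, not part of the notes) of Definition 2.5; the transition function is stored as a dictionary, and the example DFA, whose state names are our own, accepts strings over {0, 1} that end in 1.

```python
def accepts(delta, q0, F, w):
    # Follow Definition 2.5: r0 = q0 and ri = delta(r(i-1), wi);
    # accept if and only if the final state rn is in F.
    r = q0
    for c in w:
        r = delta[(r, c)]
    return r in F

# Example DFA over {0, 1} for the language of strings that end in 1.
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q1",
}
print(accepts(delta, "q0", {"q1"}, "0101"))  # True
print(accepts(delta, "q0", {"q1"}, "10"))    # False
```

Note that the empty string is correctly handled: the loop body never runs, so the answer depends only on whether q0 is accepting.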
Each DFA solves a problem by accepting some strings and rejecting the others. Another way of looking at this is to say that the DFA recognizes a set of
strings, those that it accepts.
Definition 2.6 A language over an alphabet A is a set of strings over A.
Definition 2.7 The language recognized by a DFA M (or the language of M) is
the following set:

    L(M) = {w ∈ Σ* | w is accepted by M}

where Σ is the input alphabet of M and Σ* denotes the set of all strings over Σ.
The languages that are recognized by DFAs are called regular.
Definition 2.8 A language is regular if it is recognized by some DFA.
Many interesting languages are regular but we will soon learn that many
others are not. Those languages require algorithms that are more powerful than
DFAs.

Study Questions
2.3.1. What is the advantage of the formal definition of a DFA over the informal
definition presented at the beginning of this section?
2.3.2. What is an alphabet?
2.3.3. What is a string?
2.3.4. What is a language?


2.3.5. What is a DFA?

2.3.6. What does it mean for a DFA to accept a string?
2.3.7. What does it mean for a language to be recognized by a DFA?
2.3.8. What is a regular language?

Exercises
2.3.9. Give a formal description of the DFA of Figure 2.7. (Don't forget the
garbage state.)
2.3.10. Give a formal description of the DFA of Exercise 2.2.3.
2.3.11. Give a formal description of the DFA of Exercise 2.2.4.

2.4 More Examples

The examples of DFAs we have seen so far, in the text and in the exercises, have
been inspired by real-world applications. We now take a step back and consider
what else DFAs can do. We will see several languages that don't have obvious
applications but that illustrate basic techniques that are useful in the design of
DFAs.
Unless otherwise specified, in the examples of this section the input alphabet
is {0, 1}.
Example 2.9 Figure 2.10 shows a DFA for the language of strings that start
with 1. □
Example 2.10 Now consider the language of strings that end in 1. One difficulty
here is that there is no mechanism in a DFA that allows us to know whether the


Figure 2.10: A DFA for the language of strings that start with 1
symbol we are currently looking at is the last symbol of the input string. So we
have to always be ready, as if the current symbol was the last one.³ What this
means is that after reading every symbol, we have to be in an accepting state if and
only if the portion of the input string we've seen so far is in the language.
Figure 2.11 shows a DFA for this language. Strings that begin in 1 lead to
state q1 while strings that begin in 0 lead to state q2. Further 0s and 1s cause
the DFA to move between these states as needed.
Notice that the starting state is not an accepting state because the empty
string, the string of length 0 that contains no symbols, does not end in 1. But
then, states q0 and q2 play the same role in the DFA: they're both non-accepting
states and the transitions coming out of them lead to the same states. This
implies that these states can be merged to get the slightly simpler DFA shown in
Figure 2.12. □
Example 2.11 Consider the language of strings of length at least two that begin
and end with the same symbol. A DFA for this language can be obtained by
combining the ideas of the previous two examples, as shown in Figure 2.13. □
³To paraphrase a well-known quote attributed to Jeremy Schwartz, "Read every symbol as if
it were your last. Because one day, it will be."


Figure 2.11: A DFA for the language of strings that end in 1

Figure 2.12: A simpler DFA for the language of strings that end in 1


Figure 2.13: A DFA for the language of strings of length at least two that begin
and end with the same symbol


Figure 2.14: A DFA for the language of strings that contain the substring 001
Example 2.12 Consider the language of strings that contain the string 001 as
a substring. What this means is that the symbols 0, 0, 1 occur consecutively
within the input string. For example, the string 0100110 is in the language but
0110110 is not.
Figure 2.14 shows a DFA for this language. The idea is that the DFA remembers the longest prefix of 001 that ends the portion of the input string that has
been seen so far. For example, initially, the DFA has seen nothing, so the starting
state corresponds to the empty string, which we denote ε. If the DFA then sees
a 0, it moves to state q1. If it then sees a 1, then the portion of the input string
that the DFA has seen so far ends in 01, which is not a prefix of 001. So the DFA
goes back to state q0. □
Example 2.13 Consider the language of strings that contain an even number of
1s. Initially, the number of 1s is 0, which is even. After seeing the first 1, that
number will be 1, which is odd. After seeing each additional 1, the DFA will
toggle back and forth between even and odd. This idea leads to the DFA shown
in Figure 2.15. Note how the input symbol 0 never affects the state of the DFA.
We say that in this DFA, and with respect to this language, the symbol is neutral. □
Example 2.14 The above example can be generalized. A number is even if it is
a multiple of 2. So consider the language of strings that contain a number of 1s


Figure 2.15: A DFA for the language of strings that contain an even number of 1s

Figure 2.16: A DFA for the language of strings that contain a number of 1s that's
a multiple of 3
that's a multiple of 3. The idea is to count modulo 3, as shown in Figure 2.16. □
Example 2.15 We can generalize this even further. For every number k ≥ 2,
consider the language of strings that contain a number of 1s that's a multiple of
k. Note that this defines an infinite number of languages, one for every possible
value of k. Each one of those languages is regular since, for every k ≥ 2, a DFA
can be constructed to count modulo k, as shown in Figure 2.17. □
Example 2.16 Let's go back to the modulo 3 counting example and generalize it
in another way. Suppose that the input alphabet is now {0, 1, 2, . . . , 9} and


Figure 2.17: A DFA for the language of strings that contain a number of 1s that's
a multiple of k
consider the language of strings that have the property that the sum of their
digits is a multiple of 3. For example, 315 is in the language because 3 + 1 + 5 = 9
is a multiple of 3. The idea is still to count modulo 3 but there are now more
cases to consider, as shown in Figure 2.18. □
Example 2.17 Now let's go all out and combine the generalizations of the last
two examples. That is, over the alphabet {0, 1, 2, . . . , 9}, for every number k,
let's consider the language of strings that have the property that the sum of their
digits is a multiple of k. Once again, the idea is to count modulo k. For example,
Figure 2.19 shows the DFA for k = 4.
It should be pretty clear that a DFA exists for every k. But the transition
diagram would be difficult to draw and it would likely be ambiguous. This is
an example where we are better off describing the DFA in words, that is, by
giving a formal description. Here it is: the DFA is (Q, Σ, δ, q0, F) where

    Q = {q0, q1, q2, . . . , qk−1}
    Σ = {0, 1, 2, . . . , 9}
    F = {q0}

Figure 2.18: A DFA for the language of strings whose digits add to a multiple of 3
and δ is defined as follows: for every qi ∈ Q and c ∈ Σ,

    δ(qi, c) = qj,  where j = (i + c) mod k. □
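This formal description is easy to turn into a program. The sketch below (a Python illustration of ours, not from the notes) builds the transition function for a given k and simulates the resulting DFA.

```python
def digit_sum_dfa(k):
    # States are 0, ..., k-1; state i means the digits read so far
    # sum to i modulo k.  delta(q_i, c) = q_j where j = (i + c) mod k.
    delta = {(i, c): (i + int(c)) % k
             for i in range(k) for c in "0123456789"}
    return delta, 0, {0}  # (transition function, start state, accepting states)

def accepts(dfa, w):
    delta, q0, F = dfa
    r = q0
    for c in w:
        r = delta[(r, c)]
    return r in F

print(accepts(digit_sum_dfa(3), "315"))  # True: 3 + 1 + 5 = 9
print(accepts(digit_sum_dfa(4), "315"))  # False: 9 is not a multiple of 4
```

Note that the DFA has exactly k states regardless of how long the input string is, which is the point: the memory of a DFA is fixed in advance.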

Example 2.18 We end this section with three simple DFAs for the basic but
important languages Σ* (the language of all strings), ∅ (the empty language)
and {ε} (the language that contains only the empty string). The DFAs are shown
in Figure 2.20. □

Exercises
2.4.1. Modify the DFA of Example 2.11 so that strings of length 1 are also accepted.
2.4.2. Give DFAs for the following languages. In all cases, the alphabet is {0, 1}.


Figure 2.19: A DFA for the language of strings whose digits add to a multiple of 4

Figure 2.20: DFAs for the languages Σ*, ∅ and {ε}


a) The language of strings of length at least two that begin with 0 and
end in 1.
b) The language of strings of length at least two that have a 1 as their
second symbol.
c) The language of strings of length at least k that have a 1 in position
k. Do this in general, for every k ≥ 1. (You did the case k = 2 in part
(b).)
2.4.3. Give DFAs for the following languages. In all cases, the alphabet is {0, 1}.
a) The language of strings of length at least two whose first two symbols
are the same.
b) The language of strings of length at least two whose last two symbols
are the same.
c) The language of strings of length at least two that have a 1 in the
second-to-last position.
d) The language of strings of length at least k that have a 1 in position
k from the end. Do this in general, for every k ≥ 1. (You did the case
k = 2 in part (c).)
2.4.4. Give DFAs for the following languages. In all cases, the alphabet is {0, 1}.
a) The language of strings that contain at least one 1.
b) The language of strings that contain exactly one 1.
c) The language of strings that contain at least two 1s.
d) The language of strings that contain fewer than two 1s.
e) The language of strings that contain at least k 1s. Do this in general,
for every k ≥ 0. (You did the case k = 2 in part (c).)


2.4.5. Give DFAs for the following languages. In both cases, the alphabet is the
set of digits {0, 1, 2, . . . , 9}.
a) The language of strings that represent a multiple of 3. For example,
the string 036 is in the language because 36 is a multiple of 3.
b) The generalization of the previous language where 3 is replaced by
any number k ≥ 2.
2.4.6. This exercise asks you to show that DFAs can add, at least when the numbers are presented in certain ways. Consider the alphabet that consists of
symbols of the form [abc] where a, b and c are digits. For example,
[631] and [937] are two of the symbols in this alphabet. If w is a string
of digits, let n(w) denote the number represented by w. For example, if
w is the string 428, then n(w) is the number 428. Now give DFAs for the
following languages.

a) The language of strings of the form [x0 y0 z0][x1 y1 z1] · · · [xn yn zn]
such that

    n(xn · · · x1 x0) + n(yn · · · y1 y0) = n(zn · · · z1 z0).

For example, [279][864][102] is in the language because 182 +
67 = 249. (This language corresponds to reading the numbers from
right to left and position by position. Note that this is how we read
numbers when we add them by hand.)

b) The language of strings of the form [xn yn zn] · · · [x1 y1 z1][x0 y0 z0]
such that

    n(xn · · · x1 x0) + n(yn · · · y1 y0) = n(zn · · · z1 z0).

For example, [102][864][279] is in the language because 182 +
67 = 249. (This time, the numbers are read from left to right.)


Figure 2.21: A DFA for the language of strings that contain the substring 001

2.5 Closure Properties

In the previous section, we considered the language of strings that contain the
substring 001 and designed a DFA that recognizes it. The DFA is shown again
in Figure 2.21. Suppose that we are now interested in the language of strings
that do not contain the substring 001. Is that language regular?
A quick glance at the DFA should reveal a solution: since strings that contain
001 end up in the accepting state while strings that do not contain 001 end up in one of
the other states, all we have to do is switch the acceptance status of every state
in the above DFA to obtain a DFA for this new language. So the answer is, yes,
the new language is regular.
This new language is the complement of the first one.⁴ And the above technique should work for any regular language L: by switching the acceptance status of the states of a DFA
for L, we should get a DFA for L̄.
In a moment, we will show in detail that the complement of a regular language is always regular. This is an example of what is called a closure property.
Definition 2.19 A set is closed under an operation if applying that operation to
elements of the set results in an element of the set.
⁴Note that to define precisely what is meant by the complement of a language, we need
to know the alphabet over which the language is defined. If L is a language over Σ, then its
complement is defined as follows: L̄ = {w ∈ Σ* | w ∉ L}. In other words, L̄ = Σ* − L.

For example, the set of natural numbers is closed under addition and multiplication but not under subtraction or division because 2 − 3 is negative and
1/2 is not an integer. The set of integers is closed under addition, subtraction
and multiplication but not under division. The set of rational numbers is closed
under those four operations but not under the square root operation since √2 is not a
rational number.⁵
Theorem 2.20 The class of regular languages is closed under complementation.
Proof Suppose that A is regular and let M be a DFA for A. We construct a DFA
M′ for Ā.
Let M′ be the result of switching the accepting status of every state in M.
More precisely, if M = (Q, Σ, δ, q0, F), then M′ = (Q, Σ, δ, q0, Q − F). We claim
that L(M′) = Ā.
To prove that, suppose that w ∈ A. Then, in M, w leads to an accepting state.
This implies that in M′, w leads to the same state but this state is non-accepting
in M′. Therefore, M′ rejects w.
A similar argument shows that if w ∉ A, then M′ accepts w. Therefore,
L(M′) = Ā. □
Note that the proof of this closure property is constructive: it establishes the
existence of a DFA for Ā by providing an algorithm that constructs that DFA.
Proofs of existence are not always constructive. But when they are, the algorithms they provide are often useful.
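Since the proof is constructive, it can be phrased as a one-line algorithm. The Python sketch below (our rendering; the tuple representation of a DFA is our own convention, not from the notes) switches F to Q − F.

```python
def complement(M):
    # M = (Q, Sigma, delta, q0, F).  The complement DFA keeps everything
    # the same but replaces the accepting states F with Q - F.
    Q, Sigma, delta, q0, F = M
    return (Q, Sigma, delta, q0, Q - F)

# Example: a DFA over {0, 1} that accepts strings ending in 1.
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}
M = ({"q0", "q1"}, {"0", "1"}, delta, "q0", {"q1"})
Mc = complement(M)
print(Mc[4])  # the accepting states have been switched to {'q0'}
```

Applying the construction twice returns the original set of accepting states, as one would expect of complementation.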
At this point, it is natural to wonder if the class of regular languages is closed
under other operations. A natural candidate is the union operation: if A and B
⁵The proof of this fact is a nice example of a proof by contradiction and of the usefulness of
basic number theory. Suppose that √2 is a rational number. Let a and b be positive integers
such that √2 = a/b. Since fractions can be simplified, we can assume that a and b have no
common factors (other than 1). Now, 2 = a²/b² and 2b² = a². This implies that a is even and
that 4 divides a². But then, 4 must also divide 2b², which implies that 2 divides b. Therefore, 2
divides both a and b, contradicting the fact that a and b have no common factors.


are two languages over Σ, then the union of A and B is A ∪ B = {w ∈ Σ* | w ∈ A or w ∈ B}.⁶
For example, consider the language of strings that contain a number of 1s
that's either a multiple of 2 or a multiple of 3. Call this language L. It turns out
that L is the union of two languages we have seen before:

    L = {w ∈ {0, 1}* | the number of 1s in w is a multiple of 2}
        ∪ {w ∈ {0, 1}* | the number of 1s in w is a multiple of 3}
In the previous section, we constructed DFAs for these two simpler languages.
And a DFA for L can be designed by essentially simulating the DFAs for these
two simpler languages in parallel. That is, let M1 and M2 be DFAs for the simpler
languages. A DFA for L will store the current state of both M1 and M2, and
update these states according to the transition functions of these DFAs. This can
be implemented by having each state of the DFA for L be a pair that combines
a state of M1 with a state of M2. The result is the DFA of Figure 2.22. The 1
transitions essentially add 1 to each of the numbers stored in each state. The
first number counts modulo 2 while the second counts modulo 3. The accepting
states are those for which either count is 0.
This idea can be generalized to show that the union of any two regular languages is always regular.
Theorem 2.21 The class of regular languages is closed under union.
Proof Suppose that A1 and A2 are regular and let M1 and M2 be DFAs for these
languages. We construct a DFA M that recognizes A1 ∪ A2.
⁶We could also consider the union of languages that are defined over different alphabets. In
that case, the alphabet for the union would be the union of the two underlying alphabets: if A is
a language over Σ1 and B is a language over Σ2, then A ∪ B = {w ∈ (Σ1 ∪ Σ2)* | w ∈ A or w ∈ B}.
In these notes, we will normally consider only the union of languages over the same alphabet
because this keeps things simpler and because this is probably the most common situation that
occurs in practice.


Figure 2.22: A DFA for the language of strings that contain a number of 1s that's
either a multiple of 2 or a multiple of 3


The idea, as explained above, is that M is going to simulate M1 and M2 in
parallel. More precisely, if after reading a string w, M1 would be in state r1 and
M2 would be in state r2, then M, after reading the string w, will be in state
(r1, r2).
Here are the full details. Suppose that Mi = (Qi, Σ, δi, qi, Fi), for i = 1, 2.
Then let M = (Q, Σ, δ, q0, F) where

    Q = Q1 × Q2
    q0 = (q1, q2)
    F = {(r1, r2) | r1 ∈ F1 or r2 ∈ F2}

and δ is defined as follows:

    δ((r1, r2), a) = (δ1(r1, a), δ2(r2, a)),  for every r1 ∈ Q1, r2 ∈ Q2 and a ∈ Σ.

Because the start state of M consists of the start states of M1 and M2, and
because M updates its state according to the transition functions of M1 and M2,
it should be clear that after reading a string w, M will be in state (r1, r2) where
r1 and r2 are the states that M1 and M2 would be in after reading w.⁷ Now, if
w ∈ A1 ∪ A2, then either r1 ∈ F1 or r2 ∈ F2, which implies that M accepts w. The
reverse is also true. Therefore, L(M) = A1 ∪ A2. □
Corollary 2.22 The class of regular languages is closed under intersection.
⁷We could provide more details here, if we thought our readers needed them. The idea is
to refer to the details in the definition of acceptance. Suppose that w = w1 · · · wn is a string
of length n and that r0, r1, . . . , rn is the sequence of states that M1 goes through while reading
w. Suppose that s0, s1, . . . , sn is the sequence of states that M2 goes through while reading w.
Then (r0, s0), (r1, s1), . . . , (rn, sn) is the sequence of states that M goes through while reading w
because q0 = (q1, q2) = (r0, s0) and, for every i > 0,

    δ((ri−1, si−1), wi) = (δ1(ri−1, wi), δ2(si−1, wi)) = (ri, si).

(By the principle of mathematical induction.)


Proof In the proof of the previous theorem, change the definition of F as
follows:

    F = {(r1, r2) | r1 ∈ F1 and r2 ∈ F2}.

In other words, F = F1 × F2. □
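Both constructions can be carried out mechanically. The Python sketch below (our rendering of the pair construction; the `union` flag is a convenience of ours, not notation from the notes) builds M from M1 and M2 and tries it on the two modular-counting DFAs from the example above.

```python
from itertools import product

def pair_dfa(M1, M2, union=True):
    # Each state of M is a pair (r1, r2).  With union=True, M accepts when
    # r1 is in F1 or r2 is in F2 (Theorem 2.21); with union=False, both
    # must hold (Corollary 2.22, intersection).
    Q1, Sigma, d1, q1, F1 = M1
    Q2, _, d2, q2, F2 = M2
    Q = set(product(Q1, Q2))
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
             for (r1, r2) in Q for a in Sigma}
    if union:
        F = {(r1, r2) for (r1, r2) in Q if r1 in F1 or r2 in F2}
    else:
        F = {(r1, r2) for (r1, r2) in Q if r1 in F1 and r2 in F2}
    return Q, Sigma, delta, (q1, q2), F

def accepts(M, w):
    Q, Sigma, delta, q0, F = M
    r = q0
    for c in w:
        r = delta[(r, c)]
    return r in F

# M1: the number of 1s is a multiple of 2; M2: a multiple of 3.
d1 = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
d2 = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 2,
      (2, "0"): 2, (2, "1"): 0}
M1 = ({0, 1}, {"0", "1"}, d1, 0, {0})
M2 = ({0, 1, 2}, {"0", "1"}, d2, 0, {0})
M = pair_dfa(M1, M2)
print(accepts(M, "101"))    # True: two 1s, a multiple of 2
print(accepts(M, "10101"))  # True: three 1s, a multiple of 3
print(accepts(M, "1"))      # False
```

The constructed DFA has |Q1| × |Q2| states, matching the six states of the DFA in Figure 2.22.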

So we now know that the class of regular languages is closed under the three
basic set operations: complementation, union and intersection. And the proofs
of these closure properties are all constructive.
We end this section with another operation that's specific to languages. If A
and B are two languages over Σ, then the concatenation of A and B is

    AB = {xy | x ∈ A and y ∈ B}.

That is, the concatenation of A and B consists of those strings we can form, in
all possible ways, by taking a string from A and appending to it a string from B.
For example, suppose that A is the language of strings that consist of an even
number of 0s (no 1s) and that B is the language of strings that consist of an
odd number of 1s (no 0s). That is,
    A = {0^k | k is even}
    B = {1^k | k is odd}

Then AB is the language of strings that consist of an even number of 0s followed
by an odd number of 1s:

    AB = {0^i 1^j | i is even and j is odd}.
Here's another example that will demonstrate the usefulness of both the
union and concatenation operations. Let N be the language of numbers defined in Exercise 2.2.4: strings that consist of an optional minus sign followed
by at least one digit, or an optional minus sign followed by any number of digits, a decimal point and at least one digit. Let D denote the language of strings


Figure 2.23: A DFA for the language AB

that consist of any number of digits. Let D+ denote the language of strings that
consist of at least one digit. Then the language N can be defined as follows:

    N = {ε, -}D+ ∪ {ε, -}D{.}D+.

In other words, a language that took 33 words to describe can now be defined with
a mathematical expression that's about a third of a line long. In addition, the
mathematical expression helps us visualize what the strings of the language look
like.
The obvious question now is whether the class of regular languages is closed
under concatenation, and whether we can prove this closure property constructively.
Closure under concatenation is trickier to prove. In many cases, it is easy to
design a DFA for the concatenation of two particular languages. For example,
Figure 2.23 shows a DFA for the language AB mentioned above. (As usual, missing transitions go to a garbage state.) This DFA is essentially a DFA for A, on the
left, connected to a DFA for B, on the right. As soon as the DFA on the left sees a
1 while in its accepting state, the computation switches over to the DFA on the
right.
The above example suggests an idea for showing that the concatenation of
two regular languages is always regular. Suppose that M1 and M2 are DFAs for


Figure 2.24: An idea for showing that the class of regular languages is closed
under concatenation

A and B. We want a DFA for AB to accept a string in A followed by a string in B.
That is, a string that goes from the start state of M1 to an accepting state of M1,
followed by a string that goes from the start state of M2 to an accepting state of
M2. So what if we add to each accepting state of M1 all the transitions that
come out of the starting state of M2? This is illustrated by Figure 2.24. (The
new transitions are shown in a dashed pattern.) This will cause the accepting
states of M1 to essentially act as if they were the start state of M2 .
But there is a problem with this idea: the accepting states of M1 may now
have multiple transitions labeled by the same symbol. In other words, when
the DFA reaches an accepting state of M1 , it does not know whether to continue
computing in M1 or whether to switch to M2 . And this is the key difficulty in
proving closure under concatenation: given a string w, how can a DFA determine
where to split w into x and y so that x A and y B?
In the next chapter, we will develop the tools we need to solve this problem.


Study Questions
2.5.1. What does it mean for a set to be closed under a certain operation?
2.5.2. What is the concatenation of two languages?

Exercises
2.5.3. Give DFAs for the complement of each of the languages of Exercise 2.4.2.
2.5.4. Each of the following languages is the union or intersection of two simpler
languages. In each case, give DFAs for the two simpler languages and then
use the pair construction from the proof of Theorem 2.21 to obtain a DFA
for the more complex language. In all cases, the alphabet is {0, 1}.
a) The language of strings of length at least two that have a 1 in their
second position and also contain at least one 0.
b) The language of strings that contain at least two 1s or at least two
0s.
c) The language of strings that contain at least two 1s and at most
one 0.
d) The language of strings that contain at least two 1s and an even
number of 0s.
2.5.5. Give DFAs for the following languages. In all cases, the alphabet is
{0, 1, #}.
a) The language of strings of the form 0^i #1^j where i is even and j ≥ 2.
b) The language of strings of the form 0^i 1^j where i is even and j ≥ 2.


Chapter 3
Nondeterministic Finite Automata
So far, all our finite automata have been deterministic. This means that each
one of their moves is completely determined by the current state and the current input symbol. In this chapter, we learn that finite automata can be nondeterministic. Nondeterministic finite automata are a useful tool for showing
that languages are regular. They are also a good introduction to the important
concept of nondeterminism.

3.1 Introduction

In this section, we introduce the concept of a nondeterministic finite automaton
(NFA) through some examples. In the next section, we will formally define what
we mean.
Consider again the language of strings that contain the substring 001. Figure 3.1 shows the DFA we designed in the previous chapter for this language.
Now consider the finite automaton of Figure 3.2. This automaton is not a
DFA for two reasons. First, some transitions are missing. For example, state q1
has no transition labeled 1. If the NFA is in that state and the next input symbol


Figure 3.1: A DFA for the language of strings that contain the substring 001
Figure 3.1: A DFA for the language of strings that contain the substring 001

Figure 3.2: An NFA for the language of strings that contain the substring 001


is a 1, we consider that the NFA is stuck. It can't finish reading the input string
and is unable to accept it.
Second, some states have multiple transitions for the same input symbol. In
the case of this NFA, there are two transitions labeled 0 coming out of state q0 .
What this means is that when in that state and reading a 0, the NFA has a choice:
it can stay in state q0 or move to state q1 .
How does the NFA make that choice? We simply consider that if there is
an option that eventually leads the NFA to accept the input string, the NFA will
choose that option. In this example, if the input string contains the substring
001, the NFA will wait in state q0 until it reaches an occurrence of the substring
001 (there may be more than one), move to the accepting state as it reads the
substring 001, and then finish reading the input string while looping in the
accepting state.
On the other hand, if the input string does not contain the substring 001,
then the NFA will do something, but whatever it does will not allow it to reach
the accepting state. Thats because to move from the start state to the accepting
state requires that the symbols 0, 0 and 1 occur consecutively in the input string.
Here's another example. Consider the language of strings that contain a
number of 1s that's either a multiple of 2 or a multiple of 3. In the previous
chapter, we observed that this language is the union of two simpler languages
and then used the pair construction to obtain the DFA of Figure 2.22.
An NFA for this language is shown in Figure 3.3. It combines the DFAs of the
two simpler languages in a much simpler way than the pair construction.
This NFA illustrates another feature that distinguishes NFAs from DFAs: transitions that are labeled ε. The NFA can use these ε transitions without reading
any input symbols. This particular NFA has two ε transitions, both coming out
of the start state. This means that in its start state, the NFA has two options: it
can move to either state q10 or state q20. And it makes that choice before reading
the first symbol of the input string.
This NFA operates as described earlier: given multiple options, if there is one
that eventually leads the NFA to accept the input string, the NFA will choose that


Figure 3.3: An NFA for the language of strings that contain a number of 1s that's
either a multiple of 2 or a multiple of 3


Figure 3.4: A DFA for the language of strings of length at least 3 that have a 1
in position 3 from the end
option. In this example, if the number of 1s in the input string is a multiple of 2
but not a multiple of 3, the NFA will choose to move to state q10 . If the number
of 1s is a multiple of 3 but not a multiple of 2, the NFA will choose to move to
state q20 . If the number of 1s is both a multiple of 2 and a multiple of 3, the
NFA will choose to move to either state q10 or q20 , and eventually accept either
way. If the number of 1s is neither a multiple of 2 nor a multiple of 3, then the
NFA will not be able to accept the input string.
Here's one more example. Consider the language of strings of length at least
3 that have a 1 in position 3 from the end. We can design a DFA for this language
by having the DFA remember the last three symbols it has seen, as shown in
Figure 3.4. The start state is q000 , which corresponds to assuming that the input
string is preceded by 000. This eliminates special cases while reading the first
two symbols of the input string.
Figure 3.5 shows an NFA for the same language. This NFA is surprisingly


Figure 3.5: An NFA for the language of strings of length at least 3 that have a 1
in position 3 from the end
simpler than the DFA. If the input string does contain a 1 in position 3 from the
end, the NFA will wait in the start state until it reaches that 1 and then move to
the accepting state as it reads the last three symbols of the input string.
On the other hand, if the input string does not contain a 1 in position 3 from
the end, then the NFA will either fall short of reaching the accepting state or it
will reach it before having read the last symbol of the input string. In that case,
it will be stuck, unable to finish reading the input string. Either way, the input
string will not be accepted.
This example makes two important points. One is that NFAs can be much
simpler than DFAs that recognize the same language and, consequently, it can
be much easier to design an NFA than a DFA. Second, we consider that an NFA
accepts its input string only if it is able to read the entire input string.
To summarize, an NFA is a finite automaton that may have missing transitions, multiple transitions coming out of a state for the same input symbol, and
transitions labeled ε that can be used without reading any input symbol. An
NFA accepts a string if there is a way for the NFA to read the entire string and
end up in an accepting state.
Earlier, we said that when confronted with multiple options, we consider
that the NFA makes the right choice: if there is an option that eventually leads
to acceptance, the NFA will choose it. Now, saying that the NFA makes the right
choice, if there is one, does not explain how the NFA makes that choice. One


way to look at this is to simply pretend that the NFA has the magical ability to
make the right choice...
Of course, real computers aren't magical. In fact, they're deterministic. So
a more realistic way of looking at the computation of an NFA is to imagine that
the NFA explores all the possible options, looking for one that will allow it to
accept the input string. So it's not that NFAs are magical, it's that they are
somewhat misleading: when we compare the NFA of Figure 3.5 with the DFA of
Figure 3.4, the NFA looks much simpler but, in reality, it hides a large amount
of computation.
So why are we interested in NFAs if they aren't a realistic model of computation?
One answer is that they are useful. We will soon learn that there is a
simple algorithm that can convert any NFA into an equivalent DFA, one that
recognizes the same language. And for some languages, it can be much easier to
design an NFA than a DFA. This means that for those languages, it is much easier
to design an NFA and then convert it to a DFA than to design the DFA directly.
Another reason for studying NFAs is that they are a good introduction to
the concept of nondeterminism. In the context of finite automata, NFAs are no
more powerful than DFAs: they do not recognize languages that can't already
be recognized by DFAs. But in other contexts, nondeterminism seems to result
in additional computational power.
Without going into the details, here's the most famous example. Algorithms
are generally considered to be efficient if they run in polynomial time. There
is a wide variety of languages that can be recognized by polynomial-time
algorithms. But there are also many others that can be recognized by
nondeterministic polynomial-time algorithms but for which no deterministic
polynomial-time algorithm is known. It is widely believed that most of these
languages cannot be recognized by deterministic polynomial-time algorithms.
After decades of effort, researchers are still unable to prove this but their
investigations have led to deep insights into the complexity of computational
problems.1
1 What we are referring to here is the famous P vs NP problem and the theory of
NP-completeness. Let P be the class of languages that can be recognized by
deterministic algorithms that run in polynomial time. Let NP be the class of
languages that can be recognized by nondeterministic algorithms that run in
polynomial time. Then proving that there are languages that can be recognized
by nondeterministic polynomial-time algorithms but not by deterministic
polynomial-time algorithms is equivalent to showing that P is a strict subset
of NP. The consensus among experts is that this is true but no proof has yet
been discovered. However, researchers have discovered an amazing connection
between a wide variety of languages that belong to NP but are not known to
belong to P: if a single one of these languages was shown to belong to P, then
every language in NP, all of them, would belong to P. This property is called
NP-completeness. The fact that a language is NP-complete is considered strong
evidence that there are no efficient algorithms that recognize it. The P vs NP
problem is considered one of the most important open problems in all of
mathematics, as evidenced by the fact that the Clay Mathematics Institute has
offered a million dollar prize to the first person who solves it.

Study Questions

3.1.1. What is an NFA?

3.1.2. What does it mean for an NFA to accept a string?

Exercises

3.1.3. Give NFAs for the following languages. Each NFA should respect the
specified limits on the number of states and transitions. (Transitions labeled
with two symbols count as two transitions.) In all cases, the alphabet is
{0, 1}.

a) The language of strings of length at least two that begin with 0 and
end in 1. No more than three states and four transitions.

b) The language of strings of length at least two whose last two symbols
are the same. No more than four states and six transitions.

c) The language of strings of length at least two that have a 1 in the
second-to-last position. No more than three states and five transitions.


d) The language of strings of length at least k that have a 1 in position
k from the end. Do this in general, for every k ≥ 1. No more than
k + 1 states and 2k + 1 transitions. (You did the case k = 2 in the
previous part.)
e) The language of strings that contain exactly one 1. No more than
two states and three transitions.

3.2 Formal Definition

In this section, we formally define what an NFA is. As was the case with DFAs, a
formal definition is necessary for proving mathematical statements about NFAs
and for writing programs that manipulate NFAs.
As explained in the previous section, an NFA is a DFA that can have missing
transitions, multiple transitions coming out of a state for the same input symbol,
and transitions labeled ε. The first two features can be captured by having the
transition function return a possibly empty set of states: δ(q, a) will be the set
of options available to the NFA from state q when reading symbol a. Transitions
labeled ε can be described by extending the transition function: δ(q, ε) will be
the set of states that can be reached from state q by ε transitions.

Definition 3.1 A nondeterministic finite automaton (NFA) is a 5-tuple
(Q, Σ, δ, q0, F) where

1. Q is a finite set of states.

2. Σ is an alphabet called the input alphabet.

3. δ : Q × (Σ ∪ {ε}) → P(Q) is the transition function.

4. q0 ∈ Q is the starting state.

5. F ⊆ Q is the set of accepting states.
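Definition 3.1 translates almost directly into code. The sketch below is our own illustration, not part of the notes: an NFA is a Python 5-tuple, with δ stored as a dictionary from (state, symbol) pairs to sets of states. Missing entries stand for the empty set, and the empty string plays the role of ε. The NFA encoded at the bottom is the one of Figure 3.6.

```python
# A sketch (not from the notes) of Definition 3.1 in Python: an NFA is a
# 5-tuple (Q, Sigma, delta, q0, F).  The transition function delta maps a
# (state, symbol) pair to a *set* of states; missing entries stand for the
# empty set, and the symbol "" plays the role of epsilon.

def make_nfa(Q, Sigma, delta, q0, F):
    """Package the five components, checking they fit Definition 3.1."""
    assert q0 in Q and F <= Q
    for (q, a), targets in delta.items():
        assert q in Q and (a in Sigma or a == "") and targets <= Q
    return (Q, Sigma, delta, q0, F)

def step(delta, q, a):
    """delta(q, a): the set of options from state q on symbol a (or epsilon)."""
    return delta.get((q, a), set())

# The NFA of Figure 3.6 (strings containing the substring 001).
N = make_nfa(
    Q={"q0", "q1", "q2", "q3"},
    Sigma={"0", "1"},
    delta={
        ("q0", "0"): {"q0", "q1"},
        ("q0", "1"): {"q0"},
        ("q1", "0"): {"q2"},
        ("q2", "1"): {"q3"},
        ("q3", "0"): {"q3"},
        ("q3", "1"): {"q3"},
    },
    q0="q0",
    F={"q3"},
)
```

Storing δ as a partial dictionary mirrors the fact that an NFA may have missing transitions: looking up an absent entry simply yields the empty set of options.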

Figure 3.6: An NFA for the language of strings that contain the substring 001

        0           1        ε
q0      {q0, q1}    {q0}     ∅
q1      {q2}        ∅        ∅
q2      ∅           {q3}     ∅
q3      {q3}        {q3}     ∅

Figure 3.7: The transition function of the NFA of Figure 3.6.


Example 3.2 Consider the NFA shown in Figure 3.6. Here is the formal description of this NFA: the NFA is (Q, {0, 1}, , q0 , F ) where
Q = {q0 , q1 , q2 , q3 }
F = {q3 }
and is defined by the table shown in Figure 3.7.
Example 3.3 Consider the NFA shown in Figure 3.8. This NFA is
(Q, {0, 1}, δ, q0, F) where

Q = {q0, q10, q11, q20, q21, q22}
F = {q10, q20}

Figure 3.8: An NFA for the language of strings that contain a number of 1s that's either a multiple of 2 or a multiple of 3

        0      1      ε
q0      -      -      q10, q20
q10     q10    q11    -
q11     q11    q10    -
q20     q20    q21    -
q21     q21    q22    -
q22     q22    q20    -

Figure 3.9: The transition function of the NFA of Figure 3.8.


and δ is defined by the table shown in Figure 3.9. Note that in this table, we
omitted the braces and used a dash (-) instead of the empty set symbol (∅). We
will often do that to avoid cluttering these tables. □
We now define what it means for an NFA to accept an input string. We first
do this for NFAs that don't contain any ε transitions.
In the case of DFAs (see Definition 2.5), we were able to talk about the
sequence of states that the DFA goes through while reading the input string. That
was because that sequence was unique. But in the case of NFAs, there may
be multiple sequences of states for each input string, depending on how many
options the NFA has.
For example, consider the NFA of Figure 3.6. While reading the string w =
0001, the NFA could go through either of the following two sequences of states:

q0 --0--> q0 --0--> q0 --0--> q0 --1--> q0
q0 --0--> q0 --0--> q1 --0--> q2 --1--> q3

We consider that the NFA accepts because one of these sequences, the second
one, leads to an accepting state.
Definition 3.4 Let N = (Q, Σ, δ, q0, F) be an NFA without ε transitions and let
w = w1 · · · wn be a string of length n over Σ. Then N accepts w if and only if there
is a sequence of states r0, r1, . . . , rn such that

r0 = q0                                     (3.1)
ri ∈ δ(ri−1, wi),  for i = 1, . . . , n      (3.2)
rn ∈ F.                                     (3.3)

Equations 3.1 and 3.2 assert that the sequence of states r0 , r1 , . . . , rn is one
of the possible sequences of states that the NFA may go through while reading
w. Equation 3.3 says that this sequence of states leads to an accepting state.


Figure 3.10: Another NFA for the language of strings that contain the substring 001
We now define acceptance for NFAs that may contain ε transitions. This
is a little trickier. Figure 3.10 shows another NFA for the language of strings
that contain the substring 001. Consider again the string w = 0001. The NFA
accepts this string because it could go through the following sequence of states
as it reads w:

q0 --0--> q1 --ε--> q0 --0--> q1 --0--> q2 --1--> q3

For the NFA to go through this sequence of states, the NFA needs to essentially
insert an ε between the first and second symbols of w. That is, it needs to view
w as 0ε001.
Another possible sequence of states is

q0 --0--> q1 --0--> q2 --ε--> q1 --0--> q2 --1--> q3

In this case, the NFA inserts an ε between the second and third symbols of w.
That is, it views w as 00ε01.
A definition of acceptance for NFAs with ε transitions can be based on this
idea of inserting ε into the input string. Note that inserting ε into a string does
not change its value; that is, as strings, we have that 0001 = 0ε001 =
00ε01. (In fact, for every string x, xε = εx = x. We say that ε is a neutral
element with respect to concatenation of strings.)
Definition 3.5 Let N = (Q, Σ, δ, q0, F) be an NFA and let w = w1 · · · wn be a string
of length n over Σ. Then N accepts w if and only if w can be written as y1 · · · ym,
with yi ∈ Σ ∪ {ε} for every i = 1, . . . , m, and there is a sequence of states
r0, r1, . . . , rm such that

r0 = q0                                     (3.4)
ri ∈ δ(ri−1, yi),  for i = 1, . . . , m      (3.5)
rm ∈ F.                                     (3.6)

Note that in this definition, m could be greater than n. In fact, m − n equals
the number of ε transitions that the NFA needs to use to go through the sequence
of states r0, . . . , rm as it reads w. Once again, Equations 3.4 and 3.5 assert that
this sequence of states is one of the possible sequences of states the NFA may
go through while reading w, and Equation 3.6 says that this sequence of states
leads to an accepting state.
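Definition 3.5 can also be checked by search. A configuration is a pair (state, number of input symbols read); an ε transition changes the state without advancing, and a visited set keeps ε cycles from looping forever. This is our own sketch, not code from the notes; ε is written as the empty string, and the example NFA at the bottom is our reading of Figure 3.10.

```python
# A sketch of Definition 3.5: with epsilon transitions, N accepts w iff w can
# be written as y1 ... ym (each yi a symbol or epsilon) with a matching state
# sequence.  We search over (state, position) configurations; an epsilon move
# changes the state without advancing the position, and a visited set guards
# against cycles of epsilon transitions.

def accepts(delta, q0, F, w):
    seen = set()
    def search(q, i):
        if (q, i) in seen:           # already explored: epsilon cycles stop here
            return False
        seen.add((q, i))
        if i == len(w) and q in F:   # condition (3.6): all of w read, accepting
            return True
        # yi = epsilon: change state without reading input (m can exceed n)
        if any(search(r, i) for r in delta.get((q, ""), set())):
            return True
        # yi = w[i]: read the next input symbol
        if i < len(w) and any(search(r, i + 1)
                              for r in delta.get((q, w[i]), set())):
            return True
        return False
    return search(q0, 0)

# Our reading of the NFA of Figure 3.10: q0 loops on 1, the chain q0-q1-q2-q3
# spells out 001, epsilon transitions fall back from q1 to q0 and from q2 to
# q1, and q3 loops on 0 and 1.
delta = {("q0", "1"): {"q0"}, ("q0", "0"): {"q1"},
         ("q1", "0"): {"q2"}, ("q1", ""): {"q0"},
         ("q2", "1"): {"q3"}, ("q2", ""): {"q1"},
         ("q3", "0"): {"q3"}, ("q3", "1"): {"q3"}}
```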

Exercises
3.2.1. Give a formal description of the NFA of Figure 3.5.
3.2.2. Give a formal description of the NFA of Figure 3.10.
3.2.3. Consider the NFA of Figure 3.6. There are three possible sequences of
states that this NFA could go through while reading the string 001001.
What are they? Does the NFA accept this string?
3.2.4. Consider the NFA of Figure 3.10. There are seven possible sequences of
states that this NFA could go through while reading the string 001001.
What are they? Indicate where εs need to be inserted into the string.
Does the NFA accept this string?

Figure 3.11: An NFA for the language of strings that end in 1

3.3 Equivalence with DFAs

Earlier in this chapter, we mentioned that there is a simple algorithm that can
convert any NFA into an equivalent DFA, that is, one that recognizes the same
language. In this section, we are going to learn what this algorithm is. Since
DFAs are just a special case of NFAs, this will show that DFAs and NFAs recognize
the same class of languages.
Consider the NFA of Figure 3.11. This NFA recognizes the language of strings
that end in 1. It's not the simplest possible NFA for this language, but if an
algorithm is going to be able to convert NFAs into DFAs, it has to be able to
handle any NFA.
Now consider the input string w = 1101101. The diagram of Figure 3.12
shows all the possible ways in which the NFA could process this string. This
diagram is called the computation tree of the NFA on string w. The nodes in this
tree are labeled by states. The root of the tree (shown at the top) represents
the beginning of the computation. Edges show possible moves that the NFA can
make. For example, from the start state, the NFA has two options when reading
symbol 1: stay in state q0 or go to state q1.
Some nodes in the computation tree are dead ends. For example, the node
labeled q1 at level 1 is a dead end because there is no transition coming out
of state q1. (Note that the top level of a tree is considered to be level 0.)

Figure 3.12: The computation tree of the NFA of Figure 3.11 on input string 1101101

The nodes at each level of the computation tree show all the possible states
that the NFA could be in after reading a certain portion of the input string. For
example, level 4 has nodes labeled q0 , q1 , q0 , q1 . These are the states that the
NFA could be in after reading 1101.
The nodes at the bottom level show the states that the NFA could be in after
reading the entire string. We can see that this NFA accepts this input string
because the bottom level contains the accepting state q1 .
Computation trees help us visualize all the possible ways in which an NFA can
process a particular input string. But they also suggest a way of simulating NFAs:
as we read the input string, we can move down the computation tree, figuring
out what nodes occur at each of its levels. And we don't need to remember
the entire tree, only the nodes at the current level, the one that corresponds to
the last input symbol that was read.2 For example, in the computation tree of
Figure 3.12, after reading the string 1101, we would have figured out that the
current level of the computation tree consists of states q0, q1, q0, q1.
How would a DFA carry out this simulation? In the only other simulation we
have seen so far, the pair construction of Theorem 2.21, the DFA M simulates
the DFAs M1 and M2 by keeping track of the state in which each of these DFAs
would be. Each of the states of M is a pair that combines a state of M1 with a
state of M2.
In our case, we would have each state of the DFA that simulates an NFA be
the sequence of states that appear at the current level of the computation tree
of the NFA on the input string. However, as the computation tree of Figure 3.12
clearly indicates, the number of nodes at each level can grow. In general, if an
NFA has two options at each move, which is certainly possible, then the bottom
level of the tree would contain 2^n nodes, where n is the length of the input string.
This implies that there may be an infinite number of possible sequences of states
that can appear at any level of the computation tree. Since the DFA has a finite
number of states, it can't have a state for each possible sequence.

2 Some of you may recognize that this is essentially a breadth-first search. At
Clarkson, this algorithm and other important graph algorithms are normally
covered in the course CS344 Algorithms and Data Structures.


A solution to this problem comes from noticing that most levels of the tree
of Figure 3.12 contain a lot of repetition. And we only want to determine if
an accepting state occurs at the bottom level of the tree, not how many times
it occurs, or how many different ways it can be reached from the start state.
Therefore, as it reads the input string, the DFA only needs to remember the set
of states that occur at the current level of the tree (without repetition). This
corresponds to pruning the computation tree by eliminating duplicate subtrees,
as shown in Figure 3.13. The pruning locations are indicated by dots.
Another way of looking at this is to say that as it reads the input string, the
DFA simulates the NFA by keeping track of the set of states that the NFA could
currently be in. If the NFA has k states, then there are 2^k different sets of states.
Since k is a constant, 2^k is also a constant. This implies that the DFA can have a
state for each set of states of the NFA.
Here are the details of the simulation, first for the case of NFAs without ε
transitions.

Theorem 3.6 Every NFA without ε transitions has an equivalent DFA.
Proof Suppose that N = (Q, Σ, δ, q0, F) is an NFA without ε transitions. As
explained above, we construct a DFA M that simulates N by keeping track of the
set of states that N could currently be in. More precisely, each state of M will be
a set of states of N. If after reading a string w, R is the set of states that N could
be in, then M will be in state R.
We need to specify the start state, the accepting states and the transition
function of M. Initially, N can only be in its start state so the start state of M
will be {q0}.
NFA N accepts a string w if and only if the set of states it could be in after
reading w includes at least one accepting state. Therefore, the accepting states
of M will be those sets R that contain at least one state from F.
If R is the set of states that N can currently be in, then after reading one
more symbol a, N could be in any state that can be reached from a state in R by

a transition labeled a. Therefore, from state R, M will have a transition labeled
a going to state

{q ∈ Q | q ∈ δ(r, a), for some r ∈ R}.

We can summarize this more concisely as follows: M = (Q', Σ, δ', q0', F')
where

Q' = P(Q)
q0' = {q0}
F' = {R ⊆ Q | R ∩ F ≠ ∅}

and δ' is defined as follows:

δ'(R, a) = ∪_{r ∈ R} δ(r, a),   for R ⊆ Q and a ∈ Σ.

From the above description of M, it should be clear that L(M) = L(N). □

Figure 3.13: The computation tree of Figure 3.12 with duplicate subtrees removed

Example 3.7 Let's construct a DFA that simulates the NFA of Figure 3.11. We
get M = (Q', {0, 1}, δ', q0', F') where

Q' = {∅, {q0}, {q1}, {q0, q1}}
q0' = {q0}
F' = {{q1}, {q0, q1}}

and δ' is defined by the table shown in Figure 3.14.
The entries in this table were computed as follows. First, δ'(∅, a) = ∅ because
an empty union gives an empty set. (This is consistent with the fact that no states
can be reached from any of the states in an empty set.) Then δ'({q0}, a) =
δ(q0, a) and similarly for {q1}. Finally,

δ'({q0, q1}, a) = δ(q0, a) ∪ δ(q1, a) = δ'({q0}, a) ∪ δ'({q1}, a)

so that the δ'({q0, q1}, a) values can be computed from the others.

            0        1
∅           ∅        ∅
{q0}        {q0}     {q0, q1}
{q1}        ∅        ∅
{q0, q1}    {q0}     {q0, q1}

Figure 3.14: The transition function of the DFA that simulates the NFA of Figure 3.11

Figure 3.15: The DFA that simulates the NFA of Figure 3.11

Figure 3.16: A simplified DFA that simulates the NFA of Figure 3.11
Figure 3.15 shows the transition diagram of the DFA. Note that state {q1}
cannot be reached from the start state. This means that it can be removed from
the DFA. And if state {q1} is removed, then state ∅ also becomes unreachable
from the start state. So it can be removed too. This leaves us with the DFA of
Figure 3.16. Note that apart from the names of the states, this DFA is identical
to the DFA that we directly designed in the previous chapter (see Figure 2.12). □
We now tackle the case of NFAs with ε transitions. The DFA now needs to
account for the fact that the NFA may use any number of ε transitions between any
two input symbols, as well as before the first symbol and after the last symbol.
The following concept will be useful.

Definition 3.8 The extension of a set of states R, denoted E(R), is the set of states
that can be reached from any state in R by using any number of ε transitions (none
included).3
Example 3.9 Figure 3.17 shows an NFA for the language of strings that contain
the substring 001. In this NFA, we have the following:
Figure 3.17: An NFA for the language of strings that contain the substring 001

E({q0}) = {q0}
E({q1}) = {q0, q1}
E({q2}) = {q0, q1, q2}
E({q2, q3}) = {q0, q1, q2, q3}. □

3 This can be defined more formally as follows: E(R) is the set of states q for
which there is a sequence of states r0, . . . , rk, for some k ≥ 0, such that

r0 ∈ R
ri ∈ δ(ri−1, ε),   for i = 1, . . . , k
rk = q.

Theorem 3.10 Every NFA has an equivalent DFA.

Proof Suppose that N = (Q, Σ, δ, q0, F) is an NFA. We construct a DFA M that
simulates N pretty much as in the proof of the previous theorem. In particular,
if R is the set of states that N could be in after reading a string w and using any
number of ε transitions, then M will be in state R.
One difference from the previous theorem is that M will not start in state {q0}
but in state E({q0}). This will account for the fact that N can use any number of
ε transitions even before reading the first input symbol.


The other difference is that from state R, M will have a transition labeled a
going to state

{q ∈ Q | q ∈ E(δ(r, a)), for some r ∈ R}.

This will account for the fact that N can use any number of ε transitions after
reading each input symbol (including the last one).
This is what we get: M = (Q', Σ, δ', q0', F') where

Q' = P(Q)
q0' = E({q0})
F' = {R ⊆ Q | R ∩ F ≠ ∅}

and δ' is defined as follows:

δ'(R, a) = ∪_{r ∈ R} E(δ(r, a)),   for R ⊆ Q and a ∈ Σ.

It should be clear that L(M) = L(N). □

Example 3.11 Let's construct a DFA that simulates the NFA of Figure 3.17. The
easiest way to do this by hand is usually to start with the transition table and
proceed as follows:

1. Compute δ'({r}, a) = E(δ(r, a)) for the individual states r of the NFA.
Those values are shown in the top half of Figure 3.18.

2. Identify the start state. In this case it's E({q0}) = {q0}.

3. Add states as needed (to the bottom half of the table) until no new states
are introduced. The value of the transition function for these states is
computed from the values in the top half of the table by taking advantage
of the following fact:

δ'(R, a) = ∪_{r ∈ R} E(δ(r, a)) = ∪_{r ∈ R} δ'({r}, a).

                 0                1
q0               q0, q1           q0
q1               q0, q1, q2       -
q2               -                q3
q3               q3               q3

q0               q0, q1           q0
q0, q1           q0, q1, q2       q0
q0, q1, q2       q0, q1, q2       q0, q3
q0, q3           q0, q1, q3       q0, q3
q0, q1, q3       q0, q1, q2, q3   q0, q3
q0, q1, q2, q3   q0, q1, q2, q3   q0, q3

Figure 3.18: The transition function of the DFA that simulates the NFA of Figure 3.17

Figure 3.19: A DFA that simulates the NFA of Figure 3.17

Note that in this table, we omitted the braces and used a dash (-) instead of
the empty set symbol (∅).
All that we have left to do now is to identify the accepting states of the DFA.
In this case, they're the sets of states that contain q3, the accepting state of the
NFA.
Figure 3.19 shows the transition diagram of the DFA. Note that the three
accepting states can be merged since all the transitions coming out of these states
stay within this group of states. If we merged these states, then this DFA would
be identical to the DFA we designed in the previous chapter (see Figure 2.14). □


Example 3.12 Let's construct a DFA that simulates the NFA of Figure 3.8.
Figure 3.20 shows the table of the transition function of the DFA. The start state is
E({q0}) = {q0, q10, q20}, which is why this state is the first one that appears in
the bottom half of the table. The accepting states are all the states that contain
either q10 or q20, the two accepting states of the NFA.
Note that states {q0, q10, q20} and {q10, q20} could be merged since they are
both accepting and their transitions lead to exactly the same states. If we did
that, then we would find that this DFA is identical to the DFA we designed in the
previous chapter (see Figure 2.22).
By the way, since the NFA has 6 states, in principle, the DFA has 2^6 = 64
states. But as the transition table indicates, there are only 7 states that are
reachable from the start state. □
In this section, we used the technique described in the proofs of Theorems 3.6
and 3.10 to convert three NFAs into equivalent DFAs. Does that mean that these
proofs are constructive? Yes, but with one caveat: in the proof of Theorem 3.10,
we didn't specify how to compute the extension of a set of states. This is
essentially what is called a graph reachability problem. We will learn a graph
reachability algorithm later in these notes.
Another observation. A language is regular if it is recognized by some DFA.
This is the definition. But now we know that every NFA can be simulated by a
DFA. We also know that a DFA is a special case of an NFA. Therefore, we get the
following alternate characterization of regular languages:

Corollary 3.13 A language is regular if and only if it is recognized by some NFA.
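The missing piece, computing E(R), is plain graph reachability over the ε edges, and any graph search will do. Here is a breadth-first sketch of our own, checked against the values listed in Example 3.9 for the NFA of Figure 3.17.

```python
from collections import deque

def extension(eps_edges, R):
    """E(R): breadth-first search over the graph whose edges are the epsilon
    transitions.  eps_edges maps a state to its set of epsilon successors."""
    E, queue = set(R), deque(R)
    while queue:
        q = queue.popleft()
        for r in eps_edges.get(q, set()):
            if r not in E:
                E.add(r)       # newly reached by epsilon transitions
                queue.append(r)
    return E

# Epsilon transitions of the NFA of Figure 3.17 (as we read it):
eps = {"q1": {"q0"}, "q2": {"q1"}}
```

Since every state and ε edge is examined at most once, this runs in time linear in the size of the NFA.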

Study Questions
3.3.1. In an NFA, what is the extension of a state?


                 0           1
q0               -           -
q10              q10         q11
q11              q11         q10
q20              q20         q21
q21              q21         q22
q22              q22         q20

q0, q10, q20     q10, q20    q11, q21
q11, q21         q11, q21    q10, q22
q10, q22         q10, q22    q11, q20
q11, q20         q11, q20    q10, q21
q10, q21         q10, q21    q11, q22
q11, q22         q11, q22    q10, q20
q10, q20         q10, q20    q11, q21

Figure 3.20: The transition function of the DFA that simulates the NFA of Figure 3.8


Exercises
3.3.2. Draw the computation tree of the NFA of Figure 3.6 on the input string
001001. Prune as needed.
3.3.3. By using the algorithm of this section, convert into DFAs the NFAs of
Figures 3.21 and 3.22.
3.3.4. By using the algorithm of this section, convert into DFAs the NFAs of
Figure 3.23.

3.4 Closure Properties

In Section 2.5, we proved that the class of regular languages is closed under
complementation, union and intersection. We also discussed closure under concatenation, realized that this was more difficult to establish, and announced that
we would learn the necessary tools in this chapter.
Well, that tool is the NFA. We saw a hint of that in the NFA of Figure 3.3. That
NFA recognized the union of two languages and was designed as a combination
of DFAs for those two languages. Now that we know that every NFA can be
simulated by a DFA, this gives us another way of showing that the class of regular
languages is closed under union.
Theorem 3.14 The class of regular languages is closed under union.

Proof Suppose that A1 and A2 are regular and let M1 and M2 be DFAs for these
languages. We construct an NFA N that recognizes A1 ∪ A2.
The idea is illustrated in Figure 3.24. We add to M1 and M2 a new start state
that we connect to the old start states with ε transitions. This gives the NFA the
option of processing the input string by using either M1 or M2. If w ∈ A1 ∪ A2,
then N will choose the appropriate DFA and accept. And if N accepts, it must
be that one of the DFAs accepts w. Therefore, L(N) = A1 ∪ A2.


Figure 3.21: NFAs for Exercise 3.3.3 (part 1 of 2)


Figure 3.22: NFAs for Exercise 3.3.3 (part 2 of 2)


Figure 3.23: NFAs for Exercise 3.3.4

Figure 3.24: An NFA for the union of two regular languages


The NFA can be described more precisely as follows. Suppose that Mi =
(Qi, Σ, δi, qi, Fi), for i = 1, 2. Without loss of generality, assume that Q1 and Q2
are disjoint. (Otherwise, rename the states so the two sets become disjoint.) Let
q0 be a state not in Q1 ∪ Q2. Then N = (Q, Σ, δ, q0, F) where

Q = Q1 ∪ Q2 ∪ {q0}
F = F1 ∪ F2

and δ is defined as follows:

δ(q, ε) = {q1, q2}   if q = q0
          ∅          otherwise

δ(q, a) = {δi(q, a)}   if q ∈ Qi and a ∈ Σ
          ∅            if q = q0 and a ∈ Σ.  □

In the previous chapter, we proved closure under union by using the pair
construction. And we observed that the construction could be easily modified
to prove closure under intersection. We can't do this here: there is no known
simple way to adapt the above construction for the case of intersection. But we
can still easily prove closure under intersection by using De Morgan's Law.

Corollary 3.15 The class of regular languages is closed under intersection.

Proof Suppose that A1 and A2 are two languages. Then, by De Morgan's Law,

A1 ∩ A2 = ¬(¬A1 ∪ ¬A2),

where ¬A denotes the complement of A. If A1 and A2 are regular, then ¬A1 and
¬A2 are regular, ¬A1 ∪ ¬A2 is regular, and then so is ¬(¬A1 ∪ ¬A2). This
implies that A1 ∩ A2 is regular. □

74

CHAPTER 3. NONDETERMINISTIC FINITE AUTOMATA


M1

M2
"

N
"

Figure 3.25: An NFA for the concatenation of two regular languages


Let's now turn to concatenation. Figure 2.24 illustrates one possible idea:
connect DFAs M1 and M2 in series by adding transitions so that the accepting
states of M1 act as the start state of M2. By using ε transitions, we can simplify
this a bit.

Theorem 3.16 The class of regular languages is closed under concatenation.

Proof Suppose that A1 and A2 are regular and let M1 and M2 be DFAs for these
languages. We construct an NFA N that recognizes A1A2.
The idea is illustrated in Figure 3.25. We add ε transitions from the accepting
states of M1 to the start state of M2. This gives N the option of switching to
M2 every time it enters one of the accepting states of M1. We also make the
accepting states of M1 non-accepting.
Let's make sure this really works. Suppose that w ∈ A1A2. That is, w = xy
with x ∈ A1 and y ∈ A2. Then after reading x, N will be in one of the accepting
states of M1. From there, it can use one of the new ε transitions to move to the
start state of M2. The string y will then take N to one of the accepting states of M2,
causing N to accept w.
Conversely, if w is accepted by N, it must be that N uses one of the new
ε transitions. This means that w = xy with x ∈ A1 and y ∈ A2. Therefore,
L(N) = A1A2.

The formal description of N is left as an exercise. □

We end this section by considering another operation on languages. If A is a
language over Σ, then the star of A is the language

A* = {x1 · · · xk | k ≥ 0 and each xi ∈ A}.

That is, A* consists of those strings we can form, in all possible ways, by taking
any number of strings from A and concatenating them together.4 Note that ε
is always in the star of a language because, by convention, x1 · · · xk = ε when
k = 0.
For example, {0}* is the language of all strings that contain only 0s:
{ε, 0, 00, 000, . . .}. And {0, 1}* is the language of all strings that can be formed
with 0s and 1s: {ε, 0, 1, 00, 01, 10, 11, 000, . . .}. Note that this definition of
the star operation is consistent with our use of Σ* to denote the set of all strings
over the alphabet Σ.
Theorem 3.17 The class of regular languages is closed under the star operation.

Proof Suppose that A is regular and let M be a DFA that recognizes this language. We construct an NFA N that recognizes A*.

The idea is illustrated in Figure 3.26. From each of the accepting states of M, we add ε transitions that loop back to the start state.

We also need to ensure that ε is accepted by N. One idea is to simply make the start state of M an accepting state. But this doesn't work, as one of the exercises asks you to demonstrate. A better idea is to add a new start state with an ε transition to the old start state and make the new start state an accepting state.

If w = x1 · · · xk with each xi ∈ A, then N can accept w by going through M k times, each time reading one xi and then returning to the start state of M by using one of the new ε transitions (except after xk).
4 The language A* is also sometimes called the Kleene closure of A.

Figure 3.26: An NFA for the star of a regular language


Conversely, if w is accepted by N, then it must be that either w = ε or that N uses the new ε looping-back transitions k times, for some number k ≥ 0, breaking w up into x1 · · · xk+1, with each xi ∈ A. In either case, this implies that w ∈ A*. Therefore, L(N) = A*.
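The same kind of informal check works for the star construction. The sketch below follows the proof: a new accepting start state with an ε transition to the old start state, plus ε transitions from the accepting states back to the start state. The NFA representation and the use of None as the ε label are this sketch's own conventions.

```python
# A hedged sketch of the star construction from the proof of Theorem 3.17.
# An NFA is a tuple (states, delta, start, accepts); None is the epsilon
# label. These conventions are this sketch's, not the notes'.

def star_nfa(m):
    states, delta, start, accepting = m
    new_start = 'new_start'  # assumed not already a state of M
    delta = {k: set(v) for k, v in delta.items()}
    # Epsilon transition from the new accepting start state to M's start,
    delta[(new_start, None)] = {start}
    # and epsilon transitions from M's accepting states back to M's start.
    for q in accepting:
        delta.setdefault((q, None), set()).add(start)
    return (states | {new_start}, delta, new_start, accepting | {new_start})

def accepts(nfa, w):
    """Standard NFA simulation with epsilon closure."""
    states, delta, start, accepting = nfa
    def closure(S):
        stack, seen = list(S), set(S)
        while stack:
            q = stack.pop()
            for r in delta.get((q, None), set()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen
    current = closure({start})
    for a in w:
        current = closure({r for q in current for r in delta.get((q, a), set())})
    return bool(current & accepting)
```

If M recognizes {01}, the resulting N should accept ε, 01, 0101, and so on, and nothing else.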
The formal description of N is left as an exercise. □

Study Questions
3.4.1. What is the star of a language?

Exercises
3.4.2. Give a formal description of the NFA of the proof of Theorem 3.16.
3.4.3. Give a formal description of the NFA of the proof of Theorem 3.17.
3.4.4. The proof of Theorem 3.17 mentions the idea illustrated in Figure 3.27. From each of the accepting states of M, we add ε transitions that loop back to the start state. In addition, we make the start state of M accepting to ensure that ε is accepted.

Figure 3.27: A bad idea for an NFA for the star of a regular language
a) Explain where the proof of the theorem would break down if this idea
was used.
b) Show that this idea cannot work by providing an example of a DFA M for which the NFA N of Figure 3.27 would not recognize L(M)*.
3.4.5. For every language A over alphabet Σ, let

A+ = {x1 · · · xk | k ≥ 1 and each xi ∈ A}.

Show that the class of regular languages is closed under the plus operation.
3.4.6. Show that a language is regular if and only if it can be recognized by
some NFA with at most one accepting state.
3.4.7. If w = w1 · · · wn is a string of length n, let wR denote the reverse of w, that is, the string wn · · · w1. The reverse LR of a language L is defined as the language of strings wR where w ∈ L. Show that the class of regular languages is closed under reversal, that is, show that if L is regular, then LR is also regular.


Chapter 4
Regular Expressions
In the previous two chapters, we studied two types of machines that recognize
languages. In this chapter, we approach languages from a different angle: instead of recognizing them, we will describe them. We will learn that regular
expressions allow us to describe languages precisely and, often, concisely. We
will also learn that regular expressions are powerful enough to describe any regular language, that regular expressions can be easily converted to DFAs, and that
it is often easier to write a regular expression than to directly design a DFA or
an NFA. This implies that regular expressions are useful not only for describing
regular languages but also as a tool for obtaining DFAs for these languages.

4.1 Introduction

Many of you are probably already familiar with regular expressions. For example, in the Unix or Linux operating systems, when working at the prompt (on a console or terminal), we can list all the PDF files in the current directory (folder) by using the command ls *.pdf. The string ls is the name of the command. (It's short for list.) The string *.pdf is a regular expression. In Unix regular expressions, the star (*) represents any string. So this command is asking for a list of all the files whose name consists of any string followed by the characters .pdf.

Another example is rm project1.*. This removes all the files associated with Project 1, that is, all the files that have the name project1 followed by any extension.
Regular expressions are convenient. They allow us to precisely and concisely
describe many interesting patterns in strings. But note that a regular expression
corresponds to a set of strings, those strings that possess the pattern. Therefore,
regular expressions describe languages.
In the next section, we will define precisely what we mean by a regular expression. Our regular expressions will be a little different from Unix regular expressions. In the meantime, here are two examples that illustrate both the style of regular expressions we will use as well as the usefulness of regular expressions.

Example 4.1 Consider the language of valid C++ identifiers. Recall that these are strings that begin with an underscore or a letter followed by any number of underscores, letters and digits. Figure 2.6 shows a DFA for this language. That DFA was not very complicated but a regular expression is even simpler. Let D stand for any digit and L for any letter. Then the valid C++ identifiers can be described as follows:

(_ ∪ L)(_ ∪ L ∪ D)*.

This expression says that an identifier is an underscore or a letter followed by any number of underscores, letters and digits. Note how the star is used not to represent any string but as an operator that essentially means "any number of". We can make the regular expression more precise by defining what we mean by the symbols L and D. Here too, regular expressions can be used:

L = a ∪ b ∪ c ∪ · · · ∪ Z
D = 0 ∪ 1 ∪ 2 ∪ · · · ∪ 9
□
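For readers who want to experiment, this expression translates almost directly into Python's re notation, where a character class such as [_a-zA-Z] abbreviates the union _ ∪ L. The translation below is an illustrative sketch, not part of the notes.

```python
import re

# The regular expression (_ ∪ L)(_ ∪ L ∪ D)* written in Python's re
# syntax, where a character class [...] abbreviates a union of symbols
# and * is the star operator.
identifier = re.compile(r'[_a-zA-Z][_a-zA-Z0-9]*')

def is_identifier(s):
    # fullmatch asks whether the whole string is in the language,
    # not merely whether some substring matches.
    return identifier.fullmatch(s) is not None
```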

Example 4.2 Consider the language of correctly formatted phone numbers. As before, what we mean are strings that consist of 7 digits, or 3 digits followed by a dash and 4 digits. Figure 2.7 shows a DFA for this language. A regular expression is simpler:

D7 ∪ D3-D4.

An integer i used as an exponent essentially means "i times".

Note how much more concise the regular expression is compared to the description in words. That description used 16 words. In addition, the regular expression allows us to better visualize the strings in the language. □
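Here too the expression translates directly into Python's re notation, with \d standing for D and {k} playing the role of the exponent k; again, this is an illustrative sketch rather than notation from the notes.

```python
import re

# The expression D7 ∪ D3-D4 in Python's re syntax: \d plays the role of
# D, {k} plays the role of the exponent k, and | is the union.
phone = re.compile(r'\d{7}|\d{3}-\d{4}')

def is_phone(s):
    return phone.fullmatch(s) is not None
```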

Study Questions
4.1.1. What are three frequent advantages of regular expressions over descriptions in words (in a natural language such as English)?
4.1.2. What is a potential advantage of regular expressions over DFAs?
4.1.3. What do a star and an integer exponent mean in a regular expression?
4.1.4. What does the union operator (∪) mean in a regular expression?

Exercises
4.1.5. Give regular expressions for the languages of the first four exercises of Section 2.2. (Note that ε can be used in a regular expression.)

4.2 Formal Definition

In this section, we formally define what a regular expression is and what a regular expression means, that is, the language that a regular expression describes.
Definition 4.3 We say that R is a regular expression over alphabet Σ if R is of one of the following forms:

1. R = a, where a ∈ Σ.

2. R = ε.

3. R = ∅.

4. R = (R1 ∪ R2), where each Ri is a regular expression over Σ.

5. R = (R1 ∘ R2), where each Ri is a regular expression over Σ.

6. R = (R1*), where R1 is a regular expression over Σ.

Note that to be precise, these are fully parenthesized regular expressions. Parentheses can be omitted by using the following order of precedence for the operations: *, then ∘, then ∪. The concatenation operator is usually omitted: R1R2.
Definition 4.4 Suppose that R is a regular expression over Σ. The language described by R (or the language of R) is defined as follows:

1. L(a) = {a}, if a ∈ Σ.

2. L(ε) = {ε}.

3. L(∅) = ∅.

4. L(R1 ∪ R2) = L(R1) ∪ L(R2).

5. L(R1 ∘ R2) = L(R1)L(R2).

6. L(R1*) = L(R1)*.

These are the basic regular expressions we will use in these notes. When convenient, we will augment them with the following abbreviations. If Σ = {a1, . . . , ak} is an alphabet, then Σ denotes the regular expression a1 ∪ · · · ∪ ak. If R is a regular expression, then R+ denotes the regular expression RR*, so that

L(R+) = L(R)L(R)*
      = {x1 · · · xk | k ≥ 1 and each xi ∈ L(R)}
      = L(R)+.

If R is a regular expression and k is a positive integer, then Rk denotes R concatenated with itself k times.
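Definition 4.4 can be read as a recursive membership test. The following Python sketch mirrors the six cases directly; the tuple encoding of expressions is an assumption of this sketch, and the algorithm is exponential in the length of w, so it is only meant as an executable restatement of the definition.

```python
# A hedged sketch that mirrors Definition 4.4: a regular expression is a
# small tree, and matches(r, w) decides whether w ∈ L(r) by following the
# six cases. The tuple encoding is this sketch's own convention.

def matches(r, w):
    kind = r[0]
    if kind == 'sym':                       # L(a) = {a}
        return w == r[1]
    if kind == 'eps':                       # L(ε) = {ε}
        return w == ''
    if kind == 'empty':                     # L(∅) = ∅
        return False
    if kind == 'union':                     # L(R1 ∪ R2) = L(R1) ∪ L(R2)
        return matches(r[1], w) or matches(r[2], w)
    if kind == 'concat':                    # L(R1 R2) = L(R1)L(R2)
        return any(matches(r[1], w[:i]) and matches(r[2], w[i:])
                   for i in range(len(w) + 1))
    if kind == 'star':                      # L(R1*) = L(R1)*
        if w == '':
            return True
        # Peel off a nonempty prefix in L(R1), then recurse on the rest.
        return any(matches(r[1], w[:i]) and matches(r, w[i:])
                   for i in range(1, len(w) + 1))
    raise ValueError(kind)
```

For instance, the expression 0*1 is encoded as ('concat', ('star', ('sym', '0')), ('sym', '1')).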

4.3 More Examples

In this section, we give several additional examples of regular expressions. In fact, we give regular expressions for almost all the languages of the examples of Section 2.4. The regular expressions will not necessarily be much simpler than DFAs or NFAs for these languages, but they will be more concise.
Example 4.5 In all of these examples, the alphabet is Σ = {0, 1}.

1. The regular expression Σ* describes the language of all strings over Σ. In other words, L(Σ*) = Σ*.

2. The language of all strings that begin with 1 is described by the regular expression 1Σ*.

3. The language of all strings that end in 1 is described by Σ*1.

4. The language of strings of length at least two that begin and end with the same symbol is described by 0Σ*0 ∪ 1Σ*1.

5. The language of strings that contain the substring 001 is described by Σ*001Σ*.

6. The language of strings that contain an even number of 1s is described by (0*10*1)*0*.

7. The language of strings that contain a number of 1s that's a multiple of k is described by ((0*1)k)*0*.
□
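Expressions like these can be spot-checked mechanically. The sketch below verifies item 6 by brute force in Python's re notation (where the union ∪ is written |); the translation into Python syntax is this sketch's assumption.

```python
import re
from itertools import product

# A brute-force sanity check of item 6: (0*10*1)*0* should match exactly
# the strings over {0, 1} with an even number of 1s.
even_ones = re.compile(r'(0*10*1)*0*')

def check_up_to(n):
    """Compare the expression against a direct count of 1s on every
    string of length at most n."""
    for length in range(n + 1):
        for chars in product('01', repeat=length):
            w = ''.join(chars)
            ok = even_ones.fullmatch(w) is not None
            if ok != (w.count('1') % 2 == 0):
                return False
    return True
```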

Example 4.6 If R is a regular expression, then

L(R ∪ ∅) = L(R)
L(R∅) = ∅
L(Rε) = L(R)

In addition, L(ε*) = {ε} and

L(∅*) = ∅*
      = {x1 · · · xk | k ≥ 0 and each xi ∈ ∅}
      = {ε}

since, as we said earlier, by convention, x1 · · · xk = ε when k = 0. □

Exercises
4.3.1. Give a regular expression for the language of strings that begin and end with the same symbol. (Note that the strings may have length one.) The alphabet is Σ = {0, 1}.


4.3.2. Give regular expressions for the languages of Exercise 2.4.2.


4.3.3. Give regular expressions for the languages of Exercise 2.4.3.
4.3.4. Give regular expressions for the languages of Exercise 2.4.4.
4.3.5. Give regular expressions for the complement of each of the languages of
Exercise 2.4.2.
4.3.6. Give regular expressions for the languages of Exercise 2.5.4.
4.3.7. Give regular expressions for the languages of Exercise 2.5.5.

4.4 Converting Regular Expressions into DFAs

In this section, we show that the languages that are described by regular expressions are all regular. And the proof of this result will be constructive: it will provide an algorithm for converting regular expressions into NFAs, which can then be converted into DFAs by using the algorithm of the previous chapter. Combined with the fact that it is often easier to write a regular expression than to design a DFA or an NFA, this implies that regular expressions are useful as a tool for obtaining DFAs for many languages.
Theorem 4.7 If a language is described by a regular expression, then it is regular.

Proof Suppose that R is a regular expression. We construct an NFA N that recognizes L(R).

The construction is recursive. There are six cases that correspond to the six cases of Definition 4.3. If R = a, where a ∈ Σ, if R = ε, or if R = ∅, then N is one of the NFAs shown in Figure 4.1. These are the base cases.

If R = R1 ∪ R2, if R = R1R2, or if R = R1*, then first recursively convert R1 and R2 into N1 and N2 and then combine or transform these NFAs into an NFA N for L(R) by using the constructions we used to prove the closure results of Section 3.4. □

Figure 4.1: NFAs for the languages {a}, {ε} and ∅
Example 4.8 Recall the regular expression for the language of valid C++ identifiers:

(_ ∪ L)(_ ∪ L ∪ D)*.

Let's convert this regular expression into an NFA.

Figures 4.2 and 4.3 show the various steps. Step 1 shows the basic NFAs. Step 2 shows an NFA for _ ∪ L. The NFA for _ ∪ L ∪ D is similar. Step 3 shows an NFA for (_ ∪ L ∪ D)*. Figure 4.3 shows the final result, the concatenation of the NFAs from Steps 2 and 3.

Note that we treated L and D as symbols even though they really are unions of symbols. What this means is that transitions labeled L and D are actually multiple transitions, one for each letter or digit. □
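The recursive conversion of the proof of Theorem 4.7 can be sketched in a few lines of Python. The tuple encoding of regular expressions, the use of None as the ε label, and the slightly streamlined star case (which reuses the new start state as the only accepting state, in the spirit of Theorem 3.17) are all assumptions of this sketch.

```python
# A hedged sketch of the recursive conversion in the proof of Theorem 4.7.
# Expressions are encoded as ('sym', a), ('eps',), ('empty',),
# ('union', r1, r2), ('concat', r1, r2), ('star', r1); None labels an
# epsilon transition. to_nfa returns (start, accept, transition list).

def to_nfa(r, counter=None):
    if counter is None:
        counter = [0]
    def fresh():
        counter[0] += 1
        return counter[0]
    kind = r[0]
    if kind == 'sym':
        s, f = fresh(), fresh()
        return s, f, [(s, r[1], f)]
    if kind == 'eps':
        s, f = fresh(), fresh()
        return s, f, [(s, None, f)]
    if kind == 'empty':
        return fresh(), fresh(), []
    if kind in ('union', 'concat'):
        s1, f1, t1 = to_nfa(r[1], counter)
        s2, f2, t2 = to_nfa(r[2], counter)
        if kind == 'concat':
            return s1, f2, t1 + t2 + [(f1, None, s2)]
        s, f = fresh(), fresh()
        return s, f, t1 + t2 + [(s, None, s1), (s, None, s2),
                                (f1, None, f), (f2, None, f)]
    if kind == 'star':
        s1, f1, t1 = to_nfa(r[1], counter)
        s = fresh()  # new accepting start state, as in Theorem 3.17
        return s, s, t1 + [(s, None, s1), (f1, None, s)]
    raise ValueError(kind)

def nfa_accepts(nfa, w):
    """Simulate the constructed NFA on the string w."""
    start, accept, trans = nfa
    def closure(S):
        S = set(S)
        changed = True
        while changed:
            changed = False
            for (q, a, r) in trans:
                if a is None and q in S and r not in S:
                    S.add(r)
                    changed = True
        return S
    cur = closure({start})
    for c in w:
        cur = closure({r for (q, a, r) in trans if a == c and q in cur})
    return accept in cur
```

For instance, (0 ∪ 1)*1, the expression for strings ending in 1, is encoded as ('concat', ('star', ('union', ('sym', '0'), ('sym', '1'))), ('sym', '1')).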

Exercises
4.4.1. By using the algorithm of this section, convert the following regular expressions into NFAs. In all cases, the alphabet is Σ = {0, 1}.

a) 0*1*.

b) 0*0.

c) 0*10*.

d) (11)*.

Figure 4.2: The conversion of the regular expression (_ ∪ L)(_ ∪ L ∪ D)*.

Figure 4.3: An NFA for the regular expression (_ ∪ L)(_ ∪ L ∪ D)*.


4.4.2. Extend regular expressions by adding intersection and complementation operators, as in R1 ∩ R2 and the complement of R1. Revise the formal definitions of Section 4.2 and then show that these extended regular expressions can only describe regular languages. (In other words, these extended regular expressions are no more powerful than the basic ones.)

4.5 Converting DFAs into Regular Expressions

In the previous section, we learned that regular expressions can be converted into DFAs. And we said that this was useful because it is often easier to write a regular expression than to design a DFA directly. Compiler-design tools, for example, take advantage of this fact: they take as input regular expressions that specify elements of a programming language and then convert those expressions into DFAs that are eventually turned into code that is incorporated into a compiler. Another example is the standard library of the programming language JavaScript, which allows programmers to specify input string patterns by using regular expressions.
One question that then arises is whether this strategy of writing a regular
expression and then converting it into a DFA applies to every regular language.
In other words, are there languages that have DFAs but no regular expression?
If so, we would have to design DFAs directly for those languages.
In this section and the next one, we will show that the answer is no: every
language that has a DFA also has a regular expression. This shows that the
above strategy of writing regular expressions and converting them into DFAs is
not more limited than the alternative of designing DFAs directly.
Once again, the proof of this result will be constructive: it will provide an
algorithm that converts DFAs into equivalent regular expressions. This algorithm can be useful. For example, there are languages for which it is easier to
design a DFA directly than to write a regular expression. If what we needed for
such a language was a regular expression (for example, to use as the input to


a compiler-design tool), then it might be easier to design a DFA and convert it into a regular expression than to write a regular expression directly.

So we want an algorithm that converts DFAs into regular expressions. Where should we start? Let's start small. Let's start with the smallest possible DFAs and see how we can convert them into regular expressions.

There are two DFAs that have only one state:

[Two one-state DFAs, each with a self-loop labeled 0,1: in one, the single state is accepting; in the other, it is not.]

To make things concrete, we're assuming that the input alphabet is Σ = {0, 1}. These DFAs can be easily converted into regular expressions: Σ* and ∅. This was probably too easy to teach us much about the algorithm. . .
So let's consider a DFA with two states:

[A two-state DFA: loop a on the start state, transition b to the second, accepting state, loop c on that state, and transition d back to the start state.]

A regular expression for this DFA can be written by allowing the input string to leave and return to the start state any number of times, and then go to the accepting state and stay there: (a ∪ bc*d)*bc*.
Here's another two-state DFA:

[The same two-state DFA, except that now the start state is the accepting one.]

In that case, the regular expression is simpler: (a ∪ bc*d)*.


It could also be that both states of a two-state DFA are accepting, or that neither state is accepting. Those cases are easy to deal with: Σ* and ∅.

Now, it could be that in these two-state DFAs, some of the transitions are labeled by multiple symbols, which really means that each of these transitions actually represents more than one transition. This is easy to handle: simply view every transition as being labeled by a regular expression that's a union of symbols. For example,

[A two-state DFA with transitions labeled by regular expressions: loop R1 on the start state, R2 to the accepting state, loop R3 on the accepting state, and R4 back to the start state.]

If a transition is labeled R = a1 ∪ · · · ∪ ak, then that transition can be used when reading any one of the symbols a1, . . . , ak. A regular expression equivalent to such a DFA can be obtained by using the same reasoning as before. For example, the regular expression that corresponds to the above DFA is

(R1 ∪ (R2)(R3)*(R4))*(R2)(R3)*.

It could also be that some of the transitions are missing. We could consider all the ways in which transitions could be missing and come up with a regular expression for each case. For example, if the loop at the second state was missing in the above DFA, then the regular expression would be (R1 ∪ (R2)(R4))*(R2). But there are 15 different cases. . .

A more convenient way to handle missing transitions is to allow transitions to be labeled by the regular expression ∅. For example, if the loop at the second state was absent, then we could set R3 = ∅ and use the regular expression for the general case. This would give



Figure 4.4: A DFA with three states

(R1 ∪ (R2)(R3)*(R4))*(R2)(R3)* = (R1 ∪ (R2)(R4))*(R2),

which is correct. It is easy to verify that if any of the regular expressions are set to ∅, then the general regular expression (R1 ∪ (R2)(R3)*(R4))*(R2)(R3)* is still correct. (The following facts are useful: for every R, R∅ = ∅R = ∅, R ∪ ∅ = ∅ ∪ R = R and ∅* = {ε}.)
Let's move on to a DFA with three states, such as the one shown in Figure 4.4. Coming up with a regular expression for this DFA seems much more complicated than for DFAs with two states. So here's an idea that may at first seem a little crazy: what if we tried to remove the middle state of the DFA? If this could be done, then the DFA would have two states and we know how to handle that.

To be able to remove state q1 from the DFA, we need to consider all the ways in which state q1 can be used to travel through the DFA. For example, state q1 can be used to go from state q0 to state q2:

[Diagram: the path from q0 through q1 to q2, entering q1 on R2, looping on R3, and leaving on R4; R8 labels the direct transition from q0 to q2.]
The DFA does that when it is in state q0 and reads a string w ∈ L((R2)(R3)*(R4)). The eventual removal of state q1 can be compensated for by adding the regular expression (R2)(R3)*(R4) to the transition that goes directly from q0 to q2:

[Diagram: the direct transition from q0 to q2, now labeled R8 ∪ (R2)(R3)*(R4).]

Then, instead of going from q0 to q2 through q1 while reading a string w ∈ L((R2)(R3)*(R4)), we can now go directly from q0 to q2 while reading w.
Note that the resulting automaton is no longer a DFA because it now contains a transition that is labeled by a regular expression that's not just ∅ or a union of symbols. But the meaning of this should be pretty clear: if a transition is labeled R, then that transition can be used while reading any string in L(R). Later, we will take the time to formally define this new kind of automaton and how it operates.
Another possibility that needs to be considered is that state q1 can also be used to travel from state q0 back to state q0:

[Diagram: the path from q0 through q1 and back to q0, entering q1 on R2, looping on R3, and returning on R7; R1 labels the loop on q0.]

The DFA does that when it is in state q0 and reads a string in L((R2)(R3)*(R7)). Once again, the removal of state q1 can be compensated for by expanding the regular expression that goes directly from q0 back to q0:

[Diagram: the loop on q0, now labeled R1 ∪ (R2)(R3)*(R7).]

To be able to fully remove q1 from the DFA, this compensation operation needs to be performed for every pair of states other than q1. Once this is done and q1 is removed, we are left with a two-state automaton of the following form:

[Diagram: a two-state automaton with loop R1 on the start state, R2 to the accepting state, loop R3 on the accepting state, and R4 back to the start state.]

We already know how to convert that into a regular expression.

But note that not all DFAs with three states are similar to the DFA of Figure 4.4. For example, state q1 could also be an accepting state. Removing state q1 would then be problematic.

A solution to this problem is to modify the DFA so it has only one accepting state. This can be done by adding a new accepting state, new ε transitions from the old accepting states to the new one, and removing the accepting status of the old accepting states. For example, suppose that q1 and q2 are both accepting states in the DFA of Figure 4.4. Then Figure 4.5 shows the result of this transformation when applied to that DFA. (The original transitions are not shown.)

Figure 4.5: Ensuring that the DFA of Figure 4.4 has only one accepting state
Note that this transforms the initial DFA into an NFA. But this is not a problem since it is easy to verify that our state removal idea does not require the finite automaton to be deterministic.

Once the new accepting state is added, we can remove states q1 and q2, one after the other, and end up with an automaton with two states.

In a DFA with three states, it could also be that the start state is an accepting state. If there are other accepting states in the DFA, then we would add a new accepting state and the start state would no longer be an accepting state. This automaton can then be handled as explained above.

If the start state of the DFA is the only accepting state, then we could remove the other two states to obtain an automaton with a single state:

[Diagram: a single accepting start state with a loop labeled R1.]

A regular expression for an automaton of this form is simply (R1)*.


The only other possibility is that the three-state DFA has no accepting states. But then the language of this DFA is empty and a regular expression is simply ∅. No need to eliminate states.

So we now know how to handle any DFA with three states. What about DFAs with four states? It turns out that the state-removal strategy can be used on DFAs with any number of states: remove states one by one until you're left with at most two. Here's a complete description of the algorithm:
1. If the DFA has no accepting states, output ∅.

2. Ensure that the DFA has only one accepting state by adding a new one if necessary.

3. Remove each state, one by one, except for the start state and the accepting state. This results in one of the following automata:

[Diagram: either a two-state automaton with labels R1, R2, R3, R4 as above, or a single accepting start state with a loop labeled R1.]

4. Output either (R1 ∪ (R2)(R3)*(R4))*(R2)(R3)* or (R1)*.
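Steps 2 to 4 can be sketched in Python as follows. The sketch assumes the automaton has already been given exactly one transition between every ordered pair of states (with None standing for an ∅ label), that there is a single accepting state distinct from the start state, and that labels are written in Python's re notation with unions and stars already wrapped in (?:...); none of these conventions come from the notes.

```python
# A hedged sketch of the state-removal algorithm. A GNFA is a dict mapping
# (p, q) to a transition label in Python re syntax; None stands for ∅.
# Labels are assumed "concatenation safe" (unions/stars already wrapped).

def union(r1, r2):
    if r1 is None:
        return r2
    if r2 is None:
        return r1
    return f'(?:{r1}|{r2})'

def concat(r1, r2):
    if r1 is None or r2 is None:
        return None                        # R∅ = ∅R = ∅
    return r1 + r2

def star(r):
    return '' if r is None else f'(?:{r})*'   # ∅* = {ε}

def eliminate(delta, states, start, accept):
    """Remove every state except start and accept, compensating as in the
    text, then output (R1 ∪ (R2)(R3)*(R4))*(R2)(R3)*."""
    get = lambda p, q: delta.get((p, q))
    remaining = set(states)
    for q1 in [q for q in states if q not in (start, accept)]:
        remaining.discard(q1)
        loop = star(get(q1, q1))
        for q2 in remaining:
            for q3 in remaining:
                r = union(get(q2, q3),
                          concat(get(q2, q1), concat(loop, get(q1, q3))))
                if r is not None:
                    delta[(q2, q3)] = r
        for key in [k for k in delta if q1 in k]:  # drop q1's transitions
            del delta[key]
    r1, r2 = get(start, start), get(start, accept)
    r3, r4 = get(accept, accept), get(accept, start)
    return concat(concat(star(union(r1, concat(r2, concat(star(r3), r4)))),
                         r2), star(r3))

# An NFA for strings containing the substring 001, with its single
# accepting state q3, written as a GNFA:
delta = {('q0', 'q0'): '(?:0|1)', ('q0', 'q1'): '0', ('q1', 'q2'): '0',
         ('q2', 'q3'): '1', ('q3', 'q3'): '(?:0|1)'}
regex = eliminate(delta, ['q0', 'q1', 'q2', 'q3'], 'q0', 'q3')
```

Eliminating q1 and then q2 reproduces, up to the (?:...) wrappers, the expression (0 ∪ 1)*001(0 ∪ 1)*.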
The above description of our algorithm is fairly informal. Before making it more precise, let's run through a couple of examples.
Example 4.9 Let's convert into a regular expression the NFA of Figure 4.6. This is an NFA for the language of strings that contain the substring 001. This NFA already has only one accepting state so there is no need to add a new one. Let's start by removing state q2. That state can only be used to go from state q1 to state q3 while reading the string 01. The result is the first automaton of Figure 4.7. The second automaton in that figure is the result of removing state q1. A regular expression can be easily obtained from that second automaton: (0 ∪ 1)*001(0 ∪ 1)*. □

Figure 4.6: An NFA for the language of strings that contain the substring 001

Figure 4.7: Converting the NFA of Figure 4.6

Figure 4.8: A DFA for the language of strings whose digits add to either 0 or 1, modulo 3
Example 4.10 Let's convert the DFA shown in Figure 4.8. The input alphabet is {0, 1, 2} so that strings can be viewed as consisting of digits. This DFA recognizes the language of strings whose digits add to either 0 or 1, modulo 3. In other words, the sum of the digits is a multiple of 3, or one greater than a multiple of 3. Since this DFA has two accepting states, we start by adding a new accepting state. The result is the first automaton of Figure 4.9.

The second automaton of Figure 4.9 is the result of removing state q1. For example, state q1 can be used to go from state q0 to state q2 while reading a string in 10*1. So we add this regular expression to the transition going directly from state q0 to state q2.

Figure 4.10 shows the result of removing state q2. Now, let

R1 = 0 ∪ 10*2 ∪ (2 ∪ 10*1)(0 ∪ 20*1)*(1 ∪ 20*2)

and

R2 = ε ∪ 10* ∪ (2 ∪ 10*1)(0 ∪ 20*1)*20*.

Then a regular expression equivalent to this last automaton is (R1)*(R2). □

Figure 4.9: Converting the DFA of Figure 4.8 (part 1 of 2)

Figure 4.10: Converting the DFA of Figure 4.8 (part 2 of 2)

Exercises
4.5.1. By using the algorithm of this section, convert into regular expressions
the NFAs of Figure 4.11.

4.6 Precise Description of the Algorithm

We now give a precise description of the algorithm of the previous section. Such
a description is needed if we want to carefully prove that the algorithm works.
Or if we wanted to code the algorithm.
The key step in the algorithm is the removal of each state other than the start state and the accepting state. This can be done by using the compensation operation we described in the previous section. In general, suppose we want to remove state q1. Let q2 and q3 be two other states and consider the transitions that go from q2 to q3 either directly or through q1, as shown in the first automaton of Figure 4.12. To compensate for the eventual removal of state q1, we add the regular expression (R2)(R3)*(R4) to the transition that goes directly from q2 to q3, as shown in the second automaton of Figure 4.12.
Note that the description of the compensation operation assumes that there is exactly one transition going from a state to each other state. This is not the case in every DFA since there can be no transitions or multiple transitions going from one state to another. So the first step of the algorithm should actually be to transform the DFA so it has exactly one transition going from each state to every other state. This is easy to do. If there is no transition from q1 to q2, add one labeled ∅. If there are multiple transitions labeled a1, . . . , ak going from q1 to q2, replace those transitions by a single transition labeled a1 ∪ · · · ∪ ak.

Figure 4.11: NFAs for Exercise 4.5.1

Figure 4.12: The compensation operation: in the first automaton, R1 labels the direct transition from q2 to q3, R2 the transition from q2 to q1, R3 the loop on q1, and R4 the transition from q1 to q3; in the second automaton, the direct transition is relabeled R1 ∪ (R2)(R3)*(R4).
In addition, as mentioned earlier, the compensation operation turns the DFA into an NFA whose transitions are labeled by regular expressions. To be able to describe precisely this operation, and convince ourselves that it works correctly, we need to define precisely this new type of NFA and how it operates. Here's one way of doing it.
Definition 4.11 A generalized nondeterministic finite automaton (GNFA) is a 5-tuple (Q, Σ, δ, q0, F) where

1. Q is a finite set of states.

2. Σ is an alphabet called the input alphabet.

3. δ : Q × Q → R is the transition function, where R is the set of all regular expressions over Σ.

4. q0 ∈ Q is the starting state.

5. F ⊆ Q is the set of accepting states.
The only difference between this definition and that of an ordinary NFA is the specification of the transition function. In an NFA, the transition function takes a state and a symbol, or ε, and gives us a set of possible next states. In contrast, in a GNFA, the transition function takes a pair of states and gives us the regular expression that labels the transition going from the first state to the second one. Note how this neatly enforces the fact that in a GNFA, there is exactly one transition going from each state to every other state.


We now define how a GNFA operates; that is, we define what it means for a GNFA to accept a string. The idea is that a string w is accepted if the GNFA reads a sequence of strings y1, . . . , yk with the following properties:

1. This sequence of strings takes the GNFA through a sequence of states r0, r1, . . . , rk.

2. The first state r0 is the start state of the GNFA.

3. The last state rk is an accepting state.

4. The reading of each string yi is a valid move, in the sense that yi is in the language of the regular expression that labels the transition going from ri−1 to ri.

5. The concatenation of all the yi's corresponds to w.
All of this can be said more concisely, and more precisely, as follows:

Definition 4.12 Let N = (Q, Σ, δ, q0, F) be a GNFA and let w be a string of length n over Σ. Then N accepts w if and only if w can be written as y1 · · · yk, with each yi ∈ Σ*, and there is a sequence of states r0, r1, . . . , rk such that

r0 = q0
yi ∈ L(δ(ri−1, ri)), for i = 1, . . . , k
rk ∈ F.
We now describe precisely the state removal step of the algorithm. Let q1 be the state to be removed. For every other pair of states q2 and q3, let

R1 = δ(q2, q3)
R2 = δ(q2, q1)
R3 = δ(q1, q1)
R4 = δ(q1, q3)

as illustrated by the first automaton of Figure 4.12. Remove the transitions adjacent to q1. Change to R1 ∪ (R2)(R3)*(R4) the label of the transition that goes directly from q2 to q3, as shown in the second automaton of Figure 4.12. That is, set δ(q2, q3) to R1 ∪ (R2)(R3)*(R4). Once all pairs q2 and q3 have been considered, remove state q1.
And here's a proof that this works:

Lemma 4.13 If the state removal step is applied to GNFA N, then the resulting GNFA still recognizes L(N).

Proof Let N′ be the GNFA that results from removing state q1 from GNFA N. Suppose that w ∈ L(N). If w can be accepted by N without traveling through state q1, then w is accepted by N′. Now suppose that to accept w, N must travel through q1. Suppose that in one such instance, N reaches q1 from q2 and that it goes to q3 after leaving q1, as shown in the first automaton of Figure 4.12. Then N travels from q2 to q3 by reading strings y1, . . . , yk such that

y1 ∈ L(R2)
yi ∈ L(R3), for i = 2, . . . , k − 1
yk ∈ L(R4).

This implies that y1 · · · yk ∈ L((R2)(R3)*(R4)). In N′, the transition going directly from q2 to q3 is now labeled R1 ∪ (R2)(R3)*(R4). This implies that N′ can move directly from q2 to q3 by reading y1 · · · yk. And this applies to every instance in which N goes through q1 while reading w. Therefore, N′ still accepts w.
Now suppose that w ∈ L(N′). If w can be accepted by N′ without using
one of the relabeled transitions, then w is accepted by N. Now suppose that
to accept w, N′ must use one of the relabeled transitions. Suppose that in one
such instance, N′ travels from q2 to q3 on a transition labeled R1 ∪ (R2)(R3)*(R4),
as shown in the second automaton of Figure 4.12. If N′ goes from q2 to q3 by
reading a string y ∈ L(R1), then N could have done the same. If instead N′
goes from q2 to q3 by reading a string y ∈ L((R2)(R3)*(R4)), then it must be that
y = y1 · · · yk with

y1 ∈ L(R2)
yi ∈ L(R3), for i = 2, . . . , k − 1
yk ∈ L(R4).

This implies that N can go from q2 to q3 while reading y as long as it does it
by going through q1 while reading y1, . . . , yk. This applies to every instance in
which N′ uses a relabeled transition while reading w. Therefore, N accepts w
and this completes the proof that L(N′) = L(N).
□
So we now know for sure that the state removal step works. As outlined
earlier, this step is repeated on every state except for the start and accepting
states. To ensure that this leaves us with a GNFA that has at most two states, we
first make sure that the initial GNFA has at most one accepting state, by adding
a new one if necessary, as explained earlier.
After the state removal step, we are left with one of the following GNFAs:
[Diagrams of the three possible resulting GNFAs, with transition labels among R1, R2, R3 and R4]
The third GNFA occurs only if the initial DFA has no accepting states. As mentioned in the previous section, it is simpler to test for this special case at the very
beginning of the algorithm. This implies that after the state removal step, we
are left with one of the first two GNFAs.
We can simplify the algorithm further by observing that when a new accepting state is added to the initial DFA, then that new accepting state only has ∅
transitions leaving from it. And the compensation operation will not change
that, as can be seen by examining Figure 4.12, because if R1 = ∅ and R2 = ∅,
then R1 ∪ (R2)(R3)*(R4) = ∅. Therefore, when a new accepting state is added to
the initial DFA, the two-state GNFA that results from applying the state removal
step will always be of the form
[Diagram: a start state with a self-loop labeled R1 and a transition labeled R2 to the accepting state]
This GNFA is easy to convert into a regular expression: (R1)*(R2).


We are now ready to describe the entire conversion algorithm and prove its
correctness. As mentioned earlier, the algorithm can actually convert arbitrary
NFAs, not just DFAs.
Theorem 4.14 If a language is regular, then it can be described by a regular expression.
Proof Suppose that L is a regular language and that NFA N recognizes L. We
construct a regular expression R that describes L by using the following algorithm:
1. Transform N so that it is a GNFA. This can be done as follows. For every
pair of states q1 and q2, if there is no transition from q1 to q2, add one
labeled ∅. If there are multiple transitions labeled a1, . . . , ak going from q1
to q2, replace those transitions by a single transition labeled a1 ∪ · · · ∪ ak.
2. If N has no accepting states, set R = ∅.
3. Add a new accepting state to N, add new ε transitions from the old accepting states to the new one, and remove the accepting status of the old
accepting states.


4. One by one, remove every state other than the start state or the accepting
state. This can be done as follows. Let q1 be the state to be removed. For
every other pair of states q2 and q3, let
R1 = δ(q2, q3)
R2 = δ(q2, q1)
R3 = δ(q1, q1)
R4 = δ(q1, q3)
as illustrated by the first automaton of Figure 4.12. Change to
R1 ∪ (R2)(R3)*(R4) the label of the transition that goes directly from q2 to q3, as
shown in the second automaton of Figure 4.12. That is, set
δ(q2, q3) = R1 ∪ (R2)(R3)*(R4).
Once all pairs q2 and q3 have been considered, remove state q1 and all
transitions adjacent to it.
5. The resulting GNFA consists of the start state with a self-loop labeled R1
and a transition labeled R2 to the accepting state. Set R = (R1)*(R2).


It is not hard to see that this algorithm is correct, mainly because each transformation of N preserves the fact that N recognizes L. In the case of the state
removal step, this was established by the lemma.
□
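The state removal step can be tried out in code. The sketch below is my own, not from the notes: it stores the GNFA's labels in a dict delta indexed by state pairs, writes labels as Python regular expression strings so the result can be checked with the re module, and uses None for ∅ and the empty string for ε.

```python
import re

def eliminate(delta, states, start, accept):
    """Remove every state except `start` and `accept`, compensating each
    removal by relabeling transitions with R1 | (R2)(R3)*(R4), then return
    the final regular expression (R1)*(R2)."""
    def union(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return f"(?:{a}|{b})"

    def concat(parts):
        if any(p is None for p in parts):
            return None                     # concatenation with the empty set
        return "".join(f"(?:{p})" for p in parts if p != "")

    for q1 in [q for q in states if q not in (start, accept)]:
        r3 = delta.get((q1, q1))
        loop = "" if r3 is None else f"(?:{r3})*"   # (R3)*; (empty set)* = epsilon
        for q2 in states:
            for q3 in states:
                if q1 in (q2, q3):
                    continue
                detour = concat([delta.get((q2, q1)), loop, delta.get((q1, q3))])
                delta[(q2, q3)] = union(delta.get((q2, q3)), detour)
        states = [q for q in states if q != q1]

    r1 = delta.get((start, start))          # self-loop on the start state
    r2 = delta.get((start, accept))
    return ("" if r1 is None else f"(?:{r1})*") + ("" if r2 is None else f"(?:{r2})")

# A GNFA for "binary strings that end in 1": states a (start) and b, plus a
# new accepting state f reached from b by an epsilon transition (label "").
delta = {("a", "a"): "0", ("a", "b"): "1",
         ("b", "a"): "0", ("b", "b"): "1", ("b", "f"): ""}
r = eliminate(delta, ["a", "b", "f"], "a", "f")
assert re.fullmatch(r, "0101") and not re.fullmatch(r, "10")
```

Here eliminating state b produces an expression equivalent to (0 ∪ 1(1)*0)* 1(1)*, which indeed describes the binary strings that end in 1.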


Exercises
4.6.1. Give a formal description of the first GNFA of Figure 4.7.
4.6.2. Give a formal description of the second GNFA of Figure 4.9.
4.6.3. By using the algorithm of this section, convert into a regular expression
the NFA of part (b) of Figure 4.11. (The difference with part (b) of Exercise 4.5.1 is that the algorithm of this section always adds a new accepting
state to the DFA.)


Chapter 5
Nonregular Languages
In this chapter, we show that not all languages are regular. In other words, we
show that there are languages that can't be recognized by finite automata or described by regular expressions. We will learn to prove that particular languages
are nonregular by using a general result called the Pumping Lemma.

5.1 Some Examples

So far in these notes, we have seen many examples of regular languages. In this
section, we discover our first examples of nonregular languages. In all cases, the
alphabet is {0, 1}.
Example 5.1 Consider the language of strings of the form 0^n 1^n, where n ≥ 0.
Call it L. We know that languages such as {0^i 1^j | i, j ≥ 2} and {0^i 1^j | i ≡ j
(mod 3)} are regular. But here we want to determine if i = j. One strategy a
DFA could attempt to use is to count the number of 0s, count the number of 1s
and then verify that the two numbers are the same.
As we have seen before, one way a DFA can count 0s is to move to a different
state every time a 0 is read. For example, if the initial state of the DFA is 0, then


[Figure 5.1: The computation of M on the string w = 0^n 1^n]


the DFA could go through states 0, 1, 2, . . . , n as it reads n 0s. Now, note that
counting from 0 to n requires n + 1 different states, and that n can be arbitrarily
large, but that a DFA only has a finite number of states. If n is greater than or
equal to the number of states of the DFA, then the DFA doesn't have enough
states to implement this strategy.
This argument shows that this particular strategy is not going to lead to a
DFA that recognizes the language L. But to prove that no DFA can recognize this
language, we need a more general argument, one that rules out any possible
strategy.
Such an argument can be constructed by considering, once again, the states
that a DFA goes through as it reads the 0s of a string of the form 0n 1n .
Suppose that DFA M recognizes L. Consider a string w of the form 0^n 1^n. As
M reads the 0s of w, it goes through a sequence of states r_0, r_1, r_2, . . . , r_n. Choose
n to be equal to the number of states of M . Then there must be a repetition in
the sequence.
Suppose that r_i = r_j with i < j. Then the computation of M on w is as shown
in Figure 5.1. That is, after reading i 0s, the DFA is in state r_i. After reading an
additional j − i 0s, for a total of j 0s, the DFA returns to state r_i. From there,
after reading the rest of w, the DFA reaches an accepting state.
Now, what Figure 5.1 makes clear is that the DFA accepts not only w but also
the string 0^i 0^{n−j} 1^n = 0^{n−(j−i)} 1^n. But this string is not in L because it contains
fewer 0s than 1s. So we have found a string that is accepted by M but does not


[Figure 5.2: The computation of M on the string w = 0^n 1^n]


belong to L. This contradicts the fact that M recognizes L and shows that a DFA
that recognizes L cannot exist. Therefore, L is not regular.
□
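The pigeonhole part of this argument is easy to watch in action. The sketch below uses an arbitrary 3-state DFA of my own, not one from the text: it reads n 0s, finds the forced repetition r_i = r_j, and checks that the DFA ends in the same state on w = 0^n 1^n and on the shortened string 0^{n−(j−i)} 1^n, so it must accept both or neither.

```python
def run(dfa, start, w):
    """Return the state sequence r_0, r_1, ... of the DFA on input w."""
    states = [start]
    for c in w:
        states.append(dfa[(states[-1], c)])
    return states

def find_repetition(states):
    """Return the first (i, j) with i < j and r_i = r_j (pigeonhole)."""
    seen = {}
    for idx, r in enumerate(states):
        if r in seen:
            return seen[r], idx
        seen[r] = idx
    return None

# An arbitrary 3-state DFA over {0, 1}, given as a transition dict.
dfa = {("a", "0"): "b", ("a", "1"): "a",
       ("b", "0"): "c", ("b", "1"): "a",
       ("c", "0"): "b", ("c", "1"): "c"}

n = 3                                   # n = number of states of the DFA
seq = run(dfa, "a", "0" * n)            # states visited while reading the 0s
i, j = find_repetition(seq)             # must exist: n+1 states seen, only n exist
w      = "0" * n + "1" * n
pumped = "0" * (n - (j - i)) + "1" * n
# Both strings drive the DFA to the same final state, so the DFA gives the
# same verdict on both -- it cannot recognize {0^n 1^n}.
assert run(dfa, "a", w)[-1] == run(dfa, "a", pumped)[-1]
```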
The argument we developed in this example can be used to show that other
languages are not regular.
Example 5.2 Let L be the language of strings of the form 0^i 1^j where i ≤ j.
Once again, suppose that L is regular. Let M be a DFA that recognizes L and let
n be the number of states of M .
Consider the string w = 0^n 1^n. As M reads the 0s in w, M goes through a
sequence of states r_0, r_1, r_2, . . . , r_n. Because this sequence is of length n + 1, there
must be a repetition in the sequence.
Suppose that r_i = r_j with i < j. Then the computation of M on w is as shown
in Figure 5.2. Once again, this implies that the string 0^i 0^{n−j} 1^n = 0^{n−(j−i)} 1^n is
also accepted. But since the number of 0s has been reduced, this string happens
to be in L. So there is no contradiction. . .
So by skipping the loop, we get a string that is still in the language. Let's
try something else: let's go around the loop twice. This corresponds to reading
the string 0^i 0^{2(j−i)} 0^{n−j} 1^n = 0^{n+(j−i)} 1^n and this string is also accepted by M.
However, since this string has more 0s than 1s, it is not in L. This contradicts
the fact that M recognizes L. Therefore, M cannot exist and L is not regular. □
What this last example shows is that sometimes going around the loop more
than once is the way to obtain a contradiction and make the argument work.


[Figure 5.3: The computation of M on the string w = 0^n 0^n]


Here's another example.
Example 5.3 Let L be the language of strings of the form ww. Suppose that L
is regular. Let M be a DFA that recognizes L and let n be the number of states
of M .
Consider the string w = 0^n 0^n. As M reads the 0s in the first half of w, M
goes through a sequence of states r_0, r_1, r_2, . . . , r_n. Because this sequence is of
length n + 1, there must be a repetition in the sequence.
Suppose that r_i = r_j with i < j. Then the computation of M on w is as shown
in Figure 5.3. Once again, this implies that the string 0^i 0^{n−j} 0^n = 0^{2n−(j−i)} is also
accepted. Unfortunately, we cannot conclude that this string is not in L. In fact,
if j − i is even, then this string is in L because the string can still be split in two
equal halves.
To fix this argument, note that if the string is still in L after we delete some
0s from its first half, it's because the middle of the string will have shifted to the
right. So we just need to pick a string w whose middle cannot move. . .
Let w = 0^n 1 0^n 1. Then the repetition will occur within the first block of 0s,
as shown in Figure 5.4. This implies that the string 0^i 0^{n−j} 1 0^n 1 = 0^{n−(j−i)} 1 0^n 1
is also accepted. But this string is clearly not in L, which contradicts the fact
that M recognizes L. Therefore, M cannot exist and L is not regular. □
This last example makes the important point that the string w must be chosen
carefully. In fact, this choice can sometimes be a little tricky.


[Figure 5.4: The computation of M on the string w = 0^n 1 0^n 1]

Exercises
5.1.1. Show that the language {0^n 1^{2n} | n ≥ 0} is not regular.
5.1.2. Show that the language {0^i 1^j | 0 ≤ i ≤ 2j} is not regular.
5.1.3. If w is a string, let w^R denote the reverse of w. That is, if w = w_1 · · · w_n,
then w^R = w_n · · · w_1. Show that the language {ww^R | w ∈ {0, 1}*} is not
regular.

5.2 The Pumping Lemma

In the previous section, we showed that three different languages are not regular. Each example involved its own particular details but a good chunk of the
argument was common to all three. In this section, we will isolate the common
portion of the argument and turn it into a general result about regular languages. In this way, the common portion won't need to be repeated each time
we want to show that a language is not regular.
In all three examples, the proof that L is not regular went as follows. Suppose
that L is regular. Let M be a DFA that recognizes L and let n be the number of
states of M .
Choose a string w ∈ L of length at least n. (The choice of w depends on
L.) As M reads the first n symbols of w, M goes through a sequence of states



[Figure 5.5: The computation of M on the string w = 0^k 1^k]


r_0, r_1, r_2, . . . , r_n. Because this sequence is of length n + 1, there must be a repetition in the sequence.
Suppose that r_i = r_j with i < j. Then the computation of M on w is as shown
in Figure 5.5. More precisely, if w = w_1 · · · w_m, then x = w_1 · · · w_i, y = w_{i+1} · · · w_j
and z = w_{j+1} · · · w_m.
The fact that the reading of y takes M from state r_i back to state r_i implies
that the strings xz and xy^2z are also accepted by M. (In fact, it is also true that
xy^kz is accepted by M for every k ≥ 0.)
In each of the examples of the previous section, we then observed that one
of the strings xz or xy^2z does not belong to L. (The details depend on x, y,
z and L.) This contradicts the fact that M recognizes L. Therefore, M cannot
exist and L is not regular.
To summarize, the argument starts with a language L that is assumed to be
regular and a string w ∈ L that is long enough. It then proceeds to show that w
can be broken into three pieces, w = xyz, such that xy^kz ∈ L, for every k ≥ 0.
There is more to it, though. In all three examples of the previous section,
we also used the fact that the string y occurs within the first n symbols of w.
This is because the repetition occurs while M is reading the first n symbols of w.
This is useful because it gives us some information about the contents of y. For
example, in the case of the language {0^n 1^n}, this told us that y contained only
0s, which meant that xz contained fewer 0s than 1s.
In addition, the above also implicitly uses the fact that y is not empty. Otherwise, xz would equal w and the fact that xz is accepted by M would not lead
to a contradiction.
So a more complete summary of the above argument is as follows. The argument starts with a language L that is assumed to be regular and a string w ∈ L
that is long enough. It then proceeds to show that w can be broken into three
pieces, w = xyz, such that
1. The string y can be pumped, in the sense that xy^kz ∈ L, for every k ≥ 0.
2. The string y occurs towards the beginning of w.
3. The string y is nonempty.
In other words, the above argument proves the following result, which is
called the Pumping Lemma:
Theorem 5.4 (Pumping Lemma) If L is a regular language, then there is a number p > 0, called the pumping length, such that if w ∈ L and |w| ≥ p, then w can
be written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^kz ∈ L, for every k ≥ 0.
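The three conditions can be checked mechanically for small cases. The helper below is my own testing aid, not part of the notes: given a claimed split w = xyz and a membership predicate for L, it verifies conditions (1) to (3) for k up to a bound, and the example confirms that no split of 0^p 1^p works for {0^n 1^n}.

```python
def pumping_conditions(x, y, z, p, in_L, max_k=5):
    """Check |xy| <= p, y nonempty, and x y^k z in L for k = 0..max_k."""
    return (len(x + y) <= p
            and y != ""
            and all(in_L(x + y * k + z) for k in range(max_k + 1)))

def in_0n1n(s):                          # membership in {0^n 1^n}
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "0" * n + "1" * n

# For w = 0^p 1^p, no split w = xyz satisfies all three conditions at once.
p = 4
w = "0" * p + "1" * p
splits = [(w[:i], w[i:j], w[j:])
          for i in range(len(w) + 1) for j in range(i, len(w) + 1)]
assert not any(pumping_conditions(x, y, z, p, in_0n1n) for x, y, z in splits)
```

By contrast, for a regular language such as 0*, a valid split is easy to find, e.g. x = ε, y = 0.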
In a moment, we are going to see how the Pumping Lemma can be used to
simplify proofs that languages are not regular. But first, here's a clean write-up
of the proof of the Pumping Lemma.
Proof Let L be a regular language and let M be a DFA that recognizes L. Let p be the
number of states of M. Now, suppose that w is any string in L with |w| ≥ p.
As M reads the first p symbols of w, M goes through a sequence of states


r_0, r_1, r_2, . . . , r_p. Because this sequence is of length p + 1, there must be a repetition in the sequence.


Suppose that r_i = r_j with i < j. Then the computation of M on w is as shown
in Figure 5.5. In other words, if w = w_1 · · · w_m, let x = w_1 · · · w_i, y = w_{i+1} · · · w_j
and z = w_{j+1} · · · w_m. Clearly, w = xyz. In addition, |xy| = j ≤ p and |y| =
j − i > 0, which implies that y ≠ ε.
Finally, the fact that the reading of y takes M from state r_i back to state r_i
implies that the string xy^kz is accepted by M, and thus belongs to L, for every
k ≥ 0.
□
We now see how the Pumping Lemma can be used to prove that languages
are not regular.
Example 5.5 Let L be the language {0^n 1^n | n ≥ 0}. Suppose that L is regular.
Let p be the pumping length given by the Pumping Lemma. Consider the string
w = 0^p 1^p. Clearly, w ∈ L and |w| ≥ p. Therefore, according to the Pumping
Lemma, w can be written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^kz ∈ L, for every k ≥ 0.
Condition (1) implies that y contains only 0s. Condition (2) implies that y
contains at least one 0. Therefore, the string xz cannot belong to L because it
contains fewer 0s than 1s. This contradicts Condition (3). So it must be that our
initial assumption was wrong: L is not regular.
□
It is interesting to compare this proof that {0^n 1^n} is not regular with the proof
we gave in the first example of the previous section. The new proof is shorter
but, perhaps more importantly, it doesn't need to establish that the string y can
be pumped. Those details are now hidden in the proof of the Pumping Lemma.¹
¹Yes, this is very similar to the idea of abstraction in software design.


Here's another example.


Example 5.6 Let L be the language of strings of the form ww. Suppose that L
is regular. Let p be the pumping length given by the Pumping Lemma. Consider
the string w = 0^p 1 0^p 1. Clearly, w ∈ L and |w| ≥ p. Therefore, according to the
Pumping Lemma, w can be written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^kz ∈ L, for every k ≥ 0.
Condition (1) implies that y contains only 0s from the first half of w. Condition (2) implies that y contains at least one 0. Therefore, xz is of the form
0^i 1 0^p 1 where i < p. This string is not in L, contradicting Condition (3). This
implies that L is not regular. □
Here's an example for a language of a different flavor.
Example 5.7 Let L be the language {1^{n^2} | n ≥ 0}. That is, L consists of strings
of 1s whose length is a perfect square. Suppose that L is regular. Let p be the
pumping length. Consider the string w = 1^{p^2}. Clearly, w ∈ L and |w| ≥ p.
Therefore, according to the Pumping Lemma, w can be written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^kz ∈ L, for every k ≥ 0.
Consider the string xy^2z, which is equal to 1^{p^2+|y|}. Since |y| ≥ 1, for this
string to belong to L, it must be that p^2 + |y| ≥ (p + 1)^2, the first perfect square
greater than p^2. But |y| ≤ p implies that p^2 + |y| ≤ p^2 + p = p(p + 1) < (p + 1)^2.
Therefore, xy^2z does not belong to L, which contradicts the Pumping Lemma
and shows that L is not regular. □
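The inequality at the heart of this example can be sanity-checked numerically (a small script of my own): for 1 ≤ |y| ≤ p, the length p^2 + |y| lands strictly between the consecutive squares p^2 and (p + 1)^2, so it is never a perfect square.

```python
import math

def is_square(m):
    r = math.isqrt(m)
    return r * r == m

# For every small p and every 1 <= |y| <= p, p^2 + |y| is not a square.
for p in range(1, 50):
    for y_len in range(1, p + 1):
        assert not is_square(p * p + y_len)
```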


Here's an example that shows that the Pumping Lemma must be used carefully.
Example 5.8 Let L be the language {1^n | n is even}. Suppose that L is regular.
Let p be the pumping length. Consider the string w = 1^{2p}. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written as xyz
where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^kz ∈ L, for every k ≥ 0.
Let y = 1. Then xz = 1^{2p−1} is of odd length and cannot belong to L. This
contradicts the Pumping Lemma and shows that L is not regular.
Of course, this doesn't make sense because we know that L is regular. So
what did we do wrong in this proof? What we did wrong is that we chose the
value of y. We can't do that. All that we know about y is what the Pumping
Lemma says: w = xyz, |xy| ≤ p, y ≠ ε and xy^kz ∈ L, for every k ≥ 0. This
does not imply that y = 1.
To summarize, when using the Pumping Lemma to prove that a language is
not regular, we are free to choose the string w and the number k. But we cannot
choose the number p or the strings x, y and z.
□
One final example.
Example 5.9 Let L be the language of strings that contain an equal number of
0s and 1s. We could show that this language is nonregular by using the same
argument we used for {0^n 1^n}. However, there is a simpler proof. The key is to
note that
L ∩ 0*1* = {0^n 1^n}.


If L were regular, then {0^n 1^n} would also be regular because the intersection of
two regular languages is always regular. But {0^n 1^n} is not regular, which implies
that L cannot be regular. □
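The identity L ∩ 0*1* = {0^n 1^n} can be confirmed by brute force over all short strings. This is a finite sanity check of mine, of course, not a proof:

```python
from itertools import product

def equal_01(s):                 # membership in L
    return s.count("0") == s.count("1")

def in_0star1star(s):            # membership in the regular language 0*1*
    return s == "0" * s.count("0") + "1" * s.count("1")

def in_0n1n(s):                  # membership in {0^n 1^n}
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "0" * n + "1" * n

# Over all strings of length < 8, the intersection is exactly {0^n 1^n}.
for length in range(8):
    for tup in product("01", repeat=length):
        s = "".join(tup)
        assert (equal_01(s) and in_0star1star(s)) == in_0n1n(s)
```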
This example shows that closure properties can also be used to show that
languages are not regular. In this case, the fact that L is nonregular became a
direct consequence of the fact that {0n 1n } is nonregular.
Note that it is not correct to say that the fact that {0^n 1^n} is a subset of L
implies that L is nonregular, because the same argument would also imply that
{0, 1}* is nonregular. In other words, the fact that a language contains a nonregular language does not imply that the larger language is nonregular. What
makes a language nonregular is not the fact that it contains certain strings:
it's the fact that it contains certain strings while excluding certain others.

Exercises
5.2.1. Use the Pumping Lemma to show that the language {0^i 1^j | i ≤ j} is not
regular.
5.2.2. Use the Pumping Lemma to show that the language {1^i #1^j #1^{i+j}} is not
regular. The alphabet is {1, #}.
5.2.3. Consider the language of strings of the form x#y#z where x, y and z are
strings of digits that, when viewed as numbers, satisfy the equation x +
y = z. For example, the string 123#47#170 is in this language because
123 + 47 = 170. The alphabet is {0, 1, . . . , 9, #}. Use the Pumping Lemma
to show that this language is not regular.
5.2.4. Let L be the language of strings that start with 0. What is wrong with the
following proof that L is not regular?
Suppose that L is regular. Let p be the pumping length. Suppose
that p = 1. Consider the string w = 01^p. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be
written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^kz ∈ L, for every k ≥ 0.
Since |xy| ≤ p = 1, it must be that |y| ≤ 1. Since y ≠ ε, it must
be that y = 0. Therefore, xz = 1^p ∉ L. This contradicts the
Pumping Lemma and shows that L is not regular.

Chapter 6
Context-Free Languages
In this chapter, we move beyond regular languages. We will learn an extension
of regular expressions called context-free grammars. We will see several examples of languages that can be described by context-free grammars. We will also
discuss algorithms for context-free languages.

6.1 Introduction

We know that the language of strings of the form 0^n 1^n is not regular. This
implies that no regular expression can describe this language. But here is a way
to describe this language:
S → 0S1
S → ε
This is called a context-free grammar (CFG). A CFG consists of variables and rules.
This grammar has one variable: S. That variable is also the start variable of the
grammar, as indicated by its placement on the left of the first rule.


Each rule specifies that the variable on the left can be replaced by the string
on the right. A grammar is used to derive strings. This is done by beginning with
the start variable and then repeatedly applying rules until all the variables are
gone. For example, this grammar can derive the string 0011 as follows:
S ⇒ 0S1 ⇒ 00S11 ⇒ 0011.
The above is called a derivation. It is a sequence of steps. In each step, a single
rule is applied to one of the variables. For example, the first two steps in this
derivation are applications of the first rule of the grammar. The last step is an
application of the second rule.
The language generated by a grammar, or the language of the grammar, is
the set of strings it can derive. It is not hard to see that the above grammar
generates the language {0^n 1^n | n ≥ 0}.
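One way to get a feel for what a grammar generates is to expand the start variable mechanically. The sketch below is my own: it applies the two rules S → 0S1 and S → ε to the leftmost S, collecting every terminal string reachable within a bounded number of steps.

```python
def derive(limit):
    """Return all terminal strings derivable from S in at most `limit` steps."""
    results, frontier = set(), {"S"}
    for _ in range(limit):
        new = set()
        for s in frontier:
            if "S" not in s:
                results.add(s)              # no variables left: a terminal string
                continue
            new.add(s.replace("S", "0S1", 1))   # rule S -> 0S1
            new.add(s.replace("S", "", 1))      # rule S -> epsilon
        frontier = new
    results |= {s for s in frontier if "S" not in s}
    return results

# Within 4 steps the grammar derives exactly the first few strings 0^n 1^n.
assert derive(4) == {"", "01", "0011", "000111"}
```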
Here is another example of a CFG. Consider the language of valid C++ identifiers. Recall that these are strings that begin with an underscore or a letter
followed by any number of underscores, letters and digits. We can think of rules
in a mechanistic way, as specifying what can be done with a variable. But it is often better to think of variables as representing concepts and of rules as defining
the meaning of those concepts.
For example, let I represent a valid identifier. Then the rule I → FR says that
an identifier consists of a first character (represented by F) followed by the rest
of the identifier (R). The rule F → _ | L says that the first character is either
an underscore or a letter. The vertical bar (|) can be viewed as an or operator
or as a way to combine multiple rules into one. For example, the above rule is
equivalent to the two rules
F → _
F → L
Continuing in this way, we get the following grammar for the language of


valid identifiers:
I → FR
F → _ | L
R → _R | LR | DR | ε
L → a | · · · | z | A | · · · | Z
D → 0 | · · · | 9
The most interesting rules in this grammar are probably the rules for the variable
R. These rules define the concept R in a recursive way. They illustrate how
repetition can be carried out in a CFG.
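Reading variables as concepts suggests a direct implementation: each variable of the identifier grammar becomes a function that checks whether a string matches it. A sketch of my own (the function names are not from the text):

```python
import string

def is_L(c): return c in string.ascii_letters       # L -> a | ... | Z
def is_D(c): return c in string.digits              # D -> 0 | ... | 9
def is_F(c): return c == "_" or is_L(c)             # F -> _ | L

def is_R(s):                                        # R -> _R | LR | DR | eps
    if s == "":
        return True
    return (s[0] == "_" or is_L(s[0]) or is_D(s[0])) and is_R(s[1:])

def is_I(s):                                        # I -> F R
    return len(s) >= 1 and is_F(s[0]) and is_R(s[1:])

assert is_I("x23") and is_I("_v2_")
assert not is_I("2x") and not is_I("")
```

Note how the recursive rule for R becomes a recursive call, mirroring the way repetition is expressed in the grammar.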

Study Questions
6.1.1. What does a CFG consist of?
6.1.2. What is it that appears on the left-hand side of a rule? On the right-hand side?
6.1.3. What is a derivation?
6.1.4. What is the purpose of the vertical bar when it appears on the right-hand side of a rule?

Exercises
6.1.5. Consider the grammar for valid identifiers we saw in this section. Show
how the strings x23 and _v2_ can be derived by this grammar. In each
case, give a derivation.
6.1.6. Give CFGs for the languages of the first three exercises of Section 2.2.

6.2 Formal Definition of CFGs

In this section, we define formally what a context-free grammar is and the language generated by a context-free grammar.
Definition 6.1 A context-free grammar (CFG) G is a 4-tuple (V, Σ, R, S) where
1. V is a finite set of variables.
2. Σ is a finite set of terminals.
3. R is a finite set of rules of the form A → w where A ∈ V and w ∈ (V ∪ Σ)*.
4. S ∈ V is the start variable.
It is assumed that V and Σ are disjoint.
Definition 6.2 Suppose that G = (V, Σ, R, S) is a CFG. Consider a string uAv where
A ∈ V and u, v ∈ (V ∪ Σ)*. If A → w is a rule, then we say that the string uwv can
be derived (in one step) from uAv and we write uAv ⇒ uwv.
Definition 6.3 Suppose that G = (V, Σ, R, S) is a CFG. If u, v ∈ (V ∪ Σ)*, then v
can be derived from u, or G derives v from u, if v = u or if there is a sequence
u_1, u_2, . . . , u_k ∈ (V ∪ Σ)*, for some k ≥ 0, such that
u ⇒ u_1 ⇒ u_2 ⇒ · · · ⇒ u_k ⇒ v.
In this case, we write u ⇒* v.


Definition 6.4 Suppose that G = (V, Σ, R, S) is a CFG. Then the language generated by G (or the language of G) is the following set:
L(G) = {w ∈ Σ* | S ⇒* w}.
Definition 6.5 A language is context-free if it is generated by some CFG.

6.3 More Examples

In this section, we give some additional examples of CFGs.


Example 6.6 In all of the following examples, the alphabet is {0, 1}.
1. The language of all strings is generated by the following grammar:
S → 0S | 1S | ε
2. The language of strings that start with 1 is generated by the following
grammar:
S → 1T
T → 0T | 1T | ε
3. The language of strings that start and end with the same symbol is generated by the following grammar:
S → 0T0 | 1T1 | 0 | 1
T → 0T | 1T | ε
4. The language of strings that contain the substring 001 is generated by the
following grammar:
S → T001T
T → 0T | 1T | ε
□
The languages in the above example are all regular. In fact, we can show
that all regular languages are context-free.


Theorem 6.7 All regular languages are context-free.


Proof The proof is constructive: we give an algorithm that converts any regular
expression R into an equivalent CFG G.
The algorithm has six cases, based on the form of R. If R = a, then G contains
a single rule: S → a. If R = ε, then G again contains a single rule: S → ε. If
R = ∅, then G simply contains no rules.
The last three cases are when R = R1 ∪ R2, R = R1R2 and R = (R1)*. These
cases are recursive. First, convert R1 and R2 into grammars G1 and G2, making
sure that the two grammars have no variables in common. (Rename variables if
needed.) Let S1 and S2 be the start variables of G1 and G2, respectively. Then,
for the case R = R1 ∪ R2, G contains all the variables and rules of G1 and G2
plus a new start variable S and the following rule: S → S1 | S2. The other cases
are similar except that the extra rule is replaced by S → S1S2 and S → S1S | ε,
respectively.
□
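The construction in this proof can be sketched directly. Below, regular expressions are encoded as nested tuples (an encoding of my own), to_cfg follows the six cases of the proof, and a small bounded search checks which strings the resulting grammar derives.

```python
from itertools import count

fresh = count()

def to_cfg(r):
    """Convert a regular expression to (start_variable, rules). Expressions
    are encoded as "empty", "eps", ("sym", a), ("star", r1),
    ("union", r1, r2) or ("concat", r1, r2); a rule is a pair
    (variable, list of right-hand-side symbols)."""
    S = f"S{next(fresh)}"                     # a fresh start variable
    if r == "empty":
        return S, []                          # no rules at all
    if r == "eps":
        return S, [(S, [])]                   # S -> epsilon
    if r[0] == "sym":
        return S, [(S, [r[1]])]               # S -> a
    if r[0] == "star":
        S1, rules1 = to_cfg(r[1])
        return S, rules1 + [(S, [S1, S]), (S, [])]        # S -> S1 S | eps
    S1, rules1 = to_cfg(r[1])
    S2, rules2 = to_cfg(r[2])
    if r[0] == "union":
        return S, rules1 + rules2 + [(S, [S1]), (S, [S2])]
    return S, rules1 + rules2 + [(S, [S1, S2])]           # concatenation

def derives(start, rules, target, cap=10):
    """Search for a derivation of `target`, exploring sentential forms of
    length at most `cap`. Only suitable for small examples."""
    variables = {v for v, _ in rules}
    seen, stack = {(start,)}, [(start,)]
    while stack:
        form = stack.pop()
        idx = next((i for i, x in enumerate(form) if x in variables), None)
        if idx is None:
            if "".join(form) == target:
                return True
            continue
        for v, rhs in rules:
            if v == form[idx]:
                new = form[:idx] + tuple(rhs) + form[idx + 1:]
                if len(new) <= cap and new not in seen:
                    seen.add(new)
                    stack.append(new)
    return False

# (0)*(1), i.e. the language of strings of 0s followed by a single 1:
start, rules = to_cfg(("concat", ("star", ("sym", "0")), ("sym", "1")))
assert derives(start, rules, "001") and not derives(start, rules, "10")
```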
Example 6.8 Over the alphabet of parentheses, {(, )}, consider the language of
strings that are properly nested. Given our background in computer science and
mathematics, we should all have a pretty clear intuitive understanding of what
these strings are.¹ But few of us have probably thought of a precise definition,
or of the need for one. But a precise definition is needed, to build compilers, for
example.
A precise definition can be obtained as follows. Suppose that w is a string of
properly nested parentheses. Then the first symbol of w must be a left parenthesis that matches a right parenthesis that occurs later in the string. Between
these two matching parentheses, all the parentheses should be properly nested
(among themselves). And to the right of the right parenthesis that matches
the first parenthesis of w, all the parentheses should also be properly nested. In
other words, w should be of the form (u)v where u and v are strings of properly
nested parentheses. Note that either u or v may be empty.
¹To paraphrase one of the most famous phrases in the history of the U.S. Supreme Court, "We
know one when we see it."


The above is the core of a recursive definition. We also need a base case. That
is, we need to define the shortest possible strings of properly nested parentheses.
Let's say that it's the empty string. (An alternative would be the string ().)
Putting it all together, we get that a string of properly nested parentheses
is either ε or a string of the form (u)v where u and v are strings of properly
nested parentheses.
A CFG that derives these strings is easy to obtain:
S → (S)S | ε
It should be clear that this grammar is correct because it simply paraphrases the
definition. (A formal proof would proceed by induction on the length of a string,
for one direction, and on the length of a derivation, for the other.)
□
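The recursive definition translates directly into a checker: a string is properly nested iff it is empty or has the form (u)v with u and v properly nested. In the sketch below (my own), the right parenthesis matching the first left one is found by depth counting.

```python
def nested(s):
    if s == "":
        return True                      # base case: the empty string
    if s[0] != "(":
        return False                     # must start with a left parenthesis
    depth = 0
    for i, c in enumerate(s):
        depth += 1 if c == "(" else -1
        if depth == 0:                   # found the ) matching the first (
            return nested(s[1:i]) and nested(s[i+1:])   # check u, then v
        if depth < 0:
            return False
    return False                         # the first ( is never matched

assert nested("") and nested("()()") and nested("(()())")
assert not nested(")(") and not nested("(()")
```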

Exercises
6.3.1. Give CFGs for the following languages. In all cases, the alphabet is
{0, 1}.
a) The language {0^n 1 0^n | n ≥ 0}.
b) The language of strings of the form ww^R.
c) The language {0^n 1^{2n} | n ≥ 0}.
d) The language {0^i 1^j | i ≤ j}.
6.3.2. Give a CFG for the language {1^i #1^j #1^{i+j}}. The alphabet is {1, #}.
6.3.3. Consider the language of properly nested strings of parentheses, square
brackets ([, ]) and braces ({, }). Give a precise definition and a CFG for
this language.
6.3.4. Suppose that we no longer consider that ε is a string of properly nested
parentheses. In other words, we now consider that the string () is the



shortest possible string of properly nested parentheses. Give a revised
definition and CFG for the language of properly nested parentheses.

6.4 Ambiguity and Parse Trees

It is easy to show that the language of properly nested parentheses is not regular. Therefore, the last example of the previous section shows that CFGs can
express an aspect of programming languages that cannot be expressed by regular expressions. In this section, we consider another such example: arithmetic
expressions. This will lead us to consider parse trees and the concept of ambiguity.
Consider the language of valid arithmetic expressions that consist of the
operand a, the operators + and ×, as well as parentheses. For example, a+a×a
and (a+a)×a are valid expressions. As these examples show, expressions
do not have to be fully parenthesized.
This language can be defined recursively as follows: a is an expression and
if e1 and e2 are expressions, then e1+e2, e1×e2 and (e1) are expressions.
As was the case for strings of properly nested parentheses, a CFG can be
obtained directly from this recursive definition:
E → E+E | E×E | (E) | a
For example, in this grammar, the string (a+a)×a can be derived as follows:
E ⇒ E×E ⇒ (E)×E ⇒ (E+E)×E ⇒ · · · ⇒ (a+a)×a
A derivation can be represented by a parse tree in which every node is either
a variable or a terminal. The leaves of the tree are labeled by terminals. The
non-leaf nodes are labeled by variables. If a node is labeled by variable A, then
its children are the symbols on the right-hand side of one of the rules for A.
For example, the above derivation corresponds to the parse tree shown in
Figure 6.1. This parse tree is useful because it helps us visualize the derivation.

[Figure 6.1: A parse tree for the string (a+a)×a]


But in the case of arithmetic expressions, the parse tree also indicates how the
expression should be evaluated: once the values of the operands are known, the
expression can be evaluated by moving up the tree from the leaves. Note that
interpreted in this way, the parse tree of Figure 6.1 does correspond to the correct
evaluation of the expression (a+a)×a. In addition, this parse tree is unique: in
this grammar, there is only one way in which the expression (a+a)×a can be
derived (and evaluated).
In contrast, consider the expression a+a×a. This expression has two different parse trees, as shown in Figure 6.2. Each of these parse trees corresponds to
a different way of deriving, and evaluating, the expression. But only the parse
tree on the right corresponds to the correct way of evaluating this expression
because it follows the rule that multiplication has precedence over addition.
In general, a parse tree is viewed as assigning meaning to a string. When
a string has more than one parse tree in a particular grammar, then the grammar is said to be ambiguous. The above grammar for arithmetic expressions is
ambiguous.

[Figure 6.2: Parse trees for the string a+a×a]
In practical applications, we typically want unambiguous grammars. In the
case of arithmetic expressions, an unambiguous CFG can be designed by creating
levels of expressions: expressions, terms, factors, operands. Here's one way of
doing this:
E → E+T | T
T → T×F | F
F → (E) | a
In this grammar, the string a+a×a has only one parse tree, the one shown in
Figure 6.3. And this parse tree does correspond to the correct evaluation of the
expression.
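The evaluation order encoded by this parse tree can be made concrete with a small recursive-descent evaluator that follows the unambiguous grammar. This is only an illustrative sketch, not part of the notes: it writes `*` for the × symbol, handles the left-recursive rules by looping (which also yields left associativity), and takes the value of the terminal a as a parameter.

```python
# Evaluator for the unambiguous grammar
#   E -> E+T | T,  T -> T*F | F,  F -> (E) | a
# The left-recursive rules are handled by loops: an E is a T followed
# by zero or more "+T" parts, which gives left associativity.

def evaluate(s, a_value):
    pos = 0  # current position in the input string

    def peek():
        return s[pos] if pos < len(s) else None

    def eat(c):
        nonlocal pos
        if peek() != c:
            raise SyntaxError(f"expected {c!r} at position {pos}")
        pos += 1

    def parse_E():                 # E -> T (+ T)*
        value = parse_T()
        while peek() == '+':
            eat('+')
            value += parse_T()
        return value

    def parse_T():                 # T -> F (* F)*
        value = parse_F()
        while peek() == '*':
            eat('*')
            value *= parse_F()
        return value

    def parse_F():                 # F -> (E) | a
        if peek() == '(':
            eat('(')
            value = parse_E()
            eat(')')
            return value
        eat('a')
        return a_value

    result = parse_E()
    if pos != len(s):
        raise SyntaxError("trailing input")
    return result

print(evaluate("a+a*a", 2))    # 6: multiplication binds tighter
print(evaluate("(a+a)*a", 2))  # 8: parentheses override precedence
```

Because the parser mirrors the unambiguous grammar, each input has exactly one evaluation order, matching the unique parse tree.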
It is not hard to see that this last grammar is unambiguous and that it correctly enforces precedence rules as well as left associativity. Note, however, that
some context-free languages are inherently ambiguous, in the sense that they can
only be generated by ambiguous CFGs.


Figure 6.3: Parse tree for the string a+a a

Study Questions
6.4.1. What is a parse tree?
6.4.2. What does it mean for a grammar to be ambiguous?
6.4.3. What does it mean for a CFL to be inherently ambiguous?

Exercises
6.4.4. Consider the expression a×a+a. Give two different parse trees for this
string in the first grammar of this section. Then give the unique parse tree
for this string in the second grammar of this section.


Chapter 7
Non Context-Free Languages
Earlier in these notes, we showed that some languages are not regular. We now
know that some of these nonregular languages, such as {0^n 1^n | n ≥ 0}, are
context-free. In this chapter, we will show that some languages are not
even context-free. As in the case of regular languages, our main tool will be a
Pumping Lemma.

7.1 The Basic Idea

In Chapter 5, we proved that several languages are not regular. Some proofs
were done directly while others used the Pumping Lemma of Section 5.2. Either
way, the key idea was the same: when a DFA reads a string that is long enough,
then the sequence of states that the DFA goes through contains a repetition, that
is, a state that occurs more than once.
It makes sense to try something similar with CFLs. When a CFG derives a
long string, it likely needs to use a long derivation. Since the number of variables
is finite, if the derivation is long enough, some variable will have to occur more
than once in the derivation.


Let's make this more precise. Let G be a CFG and let b be the maximum
number of terminals on the right-hand side of any rule. Any derivation of length k
derives a string of length at most bk. Therefore, if |w| > bk, then any derivation
of w is of length greater than k.
Now, a derivation of length r involves r + 1 strings and all these strings,
except for the last one, contain at least one variable. Therefore, a derivation of
length r contains at least r occurrences of variables. If r > |V |, then some of the
variables will be repeated.
By combining the conclusions of the last two paragraphs, we get that if
|w| > b|V|, then any derivation of w is of length greater than |V| and therefore
guaranteed to contain a repeated variable. Such a derivation has the following
form:

S ⇒* u1 A u2 ⇒* v1 A v2 ⇒* w.
There are two ways in which this repetition can occur. The first type of
repetition is when the second A is derived from the first one. The second type of
repetition is when the second A is derived from either u1 or u2 .
The first type of repetition can be graphically represented by the first tree
shown in Figure 7.1, where x1 y1 A y2 x2 = v1 A v2 and uvxyz = w. The second
tree in that figure represents the same derivation but without showing the
intermediate strings. (Note that, strictly speaking, these trees are not parse trees
but high-level outlines of parse trees.)
In the case of regular languages, the repetition of a state in the reading of
w allowed us to pump w. More precisely, w could be written as xyz such that
for every k ≥ 0, xy^k z was accepted by the DFA. Does the type of repetition
illustrated in Figure 7.1 allow us to pump w? The answer is yes. The portion
of the parse tree that derives vAy from the first A can be omitted or repeated
as illustrated in Figure 7.2. In fact, for every k ≥ 0, the string uv^k x y^k z can be
derived by the grammar.
Can this form of pumping be used to prove that certain languages are not
context-free? Let's make things more concrete by focusing on a specific language.

Figure 7.1: The first type of repetition

Figure 7.2: Pumping in the first type of repetition

We know that {0^n 1^n | n ≥ 0} is context-free. A grammar for this language
generates the 0s and 1s in pairs, starting from the outside and progressing
towards the middle of the string:

S → 0S1 | ε
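Read as a generator, the grammar produces 0^n 1^n by applying the rule S → 0S1 n times and then the ε rule. The few lines below are only an illustration of this reading, with the ε rule represented by erasing the S:

```python
# Apply S -> 0S1 exactly n times, then S -> ε (erase the remaining S).
def derive(n):
    s = 'S'
    for _ in range(n):
        s = s.replace('S', '0S1')   # one application of S -> 0S1
    return s.replace('S', '')       # the ε rule

print([derive(n) for n in range(4)])  # ['', '01', '0011', '000111']
```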
Now consider the language L = {a^n b^n c^n | n ≥ 0}. One idea would be a rule such
as S → aSbSc but that would mix the different kinds of letters. Another idea
would be S → a T1 b T2 c where T1 generates {a^n b^n | n ≥ 0} and T2 generates
{b^n c^n | n ≥ 0}. But that would not ensure equal numbers of a's and c's. So L
seems like a good candidate for a non-CFL.
Suppose, for the sake of contradiction, that L is context-free and that G is
a CFG that generates L. Let w = a^n b^n c^n where n = b|V| + 1. By the previous
discussion, any derivation of w in G contains a repeated variable. Assume that
the repetition is of the type illustrated in Figure 7.1. Then uv^k x y^k z ∈ L for every
k ≥ 0. There are two cases to consider.
First, suppose that either v or y contains more than one type of symbol. Then
uv^2 x y^2 z ∉ L because that string is not even in a*b*c*.
Second, suppose that v and y each contain only one type of symbol. Then,
assuming that at least one of v or y is not empty, uv^2 x y^2 z contains additional
occurrences of at least one type of symbol but not of all three types. Therefore,
uv^2 x y^2 z ∉ L.
In both cases, we have that uv^2 x y^2 z ∉ L. This is a contradiction and proves
that L is not context-free.
The above proof that L is not context-free relies on two (unproven) assumptions. First, we assumed that the repetition was of the first type. Second, we
assumed that at least one of v or y was not empty.
Let's first deal with the second assumption. Suppose that v and y are both
empty. This implies that the string derived in the first tree of Figure 7.2 is equal
to w. Therefore, this tree represents a derivation of w that's smaller than the
original one (the second tree of Figure 7.1). We can avoid this situation by


simply choosing to focus on the smallest possible derivation of w. This ensures
that v and y cannot both be empty.
We will deal with the other assumption in the next section.

Exercises
7.1.1. Under the assumption that repetitions in derivations are always of the
type shown in Figure 7.1, show that the language {a^i b^j c^k | 0 ≤ i ≤ j ≤ k}
is not context-free.

7.2 A Pumping Lemma

In the previous section, we showed that the language L = {a^n b^n c^n | n ≥ 0} is
not context-free under the assumption that the repetition was of the first type.
Recall that the second type of repetition is when the second A is derived from
either u1 or u2 in a derivation of the form

S ⇒* u1 A u2 ⇒* v1 A v2 ⇒* w.
The case where the second A is derived from u2 is illustrated by the first tree
shown in Figure 7.3, where x1 y1 z1 A z2 = v1 A v2 and uvxyz = w. The second tree
in that figure represents the same derivation but without showing the
intermediate strings.
This type of repetition may not allow us to pump w. But it allows us to do
something: we can switch the subtrees rooted at A, as illustrated in Figure 7.4.
This gives us a derivation of the string uyxvz.
Recalling the proof that L is not context-free, if v contains an a and y contains
a b, then the string uyxvz ∉ L because that string is not even in a*b*c*. This
is a contradiction. But if v = y, then switching those two strings would give
uyxvz = w ∈ L. No contradiction. This is bad news.

Figure 7.3: The second type of repetition

Figure 7.4: Switching in the second type of repetition

One way to salvage all this is to show that if w is long enough, then every
derivation of w will always contain a repetition of type 1. To prove this, let's try
a change of perspective: let's examine the relationship between the length of w
and the dimensions of a parse tree for w.
Redefine b to be the maximum number of symbols (instead of terminals) on
the right-hand side of any rule. Then any parse tree of height k derives a string
of length at most b^k. Therefore, if |w| > b^k, then the height of any parse tree for
w is greater than k.
In a parse tree of height h, at least one path from the root to a leaf is of length
h. On that path, there are h variables and one terminal. If h > |V|, then some
variable will be repeated on this path. And this is what we want: it's a repetition
of type 1.
Therefore, if |w| > b^|V|, then the height of any parse tree for w is greater than
|V| and the parse tree is guaranteed to contain a repetition of type 1. As we saw
in the last section, this implies that w can be pumped.
This was the last piece of the puzzle. We now have a complete proof of the
following Pumping Lemma:¹
Theorem 7.1 (First Pumping Lemma) If L is a CFL, then there is a number p > 0,
called the pumping length, such that if w ∈ L and |w| ≥ p, then w can be written
as uvxyz where
1. vy ≠ ε.
2. uv^k x y^k z ∈ L, for every k ≥ 0.
Here's a clean write-up of the proof:

¹ This pumping lemma is being called the First Pumping Lemma because we will prove a
second, stronger version in the next section. When people refer to the Pumping Lemma for
CFLs, it is usually this stronger version that they have in mind.


Proof Let L be a CFL and let G be a CFG that generates L. Let b be the maximum
number of symbols on the right-hand side of any rule of G. Let p = b^|V| + 1. Now
suppose that w is any string in L with |w| ≥ p.
Let τ be one of the smallest possible parse trees for w. Because |w| > b^|V|,
the height of τ must be greater than |V|. This implies that τ contains a path of
length greater than |V| and on this path, a variable must repeat. Therefore, τ is
of the form shown in Figure 7.1.
Define u, v, x, y and z as suggested by that figure. Then w = uvxyz and
by omitting or repeating the portion of the tree that derives vAy from the first
A, a parse tree can be constructed for every string of the form uv^k x y^k z ∈ L
with k ≥ 0. This is illustrated in Figure 7.2. In addition, vy ≠ ε, otherwise the
first parse tree of Figure 7.2 would be a parse tree for w that's smaller than τ,
contradicting the way we chose τ. □
As an example, let's use this pumping lemma to show, once again, that the
language {a^n b^n c^n | n ≥ 0} is not context-free.
Example 7.2 Suppose that L = {a^n b^n c^n | n ≥ 0} is context-free. Let p be the
pumping length. Consider the string w = a^p b^p c^p. Clearly, w ∈ L and |w| ≥ p.
Therefore, according to the Pumping Lemma, w can be written as uvxyz where
1. vy ≠ ε.
2. uv^k x y^k z ∈ L, for every k ≥ 0.
There are two cases to consider. First, suppose that either v or y contains
more than one type of symbol. Then uv^2 x y^2 z ∉ L because that string is not even
in a*b*c*.
Second, suppose that v and y each contain only one type of symbol. Then
uv^2 x y^2 z contains additional occurrences of at least one type of symbol but not of
all three types. Therefore, uv^2 x y^2 z ∉ L.
In both cases, we have that uv^2 x y^2 z ∉ L. This is a contradiction and proves
that L is not context-free. □
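The case analysis in Example 7.2 can be sanity-checked by brute force for a small value of p. The sketch below is only a finite check for one particular string, not a proof: it tries every decomposition w = uvxyz with vy ≠ ε and verifies that pumping with k = 2 always produces a string outside the language.

```python
# Brute-force check: for w = a^p b^p c^p (small p), no decomposition
# uvxyz with vy nonempty survives pumping with k = 2.

def in_L(s):
    # membership in {a^n b^n c^n | n >= 0}
    n = len(s) // 3
    return s == 'a' * n + 'b' * n + 'c' * n

def survives_pumping(w):
    n = len(w)
    # choose 0 <= i <= j <= k <= l <= n so that
    # u = w[:i], v = w[i:j], x = w[j:k], y = w[k:l], z = w[l:]
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                for l in range(k, n + 1):
                    v, y = w[i:j], w[k:l]
                    if v + y == '':
                        continue
                    if in_L(w[:i] + v*2 + w[j:k] + y*2 + w[l:]):
                        return True   # this decomposition can be pumped
    return False

p = 3
print(survives_pumping('a'*p + 'b'*p + 'c'*p))  # False: no decomposition pumps
```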


Exercises
7.2.1. Use the Pumping Lemma to show that the language {0^n 1^n 0^n | n ≥ 0} is
not context-free.
7.2.2. Use the Pumping Lemma to show that the language {a^i b^j c^k | i ≤ j ≤ k}
is not context-free. The alphabet is {a, b, c}.
7.2.3. Use the Pumping Lemma to show that the language {1^i #1^j #1^(i+j) | i ≤ j}
is not context-free. The alphabet is {1, #}.

7.3 A Stronger Pumping Lemma

In the previous section, we proved a pumping lemma for CFLs. As shown in the
example and exercises of that section, this pumping lemma is useful. But it has
limitations.
For example, let L be the language of strings of the form ww. We know
that the language of strings of the form ww^R is context-free. The idea is to
generate the symbols of ww^R starting from the outside and progressing towards
the middle:

S → 0S0 | 1S1 | ε

But this idea does not work with strings of the form ww because the first symbols
of the string must match symbols located in its middle.
So let's try to use the Pumping Lemma of the previous section to show that
L is not context-free. As usual, we assume that L is context-free and let p be the
pumping length. Let w = 0^p 1 0^p 1, the same string we used earlier in these notes
to show that L is not regular. Clearly, w ∈ L and |w| ≥ p. Therefore, according
to the Pumping Lemma, w can be written as uvxyz where
1. vy ≠ ε.


2. uv^k x y^k z ∈ L, for every k ≥ 0.
And now we have a problem: it turns out that w can be pumped. If u = x =
z = ε and v = y = 0^p 1, then uv^k x y^k z = (0^p 1)^k (0^p 1)^k ∈ L, for every k ≥ 0. So we
can't get a contradiction.
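This failed attempt is easy to confirm directly. The lines below (illustrative only) check that with u = x = z = ε and v = y = 0^p 1, every pumped string is again of the form ww:

```python
# With u = x = z = ε and v = y = 0^p 1, every pumped string
# (0^p 1)^k (0^p 1)^k is again of the form ww.

def is_ww(s):
    h = len(s) // 2
    return len(s) % 2 == 0 and s[:h] == s[h:]

p = 4
v = y = '0'*p + '1'
for k in range(4):
    assert is_ww(v*k + y*k)   # u = x = z = ε
print("w = 0^p 1 0^p 1 can be pumped")
```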
One way to solve this problem is to notice that the Pumping Lemma for
regular languages said something that has no equivalent in our First Pumping
Lemma for CFLs: it said that |xy| ≤ p. This meant that the pumped string y
was located near the beginning of w.
Can we say something similar about v and y in the case of CFLs? Recall that
v and y are defined in terms of a parse tree, as shown in Figure 7.1. It's not clear
how we could show that v and y are located towards either side of the parse
tree...
But we can show that v and y are close to each other. Here's how. Recall
that the repeated variable A occurred on a path π of length greater than |V|.
Instead of focusing on any repeated variable that occurs on π, let's focus instead
on a repetition that occurs among the bottom |V| + 1 variables of π. We know
that there must be one.
Next, consider the subtree τ′ rooted at the top A. What is the height of τ′?
Let π1 be the portion of π that leads from the top S to the top A. Let π2 be
the portion of π that leads from the top A to a leaf. The length of π2 is at most
|V| + 1. If π2 was the longest path in τ′, then the height of τ′ would be at most
|V| + 1. And this would imply that |vxy| ≤ b^(|V|+1).
How can we ensure that π2 is the longest path in τ′? The answer is, by
choosing π to be one of the longest possible paths in the whole tree, not just any
path of length greater than |V|. Here's why. Suppose that we choose π that way
and that τ′ contains some other path π′ that is longer than π2. Then the path
π1 π′ would be longer than π1 π2 = π. That would be a contradiction.
We now have all we need to prove the following stronger version of the
Pumping Lemma for CFLs:
Theorem 7.3 (Pumping Lemma) If L is a CFL, then there is a number p > 0,


called the pumping length, such that if w ∈ L and |w| ≥ p, then w can be written
as uvxyz where
1. |vxy| ≤ p.
2. vy ≠ ε.
3. uv^k x y^k z ∈ L, for every k ≥ 0.
Here's a clean write-up of the entire proof:
Proof Let L be a CFL and let G be a CFG that generates L. Let b be the
maximum number of symbols on the right-hand side of any rule of G. Let p =
max(b^|V| + 1, b^(|V|+1)). (If b ≥ 2, then p = b^(|V|+1).) Now suppose that w is any
string in L with |w| ≥ p.
Let τ be one of the smallest possible parse trees for w. Because |w| > b^|V|, the
height of τ must be greater than |V|. Let π be one of the longest possible paths
in τ. Then the length of π is greater than |V|, which implies that π contains
a repetition among its bottom |V| + 1 variables. Let A be one such repeated
variable. Then τ is of the form shown in Figure 7.1. In addition, the height of
the subtree τ′ rooted at the top A is at most |V| + 1.
Define u, v, x, y and z as suggested by Figure 7.1. Then w = uvxyz and
by omitting or repeating the portion of the tree that derives vAy from the first
A, a parse tree can be constructed for every string of the form uv^k x y^k z ∈ L
with k ≥ 0. This is illustrated in Figure 7.2. In addition, vy ≠ ε, otherwise the
first parse tree of Figure 7.2 would be a parse tree for w that's smaller than τ,
contradicting the way we chose τ. Finally, the fact that the height of τ′ is at most
|V| + 1 implies that |vxy| ≤ b^(|V|+1) ≤ p. □
Let's use this pumping lemma to show that the language of strings of the
form ww is not context-free.


Example 7.4 Let L be the language of strings of the form ww. Suppose that L is
context-free. Let p be the pumping length. Consider the string w = 0^p 1^p 0^p 1^p.
Clearly, w ∈ L and |w| ≥ p. Therefore, according to the Pumping Lemma, w can
be written as uvxyz where
1. |vxy| ≤ p.
2. vy ≠ ε.
3. uv^k x y^k z ∈ L, for every k ≥ 0.
The string w consists of four blocks of length p. Since |vxy| ≤ p, v and y
can touch at most two blocks. Suppose that v and y are both contained within
a single block, the first one, for example. Then uv^2 x y^2 z = 0^(p+i) 1^p 0^p 1^p, where
i ≥ 1. This string is clearly not in L. The same is true for the other blocks.
Now suppose that v and y touch two consecutive blocks, the first two, for
example. Then uv^0 x y^0 z = 0^i 1^j 0^p 1^p where i ≤ p, j ≤ p and i + j < 2p. Once
again, this string is clearly not in L. The same is true for the other pairs of blocks.
Therefore, in all cases, we have that w cannot be pumped. This contradicts
the Pumping Lemma and proves that L is not context-free. □
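As with Example 7.2, the argument can be sanity-checked by brute force for a small p. The sketch below is only a finite check, not a proof: it enumerates every decomposition with |vxy| ≤ p and vy ≠ ε and confirms that pumping with k = 2 never yields a string of the form ww, while the earlier troublesome string 0101 (given a large enough window) can be pumped.

```python
def is_ww(s):
    h = len(s) // 2
    return len(s) % 2 == 0 and s[:h] == s[h:]

def pumpable(w, p):
    # Try every uvxyz with |vxy| <= p and vy nonempty; pump with k = 2.
    n = len(w)
    for i in range(n + 1):                   # v starts at position i
        hi = min(i + p, n)                   # vxy must end by position i + p
        for j in range(i, hi + 1):           # v = w[i:j]
            for k in range(j, hi + 1):       # x = w[j:k]
                for l in range(k, hi + 1):   # y = w[k:l]
                    v, y = w[i:j], w[k:l]
                    if v + y == '':
                        continue
                    if is_ww(w[:i] + v*2 + w[j:k] + y*2 + w[l:]):
                        return True
    return False

p = 3
print(pumpable('0'*p + '1'*p + '0'*p + '1'*p, p))  # False: w cannot be pumped
print(pumpable('0101', 4))                         # True: e.g. v = y = '01'
```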

Exercises
7.3.1. Consider the language of strings of the form ww^R that also contain equal
numbers of 0s and 1s. For example, the string 0110 is in this language
but 011110 is not. Use the Pumping Lemma to show that this language
is not context-free.
7.3.2. Use the Pumping Lemma to show that the language {1^i #1^j #1^(ij) | i, j ≥ 0}
is not context-free. The alphabet is {1, #}. (Recall that Exercises 5.2.2 and
6.3.2 asked you to show that the similar language for unary addition is
context-free but not regular. Therefore, when the problems are presented
in this way, unary addition is context-free but not regular, while unary
multiplication is not even context-free.)


Chapter 8
More on Context-Free Languages
8.1 Closure Properties

The constructions in the proof of Theorem 6.7 imply the following:


Theorem 8.1 The class of context-free languages is closed under union, concatenation and the star operation.
However:
Theorem 8.2 The class of context-free languages is not closed under intersection.
Proof Consider the language L1 = {a^n b^n c^i | i, n ≥ 0}. This language is
context-free:
S → TU
T → aTb | ε
U → cU | ε


Now, consider the language L2 = {a^i b^n c^n | i, n ≥ 0}. This language too is
context-free:
S → UT
T → bTc | ε
U → aU | ε
But L1 ∩ L2 = {a^n b^n c^n | n ≥ 0} and we now know that this language is not
context-free. Therefore, the class of CFLs is not closed under intersection. □
Corollary 8.3 The class of context-free languages is not closed under
complementation.
Proof If it was, then it would be closed under intersection because of De
Morgan's Law: A ∩ B = ¬(¬A ∪ ¬B). □

What about a concrete example of a CFL whose complement is not contextfree? Heres one. We know that {an bn cn | n 0} is not context-free. The
complement of this language is

a b c {ai b j ck | i 6= j or j 6= k}.
It's not hard to show that this language is context-free (see the exercises).
Here's another example. We know that the language of strings of the form
ww is not context-free. The complement of this language is

{w ∈ {0,1}* | |w| is odd} ∪ {xy | x, y ∈ {0,1}*, |x| = |y| but x ≠ y}

This language is also context-free (see the exercises).


Exercises
8.1.1. Give a CFG for the language {a^i b^j c^k | i ≠ j or j ≠ k}. Show that this
implies that the complement of {a^n b^n c^n | n ≥ 0} is context-free.
8.1.2. Give a CFG for the language {xy | x, y ∈ {0,1}*, |x| = |y| but x ≠ y}.
Show that this implies that the complement of the language of strings of
the form ww is context-free.
8.1.3. Give a CFG for the language {x # y | x, y ∈ {0,1}* and x ≠ y}. Use this
to construct another example of a CFL whose complement is not context-free.

8.2 Pushdown Automata

The class of CFLs is not closed under intersection but it is closed under
intersection with a regular language:
Theorem 8.4 If L is context-free and B is regular, then L ∩ B is context-free.
It's unclear how this could be proven using CFGs. How can a CFG be combined
with a regular expression to produce another CFG?
Instead, recall that for regular languages, closure under intersection was
proved by using DFAs and the pair construction. To carry out a similar proof
for the intersection of a CFL and a regular language, we would need to identify
a class of automata that corresponds exactly to CFLs.
The key is to use nondeterminism. Suppose that G is a CFG with start
variable S. Figure 8.1 shows a nondeterministic algorithm that simulates G. The
algorithm uses a stack to store variables and terminals.


push S on the stack
while (the stack is not empty)
    if (the top of stack is a variable)
        let A be that variable
        pop A from the stack
        nondeterministically choose a rule for A
        push the right-hand side of the rule on the
            stack (with the left end at the top)
    else // the top of stack is a terminal
        if (end of input) reject
        read next char c
        if (the top of stack equals c)
            pop the stack
        else
            reject
if (end of input)
    accept
else
    reject

Figure 8.1: An algorithm that simulates a CFG


push E on the stack
while (the stack is not empty)
    if (the top of stack is E)
        replace E by either E+T or T (in the case of
            E+T, put E at the top of the stack)
    else if (the top of stack is T)
        replace T by either T×F or F
    else if (the top of stack is F)
        replace F by either (E) or a
    else // the top of stack is a, +, ×, ( or )
        if (end of input) reject
        read next char c
        if (c equals the top of stack)
            pop the stack
        else
            reject
if (end of input)
    accept
else
    reject

Figure 8.2: An algorithm that simulates a CFG
For example, consider the following grammar:
E → E+T | T
T → T×F | F
F → (E) | a
Figure 8.2 shows the algorithm that simulates this grammar.
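The nondeterministic stack algorithm of Figure 8.1 can also be simulated deterministically, by backtracking over the nondeterministic rule choices. The sketch below does this for the grammar S → 0S0 | 1S1 | ε of strings of the form ww^R, a simpler grammar chosen here so that the search clearly terminates; the grammar encoding and the driver are illustrative choices, not part of the notes.

```python
# Backtracking simulation of the Figure 8.1 algorithm for the grammar
#   S -> 0S0 | 1S1 | ε
# The stack is a string with its top at index 0; a visited set on
# (stack, input position) keeps the search finite.

RULES = {'S': ['0S0', '1S1', '']}   # ε is the empty string

def accepts(w):
    seen = set()

    def run(stack, pos):
        if (stack, pos) in seen:
            return False
        seen.add((stack, pos))
        if not stack:
            return pos == len(w)          # accept iff all input consumed
        top, rest = stack[0], stack[1:]
        if top in RULES:                  # variable: try every rule
            return any(run(rhs + rest, pos) for rhs in RULES[top])
        # terminal: must match the next input symbol
        return pos < len(w) and w[pos] == top and run(rest, pos + 1)

    return run('S', 0)

for s in ['', '0110', '010010', '0111']:
    print(s, accepts(s))   # True, True, True, False
```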
This kind of algorithm can be described as a single-scan, nondeterministic
stack algorithm. We won't do it in these notes but these algorithms can be
formalized as a type of automaton called a pushdown automaton (PDA). PDAs
are essentially NFAs augmented with a stack.
It can be shown that PDAs are equivalent to CFGs: every CFG can be
simulated by a PDA, and every PDA has an equivalent CFG. Therefore, PDAs
characterize CFLs just like DFAs and NFAs characterize regular languages.
Theorem 8.5 A language is context-free if and only if it is recognized by some PDA.
In addition, as was the case with regular languages, the proof of this theorem
is constructive: there are algorithms for converting CFGs into PDAs, and vice
versa.
We can now sketch the proof that the class of CFLs is closed under intersection
with a regular language. Suppose that L is context-free and B is regular. Let
P be a PDA for L and let M be a DFA for B. The pair construction can be used
to combine the NFA of P with M. This results in a new PDA that recognizes
L ∩ B.
Note that the above proof sketch does not work for the intersection of two
CFLs. The problem is that the resulting machine would have two stacks. Some
of the exercises of this section ask you to show (informally) that adding a second
stack to PDAs produces a class of automata that can recognize languages that
are not context-free.

Study Questions
8.2.1. What is a PDA?

Exercises
8.2.2. In the style of Figure 8.2, give a single-scan, nondeterministic stack algorithm for the second grammar of Example 6.6.


8.2.3. Describe a single-scan stack algorithm that recognizes the language
{0^n 1^n | n ≥ 0}. (Note: nondeterminism is not needed for this language.)
8.2.4. Describe a single-scan, nondeterministic stack algorithm that recognizes
the language of strings of the form ww^R.
8.2.5. Describe a two-stack, single-scan algorithm that recognizes the language
{a^n b^n c^n | n ≥ 0}. (Note: nondeterminism is not needed for this
language.)
8.2.6. Describe a two-stack, single-scan, nondeterministic algorithm that recognizes the language of strings of the form ww.

8.3 Deterministic Algorithms for CFLs

In the previous section, we saw that every CFL can be recognized by a
single-scan, nondeterministic stack algorithm. We also said that these algorithms
can be formalized as PDAs. In this section, we will briefly discuss the design of
deterministic algorithms for CFLs. Our discussion will be informal.
To be more precise, we are interested in showing that for every CFG G, there
is an algorithm that given a string of terminals w, determines if G derives w. A
derivation is simply a sequence of rules. Since G only has a finite number of
rules, derivations can be generated just like strings, starting with the shortest
possible derivation. So a simple idea for an algorithm is to generate every possible sequence of rules and determine if any of them constitutes a valid derivation
of w.
If w ∈ L(G), then this algorithm will eventually find a derivation of w. But
if w ∉ L(G), then no derivation will ever be found and the algorithm may go on
forever.
A way to fix this problem would be to know when to stop searching: to know
a length r, maybe dependent on the length of w, with the property that if w can


be derived, then it has at least one derivation of length no greater than r. A
difficulty in finding such a number r is that ε rules may lead to derivations that
introduce large numbers of variables that are later deleted. In addition, ε rules
and unit rules (rules that have a single variable on the right-hand side) can lead
to cycles in which rules are used to derive a variable from itself.
Fortunately, it turns out that it is always possible to eliminate ε rules and
unit rules from a CFG. In fact, every CFG can always be transformed into an
equivalent CFG that's in Chomsky Normal Form:
Definition 8.6 A CFG G = (V, Σ, R, S) is in Chomsky Normal Form (CNF) if every
one of its rules is in one of the following three forms:
A → BC, where A, B, C ∈ V and B, C ≠ S
A → a, where A ∈ V and a ∈ Σ
S → ε
Theorem 8.7 Every CFG has an equivalent CFG in CNF.
We won't prove this theorem in these notes but note that the theorem can
be proved constructively: there is an algorithm that given a CFG G, produces an
equivalent CFG G′ in CNF.
CFGs in CNF have the following useful property:
Theorem 8.8 If G is a CFG in CNF and if w is a string of terminals of length n > 0,
then every derivation of w in G has length 2n − 1.
Proof Suppose that D is a derivation of w. Since w is of length n, D must
contain exactly n instances of rules of the second form. Each rule of the first
form introduces one new variable to the derivation. Since S can't appear on
the right-hand side of those rules, D must contain exactly n − 1 rules of the first
form. □


Therefore, the number we were looking for is 2n − 1, where n = |w|. This
gives the following algorithm for any CFG G in CNF:
1. Let w be the input and let n = |w|.
2. Generate every possible derivation of length 2n − 1 and determine if any
of them derives w.
Since every CFG has an equivalent CFG in CNF, this shows that every CFL can
be recognized by an algorithm.
The above algorithm is correct but very inefficient. For example, consider
any CFL in which the number of strings of length n is 2^Θ(n). Each of those strings
has a different derivation. Therefore, given a string of length n, in the worst
case, the algorithm will need to examine 2^Θ(n) derivations.
A much faster algorithm can be designed by using the powerful technique of
dynamic programming.¹ This algorithm runs in time O(n³).
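The dynamic-programming algorithm alluded to here is usually called the CYK algorithm. The sketch below shows the idea for a small CNF grammar generating {0^n 1^n | n ≥ 1}; the grammar and its encoding are illustrative assumptions, not taken from these notes.

```python
# CYK membership test for a CNF grammar.  Example grammar:
#   S -> AB | AC,  C -> SB,  A -> 0,  B -> 1
# which generates {0^n 1^n | n >= 1}.

unit = {'0': {'A'}, '1': {'B'}}                       # A -> 0, B -> 1
binary = {('A', 'B'): {'S'}, ('A', 'C'): {'S'},       # S -> AB | AC
          ('S', 'B'): {'C'}}                          # C -> SB

def cyk(w):
    n = len(w)
    if n == 0:
        return False   # handle S -> ε separately if the grammar has it
    # table[i][j] = set of variables deriving the substring w[i:j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, c in enumerate(w):
        table[i][i] = set(unit.get(c, ()))
    for length in range(2, n + 1):             # substring length
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):              # split point
                for B in table[i][k]:
                    for C in table[k + 1][j]:
                        table[i][j] |= binary.get((B, C), set())
    return 'S' in table[0][n - 1]

print(cyk('0011'), cyk('0101'))   # True False
```

The three nested loops over length, start position and split point are what give the O(n³) running time (for a fixed grammar).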
For many practical applications, such as the design of compilers, this is still
too slow. In these applications, it is typical to focus on deterministic context-free
languages (DCFLs). DCFLs are the languages that are recognized by deterministic PDAs (DPDAs). In other words, DCFLs are CFLs for which nondeterminism
is not needed.
Note that every regular language is a DCFL because DPDAs can simulate
DFAs. But not all CFLs are DCFLs. And it's not hard to see why. The class of
DCFLs is closed under complementation. This is not surprising since
deterministic automata normally allow us to switch the accepting and non-accepting
status of their states. We did that with DFAs to show that the class of regular
languages is closed under complementation. In the last section, we showed
that the class of CFLs is not closed under complementation. This immediately
implies that the class of DCFLs cannot be equal to the class of CFLs.
¹ At Clarkson, this technique is covered in the courses CS344 Algorithms and Data Structures
or CS447/547 Computer Algorithms.


We can even come up with concrete examples of CFLs that are not DCFLs.
The language L = {a^n b^n c^n | n ≥ 0} is not context-free. But, in the previous
section, we saw that the complement of L is context-free. If the complement of
L was a DCFL, then L would also be a DCFL, contradicting the fact that L is not
even context-free. Therefore, the complement of L is a CFL but not a DCFL.
Note that there are restrictions of CFGs that are equivalent to DPDAs and
generate precisely the class of DCFLs. This topic is normally covered in detail in
a course that focuses on the design of compilers.²

Study Questions
8.3.1. What forms of rules can appear in a grammar in CNF?

Exercises
8.3.2. A leftmost derivation is one where every step replaces the leftmost variable
of the current string of variables and terminals. Consider the following
modification of the algorithm described in this section:
1. Let w be the input and let n = |w|.
2. Generate every possible leftmost derivation of length 2n − 1 and
determine if any of them derives w.
Show that this algorithm runs in time 2^O(n). Hint: The similar algorithm
described earlier in this section does not necessarily run in time 2^O(n).
8.3.3. Consider the language of strings of the form ww. Show that the complement of this language is not a DCFL.

² At Clarkson, this is CS445/545 Compiler Construction.

Chapter 9
Turing Machines
9.1 Introduction

Recall that one of our goals in these notes is to show that certain computational
problems cannot be solved by any algorithm whatsoever. As explained in
Chapter 1 of these notes, this requires that we define precisely what we mean by
an algorithm. In other words, we need a model of computation. We'd like that
model to be simple so we can prove theorems about it. But we also want our
model to be relevant to real-life computation so that our theorems say something
that applies to real-life algorithms.
In Section 2.1, we described the standard model: the Turing machine.
Figure 9.1 shows a Turing machine. The control unit of a Turing machine consists
of a transition table and a state. In other words, the control unit of a Turing
machine is essentially a DFA, which means that a Turing machine is essentially a
DFA augmented by memory.
The memory of a Turing machine is a string of symbols. That string is
semi-infinite: it has a beginning but no end. At any moment in time, the control
unit has access to one symbol from the memory. We imagine that this is done
with a memory head.

Figure 9.1: A Turing machine


Here's an overview of how a Turing machine operates. Initially, the memory
contains the input string followed by an infinite number of a special symbol
called the blank symbol. The control unit is in its start state. Then, based on
the memory symbol being scanned and the internal state of the control unit, the
Turing machine overwrites the memory symbol, moves the memory head by one
position to the left or right, and changes its state. This is done according to the
transition function of the control unit. The computation halts when the Turing
machine enters one of two special halting states. One of these is an accepting
state while the other is a rejecting state.
Its important to note that the amount of memory a Turing machine can use
is not limited by the length of the input string. In fact, there is no limit to how
much memory a Turing machine can use. A Turing machine is always free to
move beyond its input string and use as much memory as it wants.
In the next section, we will formally define what a Turing machine. In the
mean time, heres an informal example that illustrates the idea. Consider the
language {w#w | w {0, 1} }. The alphabet is {0, 1, #}.
A Turing machine that recognizes this language can be designed as follows.
Suppose that the input is

1011#1011


The Turing machine crosses off the first symbol of the left string and makes sure
it matches the first symbol of the right string. If so, it crosses off that symbol
too:

x011#x011

The Turing machine then crosses off the first non-crossed-off symbol of the left
string and makes sure it matches the first non-crossed-off symbol of the right
string. If so, that symbol is crossed off. The computation continues in this way
until all the symbols of either string have been crossed off.
Here's a more precise description of this Turing machine:
1. Scan the input to verify that it is of the form u#v where u, v ∈ {0, 1}*.
If not, reject. (This can be done because the control unit of the Turing
machine can simulate a DFA for the language {0, 1}*#{0, 1}*.)
2. Return the head to the beginning of the memory.
3. Cross off the first non-crossed-off symbol of u. Remember that symbol.
(This can be done by using the internal states of the control unit, just like
a DFA can remember the first symbol of an input string to match it with
the last symbol. See Example 2.11.)
4. Move to the first non-crossed-off symbol of v. If there is none, reject. If that
symbol doesn't match the last symbol crossed off in u, reject. Otherwise,
cross off the symbol.
5. Repeat Steps 2 to 4 until all symbols in u have been crossed off. When
that happens, verify that all symbols in v have also been crossed off. If so,
accept. Otherwise, reject.
The above description can be called an implementation-level description
of a Turing machine: it describes how the Turing machine uses its memory but
it doesn't specify the states or transition function of the control unit. In the
next section, we will learn how to turn this implementation-level description
into a formal description that describes the Turing machine in full detail. Formal
descriptions are precise but very difficult to produce. Later in this chapter, we
will see that the typical operations of modern programming languages can be
implemented by Turing machines. This will allow us to describe Turing machines
using the typical pseudocode we use to describe algorithms. These pseudocode
descriptions will be called high-level descriptions of Turing machines.
Note that it is easy to show that the language {w#w | w ∈ {0, 1}*} is not
context-free. So the example of this section shows that Turing machines are
more powerful than both DFAs and PDAs (which, as we saw in the last chapter,
are essentially NFAs augmented with a stack).

Exercises
9.1.1. Give an implementation-level description of a TM for the language
{aⁿbⁿcⁿ | n ≥ 0}.
9.1.2. Give an implementation-level description of a TM for the language
{ww | w ∈ {0, 1}*}.

9.2 Formal Definition

Definition 9.1 A Turing machine is a 7-tuple (Q, Σ, Γ, δ, q0, qaccept, qreject) where

1. Q is a finite set of states.
2. Σ is an alphabet called the input alphabet.
3. Γ is an alphabet called the memory alphabet (or tape alphabet)¹ that satisfies the following conditions: Γ contains Σ as well as a special symbol ⊔,
called the blank, that is not contained in Σ.

¹ The term tape refers to the magnetic tapes of old computers. Even though it feels outdated,
the term nevertheless captures well the sequential nature of a Turing machine's memory.


4. δ : Q × Γ → Q × Γ × {L, R} is the transition function.
5. q0 ∈ Q is the starting state.
6. qaccept is the accepting state.
7. qreject is the rejecting state.
We now define how a Turing machine operates. At any moment in time, as
explained in the previous section, we consider that a Turing machine is in a state,
its memory contents consists of a semi-infinite string over Γ and its memory head
is scanning one of the memory symbols. This idea is captured by the following
formal concept. In the remaining definitions of this section, suppose that M =
(Q, Σ, Γ, δ, q0, qaccept, qreject) is a Turing machine.

Definition 9.2 A configuration of M is a triple (q, u, i) where
1. q ∈ Q is the state of M.
2. u is a semi-infinite string over Γ called the memory contents of M.
3. i ∈ Z⁺ is the location of M's memory head.

For example, as explained in the previous section, the starting configuration
of M on input w is (q0, w⊔⊔⊔…, 1), where ⊔⊔⊔… represents an infinite string of
blanks.
Note that the memory contents in every configuration always ends with
⊔⊔⊔… We will usually omit those blanks. In other words, a configuration such
as (q, u⊔⊔⊔…, i) will usually be represented by (q, u, i).
Here are a few other important configurations. (The starting configuration
is repeated for completeness.)

Definition 9.3 The starting configuration of M on input w is (q0, w, 1). A configuration (q, u, i) is accepting if q = qaccept. A configuration (q, u, i) is rejecting if
q = qreject. A configuration is halting if it is either accepting or rejecting.


We are now ready to define what a Turing machine does at every step of its
computation.

Definition 9.4 Suppose that (q, u, i) is a non-halting configuration of M. Let m =
|u| and assume, without loss of generality, that i ≤ m. (Otherwise, pad u with
enough blanks.) Suppose that δ(q, ui) = (q′, b, D). Then, in one step, (q, u, i)
leads to the configuration

(q′, u1 … ui−1 b ui+1 … um, j)

where

j = i + 1, if D = R
j = i − 1, if D = L and i > 1
j = 1, if D = L and i = 1

Note that this definition only applies to non-halting configurations. In other
words, we consider that halting configurations do not lead to other configurations. (That's why they're called halting…)

Definition 9.5 Let C0 be the starting configuration of M on w. Let C1, C2, … be
the sequence of configurations that C0 leads to. (That is, Ci leads to Ci+1 in one
step, for every i ≥ 0.) Then M accepts w if this sequence of configurations is finite
and ends in an accepting configuration.
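Definitions 9.2 to 9.5 translate almost line for line into code. The following Python sketch is our own illustration, not part of the formal model; it uses 0-based head positions instead of the 1-based positions of the text, and '_' as a stand-in for the blank.

```python
BLANK = '_'   # stand-in for the blank symbol

def step(delta, config):
    """One move of the machine, as in Definition 9.4."""
    q, tape, i = config
    if i >= len(tape):                      # pad with blanks if needed
        tape = tape + [BLANK] * (i - len(tape) + 1)
    q2, b, d = delta[(q, tape[i])]          # delta(q, u_i) = (q', b, D)
    tape = tape[:i] + [b] + tape[i + 1:]    # overwrite the scanned symbol
    j = i + 1 if d == 'R' else max(i - 1, 0)
    return (q2, tape, j)

def accepts(delta, q0, q_accept, q_reject, w):
    """Run from the starting configuration until a halting state is
    reached, as in Definition 9.5.  (May loop forever, like a real TM.)"""
    config = (q0, list(w) or [BLANK], 0)
    while config[0] not in (q_accept, q_reject):
        config = step(delta, config)
    return config[0] == q_accept
```

A halting configuration is simply one whose state is q_accept or q_reject, so the loop condition mirrors Definition 9.3 directly.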
Definition 9.6 The language recognized by a Turing machine M (or the language
of M ) is the following:
L(M ) = {w | w is accepted by M }
Definition 9.7 A language is recognizable (or Turing-recognizable) if it is recognized by some Turing machine.


Definition 9.8 A Turing machine decides a language if it recognizes the language
and halts on every input. A language is decidable (or Turing-decidable) if it is
decided by some Turing machine.
As mentioned earlier, the Turing machine provides a way to define precisely,
mathematically, what we mean by an algorithm:

An algorithm is a Turing machine that halts on every input.

But this is more than just one possible definition. It turns out that every other
reasonable definition of algorithm that has ever been proposed has been shown
to be equivalent to the Turing machine definition. (We will see some evidence
for this later in this chapter.) In other words, the Turing machine appears to
be the only definition possible. This phenomenon is known as the Church-Turing
Thesis. In other words, the Turing machine definition of an algorithm is more
than just a definition: it can be viewed as a basic law or axiom of computing.
The Church-Turing Thesis has an important consequence: by eliminating the
need for competing notions of algorithms, it brings simplicity and clarity to the
theory of computation.
We end this section with the formal description of a Turing machine for
the language {w#w | w ∈ {0, 1}*}. The description is given in Figure 9.2 and
mostly follows the implementation-level description we gave in the previous
section. In this diagram, all missing transitions are assumed to go to the rejecting
state. The transition labeled 1 → x, R from state q0 to state q1 means that
δ(q0, 1) = (q1, x, R). The transition labeled 0, 1 → R from state q1 back to state
q1 means that δ(q1, 0) = (q1, 0, R) and that δ(q1, 1) = (q1, 1, R).
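In code, such a transition function is naturally a table. As a sketch (our own representation, not part of the formal description), the transitions just mentioned become entries of a Python dictionary:

```python
# The transitions described above, as dictionary entries mapping
# (state, scanned symbol) to (new state, written symbol, direction).
# Only a fragment of the machine of Figure 9.2 is shown.
delta = {
    ('q0', '1'): ('q1', 'x', 'R'),  # labeled "1 -> x, R" in the diagram
    ('q1', '0'): ('q1', '0', 'R'),  # labeled "0, 1 -> R": the symbol is
    ('q1', '1'): ('q1', '1', 'R'),  # rewritten unchanged, head moves right
}
```

Missing entries correspond to the missing transitions of the diagram, which all go to the rejecting state.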

Exercises
9.2.1. Give formal descriptions of Turing machines for the following languages.
In both cases, the alphabet is {0, 1}.


[Figure 9.2: A Turing machine for the language {w#w | w ∈ {0, 1}*}. The state diagram uses states q0 through q7 plus qaccept; its transitions cross off matched 0s and 1s with x, move across the # separator, and return the head, following the implementation-level description of the previous section.]


a) The language of strings of length at least two that begin and end with
the same symbol.
b) The language of strings of length at least two that end in 00.
9.2.2. Give a formal description of a Turing machine for the language
{aⁿbⁿcⁿ | n ≥ 0}.

9.3 Variations on the Basic Turing Machine

In the previous section, we defined what we will call our basic Turing machine.
In this section, we define two variations and show that they are equivalent to the
basic model. This will constitute some evidence in support of the Church-Turing
Thesis.
The first variation is fairly minor. The basic Turing machine must move its
head left or right at every move. A stay option would allow the Turing machine
to not move its head. This can be easily incorporated into the formal definition
of the Turing machine by extending the definition of the transition function:

δ : Q × Γ → Q × Γ × {L, R, S}

The definition of a move must also be extended by adding the following case:

j = i, if D = S

It is clear that a basic Turing machine can be simulated by a Turing machine
with the stay option: that option is simply not used.
The reverse is also easy to establish. Suppose that M is a Turing machine
with the stay option. A basic Turing machine M′ that simulates M can be constructed as follows. For every left and right move, M′ operates exactly as M.
Now suppose that in M, δ(q1, a) = (q2, b, S). Then add a new state q1,a to M′
and the following transitions:

δ(q1, a) = (q1,a, b, R)

δ(q1,a, c) = (q2, c, L), for every c ∈ Γ

This allows M′ to simulate every stay move of M with two left/right moves.
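The construction can be carried out mechanically on a transition table. Here is a Python sketch of our own (assuming, as in the earlier sketches, that δ is represented as a dictionary):

```python
def remove_stay(delta, gamma):
    """Replace every stay move by a right move into a fresh state
    followed by a left move back, as in the construction above."""
    basic = {}
    for (q, a), (q2, b, d) in delta.items():
        if d != 'S':
            basic[(q, a)] = (q2, b, d)
        else:
            fresh = (q, a, 'stay')                # the new state q_{1,a}
            basic[(q, a)] = (fresh, b, 'R')       # write b, move right
            for c in gamma:                       # then move back left,
                basic[(fresh, c)] = (q2, c, 'L')  # leaving c unchanged
    return basic
```

Each stay move thus costs two moves in the basic machine, which is why the simulation only slows the machine down by a constant factor.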
The second variation of the basic Turing machine we will consider in this
section is more significant. The basic Turing machine has one memory string. A
multitape Turing machine is a Turing machine that uses more than one memory
string. Initially, the first memory string contains the input string followed by
an infinite number of blanks. The other memory strings are completely blank.
The Turing machine has one memory head for each memory string and it can
operate each of those heads independently. Note that the number of tapes must
be a fixed number, independent of the input string. (A detailed, formal definition
of the multitape Turing machine is left as an exercise.)
The basic, single-tape Turing machine is a special case of the multitape Turing
machine. Therefore, to establish that the two models are equivalent, all we need
to show is that single-tape Turing machines can simulate the multitape version.
Theorem 9.9 Every multitape Turing machine has an equivalent single-tape Turing machine.
Proof Suppose that M is a k-tape Turing machine. We construct a single-tape
Turing machine M′ that simulates M as follows. Each of the memory symbols of
M′ represents k of the memory symbols of M. For example, if M had three tapes
containing x1 x2 x3 …, y1 y2 y3 … and z1 z2 z3 …, respectively, then the single tape
of M′ would contain

#(x1, y1, z1)(x2, y2, z2)(x3, y3, z3) …

The # symbol at the beginning of the tape will allow M′ to detect the beginning
of the memory.
The single-tape machine also needs to keep track of the location of all the
memory heads of the multitape machine. This can be done by introducing underlined versions of each tape symbol of M. For example, if one of the memory


heads of M is scanning a 1, then, in the memory of M′, the corresponding 1 will
be underlined: 1̲.
Here is how M′ operates:
1. Initialize the tape so it corresponds to the initial configuration of M.
2. Scan the tape from left to right to record the k symbols currently being
scanned by M. (The scanning can stop as soon as the k underlined symbols
have been seen.)
3. Scan the tape from right to left, updating the scanned symbols and head
locations according to the transition function of M.
4. Repeat Steps 2 and 3 until M halts. If M accepts, accept. Otherwise, reject.
It should be clear that M′ accepts exactly the same strings as M. □
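The tape encoding used in this proof can be illustrated concretely. In the Python sketch below (our own rendering of the construction, not part of the proof), each cell of the single tape is a tuple of (symbol, underlined?) pairs, one per track:

```python
def encode(tapes, heads):
    """Combine k equal-length tapes into one list of cells; the
    per-track boolean plays the role of the underline mark."""
    k, n = len(tapes), len(tapes[0])
    return [tuple((tapes[t][i], heads[t] == i) for t in range(k))
            for i in range(n)]

def scanned(cells):
    """Recover the k currently scanned symbols (Step 2 of the proof)."""
    k = len(cells[0])
    return [next(sym for (sym, mark) in (c[t] for c in cells) if mark)
            for t in range(k)]
```

With the three example tapes of the proof and heads at positions 1, 2 and 1, scanned returns x1, y2 and z1, exactly the symbols M would be reading.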

Exercises
9.3.1. Give a formal definition of multitape Turing machines.
9.3.2. The memory string of the basic Turing machine is semi-infinite because it
is infinite in only one direction. Say that a memory string is doubly infinite
if it is infinite in both directions. Show that Turing machines with doubly
infinite memory are equivalent to the basic Turing machine.
9.3.3. Show that the class of decidable languages is closed under complementation, union, intersection and concatenation.
9.3.4. Consider the language of strings of the form x#y#z where x, y and z are
strings of digits of the same length that, when viewed as numbers, satisfy
the equation x + y = z. For example, the string 123#047#170 is in this
language because 123 + 47 = 170. The alphabet is {0, 1, . . . , 9, #}. Show
that this language is decidable. (Recall that according to Exercise 5.2.3,



this language is not regular. It's also possible to show that it's not context-free either.)

9.3.5. A PDA is an NFA augmented by a stack. Let a 2-PDA be a DFA augmented
by two stacks. We know that PDAs recognize exactly the class of CFLs. In
contrast, show that 2-PDAs are equivalent to Turing machines.

9.4 Equivalence with Programs

As mentioned earlier, the Church-Turing Thesis states that every reasonable notion of algorithm is equivalent to a Turing machine that halts on every input.
While it is not possible to prove this statement, there is a lot of evidence in
support of it. In fact, every reasonable notion of algorithm that has ever been
proposed has been shown to be equivalent to the Turing machine version. We
saw two examples of this in the previous section, when we showed that Turing
machines with the stay option and multitape Turing machines are equivalent to
the basic model.
In this section, we go further and claim that programs written in a high-level
programming language are also equivalent to Turing machines. We won't
prove this claim in detail because the complexity of high-level programming
languages would make such a proof incredibly time-consuming and tedious. But
it is possible to get a pretty good sense of why this equivalence holds.
To make things concrete, let's focus on C++. (Or pick your favorite high-level
programming language.) First, it should be clear that every Turing machine has
an equivalent C++ program. To simulate a Turing machine, a C++ program only
needs to keep track of the state, memory contents and memory head location
of the machine. At every step, depending on the state and memory symbol currently being scanned, and according to the transition function of the machine,
the program updates the state, memory contents and head location. The simulation starts in the initial configuration of the machine and ends when a halting
state is reached.


For the reverse direction, we need to show that every C++ program has an
equivalent Turing machine. This is not easy because C++ is a complex language
with an enormous number of different features. But we have compilers that
can translate C++ programs into assembler. Assuming that we trust that those
compilers are correct, then all we need to show is that Turing machines can
simulate programs written in assembler. We won't do this in detail, but because
assembler languages are fairly simple, it's not hard to see that it could be done.
As mentioned in Section 2.1, assembler programs have no variables, no functions, no types and only very simple instructions. An assembler program is a linear sequence of instructions, with no nesting. These instructions directly access
data that is stored in memory or in a small set of registers. One of these registers is the program counter, or instruction counter, that keeps track of which
instruction is currently being executed. Here are some typical instructions:

- Set the contents of a memory location to a given value.
- Copy the contents of a memory location to another one.
- Add the contents of a memory location to the contents of another one.
- Jump to another instruction if the contents of a memory location is 0.

To these instructions, we have to add some form of indirect addressing, that
is, the fact that a memory location may be used as a pointer that contains the
address of another memory location.
To simulate an assembler program, a Turing machine needs to keep track of
the contents of the memory and of the various registers. This can be done with
one tape for the memory and one tape for each of the registers. Then, each instruction in the assembler program will be simulated by a group of states within
the Turing machine. When the execution of an instruction is over, the machine
transitions to the first state of the group that simulates the next instruction.
Because assembler instructions are simple, it's fairly easy to see how a Turing
machine can simulate them. For example, consider an instruction that sets the


contents of memory location i to the value x. Assume that x is a 32-bit number
and that the instruction sets the 32 bits that begin at memory location i. And
note that i and x are actual numbers, not variables. An example of such an
instruction would be "set the contents of memory location 17 to the value 63".
This can be simulated as follows:
1. Move the memory head to location i. (This can be done by scanning the
memory from left to right and using the states to count.)
2. Write the 32 bits of x to the 32 bits that start at the current memory location.
As can be seen from the above, while executing instructions, the Turing machine
will need to use a small number of additional tapes to store temporary values.
The details of Step 2 are left as an exercise similar to one from
the previous section.
Now, suppose that we add indirect addressing to this. Here's how we can
simulate an instruction that sets to x the contents of the memory location whose
address is stored at memory location i. We're assuming that memory addresses
have 32 bits.
1. Move the memory head to location i.
2. Copy the 32 bits that start at the current memory location to an extra tape.
Call this value j.
3. Scan the memory from left to right. For each symbol, subtract 1 from j.
When j becomes 1, stop. (The head is now at memory location j.)
4. Write the 32 bits of x to the 32 bits that start at the current memory location.
Other assembler instructions can be simulated in a similar way. This makes
it pretty clear that Turing machines can simulate assembler programs. And, as


explained earlier, by combining this with the fact that compilers can translate
C++ into assembler, we get that Turing machines can simulate C++ programs.
Up until now, all the descriptions we have given of Turing machines have
been either implementation-level descriptions, such as those in this section, or
formal descriptions, such as the one in Figure 9.2. Now that we have convincing evidence that Turing machines can simulate the constructs of high-level programming languages, from now on, we will usually describe Turing machines
in pseudocode. We will say that these are high-level descriptions of Turing
machines. (From now on, unless otherwise indicated, you should assume that
exercises ask for high-level descriptions of Turing machines.)

Exercises
9.4.1. Describe how a Turing machine can simulate each of the following assembler instructions. Give implementation-level descriptions. In each case,
assume that 32 bits are copied, added or tested.
a) Copy the contents of memory location i to memory location j.
b) Add the contents of memory location i to the contents of memory
location j.
c) Jump to another instruction if the contents of a memory location i
is 0.
9.4.2. Show that the class of decidable languages is closed under the star operation.


Chapter 10

Problems Concerning Formal Languages

10.1 Regular Languages

The acceptance problem for DFAs is as follows: given a DFA M and an input
string w, determine if M accepts w. In order to formalize this as a language,
let ⟨M, w⟩ be some reasonable encoding of pairs that consist of a DFA and an
input string. There are many different ways in which this could be done and the
details are not important. Then the acceptance problem for DFAs corresponds
to the language ADFA of strings of the form ⟨M, w⟩ where M is a DFA, w is a string
over the input alphabet of M, and M accepts w.
The acceptance problem for DFAs can be easily decided as follows:
1. Verify that the input string is of the form ⟨M, w⟩ where M is a DFA and w
is a string over the input alphabet of M. If not, reject.
2. Simulate M on w by using the algorithm of Figure 2.9.
3. If the simulation ends in an accepting state, accept. Otherwise, reject.
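Step 2 of this algorithm is a routine simulation. As a sketch (assuming the DFA has already been decoded from ⟨M, w⟩ into a transition dictionary, a start state and a set of accepting states; this representation is ours, not the formal encoding):

```python
def dfa_accepts(delta, start, accepting, w):
    """Simulate the DFA: one transition per input symbol."""
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in accepting
```

Since the simulation makes exactly one move per input symbol, it always halts, which is what makes this an algorithm and not merely a recognizer.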


Note that this immediately implies that every regular language is decidable:
if L is a regular language, an algorithm for L would simply simulate a DFA for L.
We can also define an acceptance problem for NFAs. This corresponds to the
language ANFA of strings of the form ⟨N, w⟩ where N is an NFA, w is a string over
the input alphabet of N, and N accepts w.
The easiest way to show that this problem is decidable is to convert the NFA
into a DFA:
1. Verify that the input string is of the form ⟨N, w⟩ where N is an NFA and w
is a string over the input alphabet of N. If not, reject.
2. Convert N into a DFA M by using the algorithm of Section 3.3.
3. Determine if M accepts w by using the algorithm for the acceptance problem for DFAs.
4. Accept if that algorithm accepts. Otherwise, reject.
The acceptance problem AREX for regular expressions can be handled in a
similar way:
1. Verify that the input string is of the form ⟨R, w⟩ where R is a regular expression and w is a string over the alphabet of R. If not, reject.
2. Convert R into an NFA N by using the algorithm of Section 4.4.
3. Determine if N accepts w by using the algorithm for the acceptance problem for NFAs.
4. Accept if that algorithm accepts. Otherwise, reject.
We now turn to a different type of problem: the emptiness problem for DFAs.
This corresponds to the language EDFA of strings of the form ⟨M⟩ where M is
a DFA and L(M) is empty. This problem is equivalent to a graph reachability
problem: we want to determine if there is a path from the starting state of M to
any of its accepting states. A simple marking algorithm can do the job:


1. Verify that the input string is of the form ⟨M⟩ where M is a DFA. If not,
reject.
2. Mark the starting state of M.
3. Mark any state that can be reached with one transition from a state that's
already marked.
4. Repeat Step 3 until no new states get marked.
5. If an accepting state is marked, reject. Otherwise, accept.
This algorithm is a form of breadth-first search. Its correctness is not hard
to establish. First, it is clear that if a state is marked, then that state is reachable
from the starting state. Second, it's not hard to show that if a state can be
reached from the starting state with a path of length k, then that state will be
marked after no more than k iterations of the loop in the algorithm.
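The marking algorithm fits in a few lines of Python. (The sketch below is ours; it takes the DFA's components directly rather than decoding ⟨M⟩.)

```python
def is_empty(delta, start, accepting):
    """Return True iff L(M) is empty, by marking reachable states."""
    marked = {start}                        # Step 2
    changed = True
    while changed:                          # Steps 3 and 4
        changed = False
        for (q, _), r in delta.items():
            if q in marked and r not in marked:
                marked.add(r)
                changed = True
    return not (marked & set(accepting))    # Step 5
```

The loop terminates because each iteration that continues must mark at least one new state, and there are only finitely many states.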
The equivalence problem for DFAs is the language EQDFA of strings of the form
⟨M1, M2⟩ where M1 and M2 are DFAs and L(M1) = L(M2). A clever solution
to this problem comes from considering the symmetric difference of the two
languages:

L(M1) △ L(M2) = (L(M1) ∩ L(M2)ᶜ) ∪ (L(M1)ᶜ ∩ L(M2))

(Here Lᶜ denotes the complement of L.) The first key observation is that the symmetric difference is empty if and only
if the two languages are equal. The second key observation is that the symmetric difference is regular. That's because the class of regular languages is
closed under complementation, intersection and union. In addition, those closure properties are constructive: we have algorithms that can produce DFAs for
those languages.
Here is an equivalence algorithm that uses all of those ideas:
1. Verify that the input string is of the form ⟨M1, M2⟩ where M1 and M2 are
DFAs. If not, reject.


2. Construct a DFA M for the language L(M1) △ L(M2) by using the algorithms
of Section 2.5.
3. Test if L(M) = ∅ by using the emptiness algorithm.
4. Accept if that algorithm accepts. Otherwise, reject.
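An equivalent way to run this test, shown here as a Python sketch of our own (it fuses Steps 2 and 3 into one search), is to explore the product of the two DFAs and look for a reachable pair of states on which the machines disagree; such a pair is exactly a reachable accepting state of the symmetric-difference DFA.

```python
def equivalent(d1, s1, f1, d2, s2, f2, alphabet):
    """Each DFA is (transition dict, start state, set of accepting states)."""
    seen, frontier = {(s1, s2)}, [(s1, s2)]
    while frontier:
        p, q = frontier.pop()
        if (p in f1) != (q in f2):      # a string reaching (p, q) lies in
            return False                # the symmetric difference
        for c in alphabet:
            nxt = (d1[(p, c)], d2[(q, c)])
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True
```

The search visits at most |Q1| · |Q2| pairs, so it always halts.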
The algorithms of this section give us the following:
Theorem 10.1 The languages ADFA , ANFA , AREX , EDFA and EQDFA are all decidable.
Corollary 10.2 Every regular language is decidable.

Exercises
10.1.1. Let ALLDFA be the language of strings of the form ⟨M⟩ where M is a DFA
that accepts every possible string over its input alphabet. Show that ALLDFA
is decidable.
10.1.2. Let SUBSETREX be the language of strings of the form ⟨R1, R2⟩ where R1
and R2 are regular expressions and L(R1) ⊆ L(R2). Show that SUBSETREX
is decidable.
10.1.3. Consider the language of strings of the form ⟨M⟩ where M is a DFA
that accepts at least one string of odd length. Show that this language is
decidable.

10.2 CFLs

The acceptance problem for CFGs is the language ACFG of strings of the form
⟨G, w⟩ where G is a CFG, w is a string of terminals, and G derives w.
An algorithm for ACFG can be designed in a straightforward way:


1. Verify that the input string is of the form ⟨G, w⟩ where G is a CFG and w
is a string of terminals. If not, reject.
2. Convert G into an equivalent CFG G′ in Chomsky Normal Form, by using
the algorithm mentioned in Section 8.3.
3. Determine if G′ derives w by using the CFL algorithm of Section 8.3.
4. Accept if that algorithm accepts. Otherwise, reject.
The CFL algorithms of Section 8.3 were not described in detail but this can be
done and it can be shown that these algorithms can be implemented by Turing
machines. In particular, this implies that every CFL is decidable.
The emptiness problem for CFGs is the language ECFG of strings of the form
⟨G⟩ where G is a CFG and L(G) is empty. Just like the emptiness problem for
DFAs, the emptiness problem for CFGs can be solved by a simple marking algorithm. The idea is to mark variables that can derive a string consisting entirely
of terminals.
1. Verify that the input string is of the form ⟨G⟩ where G is a CFG. If not,
reject.
2. Mark all the terminals of G.
3. Mark any variable A for which G has a rule A → U1 · · · Uk where each Ui
has already been marked.
4. Repeat Step 3 until no new variables get marked.
5. If the start variable of G is marked, reject. Otherwise, accept.
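This marking algorithm can also be sketched in Python. (The grammar representation below is our own choice for the sketch: a list of (head, body) rules, with each body a list of variables and terminals; a rule with an empty body derives ε and is marked immediately.)

```python
def is_empty_cfg(rules, terminals, start):
    """Return True iff the start variable derives no terminal string."""
    marked = set(terminals)                       # Step 2
    changed = True
    while changed:                                # Steps 3 and 4
        changed = False
        for head, body in rules:
            if head not in marked and all(u in marked for u in body):
                marked.add(head)
                changed = True
    return start not in marked                    # Step 5
```

As with the DFA version, termination is guaranteed because each useful iteration marks a new variable, and there are only finitely many.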
The correctness of this algorithm is not hard to establish. First, it is clear
that if a variable is marked, then that variable can derive a string of terminals.
Second, it's not hard to show that if a variable can derive a string of terminals


with a parse tree of height h, then that variable will be marked after no more
than h iterations of the loop in the algorithm.
Therefore, we have shown the following:

Theorem 10.3 The languages ACFG and ECFG are decidable.

Corollary 10.4 Every CFL is decidable.

What about the equivalence problem for CFGs? This is the language EQCFG of
strings of the form ⟨G1, G2⟩ where G1 and G2 are CFGs and L(G1) = L(G2). The
strategy we used in the previous section for DFAs does not work here because
the class of CFLs is not closed under complementation or intersection. In fact, it
turns out that EQCFG is undecidable: there is no Turing machine that can decide
this language. Note that by the Church-Turing Thesis, this means that there is
no algorithm of any kind that can decide this language. We will learn how to
prove undecidability results in the next chapter.

Exercises
10.2.1. Let DERIVESεCFG be the language of strings of the form ⟨G⟩ where G is
a CFG that derives ε. Show that DERIVESεCFG is decidable.
10.2.2. Consider the language of strings of the form ⟨G⟩ where G is a CFG that
derives at least one string of odd length. Show that this language is decidable.
10.2.3. Let INFINITECFG be the language of strings of the form ⟨G⟩ where G is a
CFG that derives an infinite number of strings. Show that INFINITECFG is
decidable. Hint: Use the Pumping Lemma.

Chapter 11

Undecidability

11.1 An Unrecognizable Language

In this section, we show that for every alphabet Σ, there is at least one language
that cannot be recognized by any Turing machine. The proof of this result uses the
diagonalization technique.

Theorem 11.1 Over every alphabet Σ, there is a language that is not recognizable.

Proof Let s1, s2, s3, . . . be a lexicographic listing of all the strings in Σ*.¹ Let
M1, M2, M3, . . . be a lexicographic listing of all the Turing machines with input
alphabet Σ. This list can be obtained from a lexicographic listing of all the valid
encodings of Turing machines.
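The lexicographic (shortlex) listing of Σ* is easy to generate; here is a Python sketch of our own:

```python
# Generate all strings over an alphabet in lexicographic (shortlex)
# order: shorter strings first, equal-length strings alphabetically.
from itertools import count, islice, product

def lex_listing(alphabet):
    for n in count(0):                       # lengths 0, 1, 2, ...
        for tup in product(alphabet, repeat=n):
            yield ''.join(tup)

first_nine = list(islice(lex_listing('01'), 9))
# the empty string, then 0, 1, 00, 01, 10, 11, 000, 001
```

Every string over the alphabet eventually appears, which is exactly what the listing s1, s2, s3, . . . requires.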
¹ In common usage, the adjective lexicographic relates to dictionaries, or their making. In the
context of formal languages, however, the alphabetical ordering used in dictionaries is problematic. For example, an alphabetical listing of all strings over the alphabet {0, 1} would be ε, 0,
00, 000, . . . Strings with 1s don't get listed. Instead, in a lexicographic ordering, smaller strings
are listed first. Over the alphabet {0, 1}, for example, this gives ε, 0, 1, 00, 01, 10, 11, 000, . . .

The languages of all these machines can be described by a table with a row
for each machine and a column for each string. The entry for machine Mi and



        s1   s2   s3   · · ·
  M1    A    N    A
  M2    A    A    N
  M3    N    A    N
  ...   ...  ...  ...

Figure 11.1: The table of acceptance for all Turing machines over Σ
string sj indicates whether Mi accepts sj. Figure 11.1 shows an example where
A indicates acceptance and N indicates nonacceptance.
Our objective is to find a language L that cannot be recognized by any of
these Turing machines. In other words, we have to guarantee that for each
Turing machine Mi, there is at least one input string sj on which Mi and the
language disagree. That is, a string sj that is contained in L but rejected by Mi,
or that is not contained in L but accepted by Mi.
To achieve this, note that the diagonal of the above table defines the following language:

D = {si | Mi accepts si}
Now consider the complement of D:

D̄ = {si | Mi does not accept si}

This language is defined by the opposite of the entries on the diagonal. We claim
that D̄ is not recognizable.
Suppose, for the sake of contradiction, that D̄ is recognizable. Then it must
be recognized by one of these machines. Suppose that D̄ = L(Mi). Now, consider
what Mi does when it runs on si.
If Mi accepts si, then, by the definition of D̄, si ∉ D̄. But then, since Mi
recognizes D̄, Mi does not accept si. That's a contradiction.


On the other hand, if M_i does not accept s_i, then the definition of D̄ implies
that s_i ∈ D̄. But since M_i recognizes D̄, this implies that M_i accepts s_i. That's
also a contradiction.
Therefore, in either case, we get a contradiction. This implies that D̄ is not
recognizable. □
Note that since D̄ is not recognizable, it is of course also undecidable. But
this also implies that D is undecidable, since the class of decidable languages is
closed under complementation.
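On any finite corner of the acceptance table, the diagonal flip is easy to visualize: the row of answers for D̄ differs from row i in column i, so it cannot equal any row. A Python sketch, using the 3-by-3 corner from Figure 11.1 (with A = 1, N = 0):

```python
def diagonal_complement(table):
    """Answers of the diagonal-complement language D-bar on s_1..s_n:
    entry i is flipped, so it is 1 exactly when M_i does not accept s_i."""
    return [1 - table[i][i] for i in range(len(table))]

# The corner of the table from Figure 11.1 (A = 1, N = 0).
table = [[1, 0, 1],
         [1, 1, 0],
         [0, 1, 0]]
row = diagonal_complement(table)
# The flipped row disagrees with every row M_i at column i.
assert all(row[i] != table[i][i] for i in range(len(table)))
print(row)  # [0, 0, 1]
```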

Exercises
11.1.1. Suppose that Σ is an alphabet over which an encoding of Turing machines can be defined.² Consider the language L of strings of the form
⟨M⟩ where M is a Turing machine with input alphabet Σ and M does not
accept the string ⟨M⟩. Show that the language L is not recognizable.

11.2  Natural Undecidable Languages

In the previous section, we showed that unrecognizable languages exist. But the
particular languages we considered were somewhat artificial, in the sense that
they were constructed specifically for the purpose of proving this result. In this
section, we consider two languages that are arguably more natural and show
that they are undecidable.
The first language is the acceptance problem for Turing machines. This is the
language A_TM of strings of the form ⟨M, w⟩ where M is a Turing machine, w is a
string over the input alphabet of M and M accepts w.
²It turns out that Σ can be any alphabet. It's not hard to see that Turing machines can be
encoded over every alphabet of size at least 2. It's just a little trickier to show that it can also be
done over alphabets of size 1.


Theorem 11.2 The language A_TM is undecidable.


Proof By contradiction. Suppose that algorithm R decides A_TM. We use this
algorithm to design an algorithm S for the language D̄ defined in the proof of
Theorem 11.1. Recall that

D̄ = {s_i | M_i does not accept s_i}

Let Σ be the alphabet of A_TM. Here's a description of S:

1. Let w be the input string. Find i such that w = s_i. (This can be done by
listing the strings s_1, s_2, s_3, . . . until w is found.)

2. Generate the encoding of machine M_i. (This can be done by listing all the
valid encodings of machines with input alphabet Σ.)

3. Run R on ⟨M_i, s_i⟩.

4. If R accepts, reject. Otherwise, accept.

It is easy to see that S decides D̄. Since this language is not even recognizable, this is a contradiction. This shows that R cannot exist and that A_TM is
undecidable. □
The second language we consider in this section is the famous Halting Problem mentioned in the introduction of these notes. This is the language
HALT_TM of strings of the form ⟨M, w⟩ where M is a Turing machine, w is a string
over the input alphabet of M and M halts on w.

Theorem 11.3 The language HALT_TM is undecidable.
Proof By contradiction. Suppose that algorithm R decides HALT_TM. We use this
algorithm to design an algorithm S for the acceptance problem:


1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w
is a string over the input alphabet of M. If not, reject.

2. Run R on ⟨M, w⟩.

3. If R rejects, reject. Otherwise, simulate M on w.

4. If M accepts, accept. Otherwise, reject.

It is easy to see that S decides A_TM. Since this language is undecidable, this
is a contradiction. Therefore, R cannot exist and HALT_TM is undecidable. □

Exercises
11.2.1. Show that the language D defined in the proof of Theorem 11.1 is recognizable.

11.3  Reducibility and Additional Examples

The two undecidability proofs of the previous section followed the same pattern,
which goes like this. To show that a language B is undecidable, start by assuming
the opposite: that there is an algorithm R that decides B. Then use R to design
an algorithm S that decides a language A that is known to be undecidable. This
is a contradiction, which shows that R cannot exist and that B is undecidable.
When an algorithm for A can be constructed from an algorithm for B, we say
that A reduces to B. Reductions can be used in two ways.

1. To prove decidability results: if B is decidable, then a reduction of A to B
shows that A too is decidable.

2. To prove undecidability results: if A is undecidable, then a reduction of A to
B shows that B too is undecidable (since otherwise A would be decidable,
a contradiction).
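The shape of a reduction can be illustrated with a deliberately toy pair of (decidable) languages; the point here is only the pattern, not a new undecidability result. Both predicates below are ours:

```python
def decides_B(n):
    """Stand-in for the assumed decider R of B (here, B = "n is even")."""
    return n % 2 == 0

def decides_A(n):
    """Algorithm S for A (here, A = "n is odd"), built by reduction:
    hand the instance to R unchanged and flip its answer."""
    return not decides_B(n)

print(decides_A(3), decides_A(4))  # True False
```

In the undecidability proofs of this chapter, the same pattern appears, except that A is known to be undecidable, so the existence of S is the contradiction.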


In the previous section, to prove that A_TM is undecidable, we reduced D̄ to
A_TM. To prove that HALT_TM is undecidable, we reduced A_TM to HALT_TM. In
this section, we will prove four additional undecidability results by using the
reducibility technique.
Consider the language WRITES_ON_TAPE_TM of strings of the form ⟨M, w, a⟩
where M is a Turing machine, w is a string over the input alphabet of M, a is a
symbol in the tape alphabet of M, and M writes an a on its tape while running
on w.

Theorem 11.4 The language WRITES_ON_TAPE_TM is undecidable.
Proof By contradiction. Suppose that algorithm R decides the language
WRITES_ON_TAPE_TM. We use this algorithm to design an algorithm S for the
acceptance problem.
The idea behind the design of S is that when given a Turing machine M and
an input string w, S constructs a new Turing machine M′ that is guaranteed to
write some symbol a on its tape if and only if M accepts w.
Here's the description of S:

1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Let a be a new symbol not in the tape alphabet of M. Construct the following Turing machine M′:

(a) Let x be the input string. Run M on x.
(b) If M accepts, write an a on the tape.

3. Run R on ⟨M′, w, a⟩.

4. If R accepts, accept. Otherwise, reject.


To prove that S decides A_TM, first suppose that M accepts w. Then when M′
runs on w, it writes an a on its tape. This implies that R accepts ⟨M′, w, a⟩ and
that S accepts ⟨M, w⟩, which is what we want.
Second, suppose that M does not accept w. Then when M′ runs on w, it
never writes an a on its tape. This implies that R rejects ⟨M′, w, a⟩ and that S
rejects ⟨M, w⟩. Therefore, S decides A_TM.
However, since A_TM is undecidable, this is a contradiction. Therefore, R does
not exist and WRITES_ON_TAPE_TM is undecidable. □
The emptiness problem for Turing machines is the language E_TM of strings of
the form ⟨M⟩ where M is a Turing machine and L(M) = ∅.

Theorem 11.5 The language E_TM is undecidable.
Proof By contradiction. Suppose that algorithm R decides E_TM. We use this
algorithm to design an algorithm S for the acceptance problem. The idea is that
given a Turing machine M and a string w, S constructs a new Turing machine
M′ whose language is empty if and only if M does not accept w.

1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Construct the following Turing machine M′:

(a) Let x be the input string. If x ≠ w, reject.
(b) Run M on w.
(c) If M accepts, accept. Otherwise, reject.

3. Run R on ⟨M′⟩.

4. If R accepts, reject. Otherwise, accept.


To prove that S decides A_TM, first suppose that M accepts w. Then L(M′) =
{w}, which implies that R rejects ⟨M′⟩ and that S accepts ⟨M, w⟩. On the other
hand, suppose that M does not accept w. Then L(M′) = ∅, which implies that R
accepts ⟨M′⟩ and that S rejects ⟨M, w⟩.
Since A_TM is undecidable, this is a contradiction. Therefore, R does not exist
and E_TM is undecidable. □
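The construction of M′ can be mimicked in Python if we model a Turing machine as a function from input strings to "accept"/"reject". This is a simplification (real machines may loop forever), and the stand-in machine below is ours:

```python
def make_M_prime(run_M, w):
    """The machine M' from the proof above: on input x, reject unless
    x = w, then defer to M on w.  So L(M') = {w} if M accepts w, and
    L(M') is empty otherwise."""
    def M_prime(x):
        if x != w:              # step (a): reject anything other than w
            return "reject"
        return run_M(w)         # steps (b) and (c): defer to M on w
    return M_prime

# Toy stand-in for M: accepts exactly the strings of even length.
run_M = lambda s: "accept" if len(s) % 2 == 0 else "reject"
M_prime = make_M_prime(run_M, "ab")    # M accepts "ab", so L(M') = {"ab"}
print(M_prime("ab"), M_prime("xyz"))   # accept reject
```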
Here's a different version of this proof. The main difference is in the definition
of the Turing machine M′.
Proof By contradiction. Suppose that algorithm R decides E_TM. We use this
algorithm to design an algorithm S for the acceptance problem. The idea is the
same as in the previous proof: given a Turing machine M and a string w, S
constructs a new Turing machine M′ whose language is empty if and only if M
does not accept w.

1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Construct the following Turing machine M′:

(a) Let x be the input string.
(b) Run M on w.
(c) If M accepts, accept. Otherwise, reject.

3. Run R on ⟨M′⟩.

4. If R accepts, reject. Otherwise, accept.

To prove that S decides A_TM, first suppose that M accepts w. Then L(M′) =
Σ*, where Σ is the input alphabet of M′, which implies that R rejects ⟨M′⟩ and
that S accepts ⟨M, w⟩. On the other hand, suppose that M does not accept w.
Then L(M′) = ∅, which implies that R accepts ⟨M′⟩ and that S rejects ⟨M, w⟩.


Since A_TM is undecidable, this is a contradiction. Therefore, R does not exist
and E_TM is undecidable. □

The equivalence problem for Turing machines is the language EQ_TM of strings
of the form ⟨M_1, M_2⟩ where M_1 and M_2 are Turing machines and L(M_1) = L(M_2).
Theorem 11.6 The language EQ_TM is undecidable.

Proof This time, we will do a reduction from a language other than A_TM. Suppose that algorithm R decides EQ_TM. We use this algorithm to design an algorithm S for the emptiness problem:

1. Verify that the input string is of the form ⟨M⟩ where M is a Turing machine.
If not, reject.

2. Construct a Turing machine M′ that rejects every input.

3. Run R on ⟨M, M′⟩.

4. If R accepts, accept. Otherwise, reject.

This algorithm accepts ⟨M⟩ if and only if L(M) = L(M′) = ∅. Therefore, S
decides E_TM, which contradicts the fact that E_TM is undecidable. This proves
that R does not exist and that EQ_TM is undecidable. □
The regularity problem for Turing machines is to determine if the language of
a given Turing machine is regular. This corresponds to the language REGULAR_TM
of strings of the form ⟨M⟩ where M is a Turing machine and L(M) is regular.

Theorem 11.7 The language REGULAR_TM is undecidable.

Proof By contradiction. Suppose that algorithm R decides REGULAR_TM. We
use this algorithm to design an algorithm S for the acceptance problem. Given a
Turing machine M and a string w, S constructs a new Turing machine M′ whose
language is regular if and only if M accepts w.


1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Construct the following Turing machine M′ with input alphabet {0, 1}:

(a) Let x be the input string. If x ∈ {0ⁿ1ⁿ | n ≥ 0}, accept.
(b) Otherwise, run M on w.
(c) If M accepts, accept. Otherwise, reject.

3. Run R on ⟨M′⟩.

4. If R accepts, accept. Otherwise, reject.

Suppose that M accepts w. Then L(M′) = {0, 1}*, which implies that R
accepts ⟨M′⟩ and that S accepts ⟨M, w⟩. On the other hand, suppose that M
does not accept w. Then L(M′) = {0ⁿ1ⁿ | n ≥ 0}, which implies that R rejects
⟨M′⟩ and that S rejects ⟨M, w⟩. Therefore, S decides A_TM.
Since A_TM is undecidable, this is a contradiction. Therefore, R does not exist
and REGULAR_TM is undecidable. □

Exercises
11.3.1. Consider the problem of detecting if a Turing machine M ever attempts
to move left from the first position of its tape while running on w. This
corresponds to the language BUMPS_OFF_LEFT_TM of strings of the form
⟨M, w⟩ where M is a Turing machine, w is a string over the input alphabet
of M and M attempts to move left from the first position of its tape while
running on w. Show that BUMPS_OFF_LEFT_TM is undecidable.
11.3.2. Consider the problem of determining if a Turing machine M ever enters state q while running on w. This corresponds to the language
ENTERS_STATE_TM of strings of the form ⟨M, w, q⟩ where M is a Turing machine, w is a string over the input alphabet of M, q is a state of M and M
enters q while running on w. Show that ENTERS_STATE_TM is undecidable.
11.3.3. Consider the problem of determining if a Turing machine M ever
changes the contents of memory location i while running on an input w.
This corresponds to the language of strings of the form ⟨M, w, i⟩ where
M is a Turing machine, w is a string over the input alphabet of M, i is a
positive integer and M changes the value of the symbol at position i of the
memory while running on w. Show that this language is undecidable.
11.3.4. Let ACCEPTS_ε_TM be the language of strings of the form ⟨M⟩ where M is
a Turing machine that accepts the empty string. Show that ACCEPTS_ε_TM
is undecidable.

11.3.5. Let HALTS_ON_ALL_TM be the language of strings of the form ⟨M⟩ where
M is a Turing machine that halts on every input string. Show that
HALTS_ON_ALL_TM is undecidable. Hint: Reduce from HALT_TM.

11.3.6. Let ALL_CFG be the language of strings of the form ⟨G⟩ where G is a CFG
over some alphabet Σ and L(G) = Σ*. It is possible to show that ALL_CFG is
undecidable. Use this result to show that EQ_CFG is undecidable.
11.3.7. In the proof of Theorem 11.7, modify the construction of M′ as follows:

2. Construct the following Turing machine M′ with input alphabet {0, 1}:

(a) Let x be the input string. If x ∉ {0ⁿ1ⁿ | n ≥ 0}, reject.
(b) Otherwise, run M on w.
(c) If M accepts, accept. Otherwise, reject.

Can the proof of Theorem 11.7 be made to work with this alternate construction of M′? Explain.


11.3.8. Let INFINITE_TM be the language of strings of the form ⟨M⟩ where M is
a Turing machine and L(M) is infinite. Show that INFINITE_TM is undecidable.

11.3.9. Let DECIDABLE_TM be the language of strings of the form ⟨M⟩ where M
is a Turing machine and L(M) is decidable. Show that DECIDABLE_TM is
undecidable.

11.4  Rice's Theorem

In the previous section, we showed that the emptiness and regularity problems
for Turing machines are undecidable. Formally, these problems correspond to
the languages E_TM and REGULAR_TM. In addition, in the exercises of that section,
you were asked to show that ACCEPTS_ε_TM, INFINITE_TM and DECIDABLE_TM are
also undecidable.
These undecidability proofs have a lot in common, and this comes, in part,
from the fact that these problems are of a common type: they each involve
determining if the language of a Turing machine satisfies a certain property,
such as being empty, being regular or containing the empty string.
It turns out that we can exploit this commonality to generalize these undecidability results. First, we need to define precisely the general concepts we're
dealing with.

Definition 11.8 A property of recognizable languages is a unary predicate on
the set of recognizable languages.

In other words, a property of recognizable languages is a Boolean function
whose domain is the set of recognizable languages; that is, a function that is
true for some recognizable languages and false for the others. For example, the
emptiness property is the function that is true for the empty language and false
for all other recognizable languages.


Now, to each property P of recognizable languages, we can associate a language: the language L_P of Turing machines whose language satisfies property P:

L_P = {⟨M⟩ | M is a TM and P(L(M)) is true}

All of the undecidable languages we mentioned at the beginning of this section
are of this form. For example, E_TM = L_P where P is the property of being empty.
The general result we are going to prove in this section is that every language of this form is undecidable. That is, if P is any property of recognizable
languages, then L_P is undecidable.
Now, there are two obvious exceptions to this result: the two trivial properties, the one that is true for all recognizable languages and the one that is false
for all recognizable languages. These properties can be decided by algorithms
that always accept or always reject every Turing machine. So our general result
will only apply to non-trivial properties of recognizable languages.
Definition 11.9 A property P of recognizable languages is trivial if it is true for
all recognizable languages or if it is false for all recognizable languages.

When a language of the form L_P is undecidable, we say that the property P
itself is undecidable.

Definition 11.10 A property P of recognizable languages is undecidable if the
language L_P is undecidable.

We're now ready to state the main result of this section:

Theorem 11.11 (Rice's Theorem) Every non-trivial property of recognizable
languages is undecidable.
Proof Suppose that P is a non-trivial property of recognizable languages. First
assume that the empty language satisfies property P. That is, P(∅) is true. Since
P is non-trivial, there is a recognizable language L that doesn't satisfy P. Let M_L
be a Turing machine that recognizes L.
Now, by contradiction, suppose that algorithm R decides L_P. Here's how we
can use this algorithm to design an algorithm S for the acceptance problem.
Given a Turing machine M and a string w, S constructs a new Turing machine
M′ whose language is either empty or L, depending on whether M accepts w.

1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Construct the following Turing machine M′. The input alphabet of M′ is
the same as that of M_L.

(a) Let x be the input string and run M_L on x.
(b) If M_L rejects, reject.
(c) Otherwise (if M_L accepts), run M on w.
(d) If M accepts, accept. Otherwise, reject.

3. Run R on ⟨M′⟩.

4. If R accepts, reject. Otherwise, accept.

To show that S correctly decides the acceptance problem, suppose first that
M accepts w. Then L(M′) = L. Since L does not have property P, R rejects ⟨M′⟩
and S correctly accepts ⟨M, w⟩.
On the other hand, suppose that M does not accept w. Then L(M′) = ∅,
which does have property P. Therefore, R accepts ⟨M′⟩ and S correctly rejects
⟨M, w⟩.
This shows that S decides A_TM. Since this language is undecidable, this
proves that R does not exist and that L_P is undecidable.


Recall that this was all done under the assumption that the empty language
has property P. Suppose now that this is not the case. Consider P̄, the opposite
of property P. The empty language has property P̄ and this property is also
non-trivial. The above argument can then be used to show that the language L_P̄
is undecidable. By the closure properties, this means that L_P is undecidable in
this case too. □
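The machine M′ at the heart of the proof can be sketched in Python under the same halting-function simplification used earlier; the toy stand-ins for M_L and M are ours:

```python
def make_M_prime(run_ML, run_M, w):
    """The machine M' from the proof of Rice's theorem: on input x, run
    M_L on x and reject if it rejects; otherwise defer to M on w.  So
    L(M') = L when M accepts w, and L(M') is empty otherwise.  Both
    `run_ML` and `run_M` are toy models that always halt."""
    def M_prime(x):
        if run_ML(x) != "accept":      # steps (a) and (b)
            return "reject"
        return run_M(w)                # steps (c) and (d)
    return M_prime

# Toy stand-ins: M_L accepts strings starting with "a"; M accepts "w1".
run_ML = lambda x: "accept" if x.startswith("a") else "reject"
run_M = lambda w: "accept" if w == "w1" else "reject"
M_prime = make_M_prime(run_ML, run_M, "w1")    # M accepts w, so L(M') = L
print(M_prime("abc"), M_prime("bc"))           # accept reject
```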

Exercises
11.4.1. In the proof of Rice's Theorem, modify the construction of M′ as follows:

2. Construct the following Turing machine M′. The input alphabet of
M′ is the same as that of M_L.

(a) Let x be the input string. Run M on w.
(b) If M rejects, reject.
(c) Otherwise (if M accepts), run M_L on x.
(d) If M_L accepts, accept. Otherwise, reject.

Does the proof of Rice's Theorem work with this alternate construction of
M′? Explain.

11.5  Natural Unrecognizable Languages

We have now seen several examples of natural undecidable languages. But we
only have one example of an unrecognizable language: the complement of the
diagonal language D. In this section, we will learn a technique for proving that
certain more natural languages are unrecognizable.
We need a candidate. A first idea would be the acceptance problem. But it
turns out that A_TM is recognizable:


1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Simulate M on w.

3. If M accepts, accept. Otherwise, reject.

This Turing machine is called the Universal Turing Machine. It is essentially
a Turing machine interpreter.
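A Turing machine interpreter is short to write. The sketch below is ours and adds a step bound only so that it always returns; the true universal machine simply runs forever when M does:

```python
def simulate(delta, start, accept, reject, w, max_steps=10_000):
    """A tiny Turing machine interpreter.  delta[(state, symbol)] =
    (new_state, write, move) with move in {-1, +1}; "_" is the blank.
    Returns "accept", "reject", or "timeout" if the bound is hit."""
    tape, head, state = dict(enumerate(w)), 0, start
    for _ in range(max_steps):
        if state == accept:
            return "accept"
        if state == reject:
            return "reject"
        symbol = tape.get(head, "_")
        state, tape[head], move = delta[(state, symbol)]
        head = max(0, head + move)          # one-way infinite tape
    return "timeout"

# Example machine: accept strings over {0, 1} with an even number of 1s.
delta = {("even", "0"): ("even", "0", +1), ("even", "1"): ("odd", "1", +1),
         ("odd", "0"): ("odd", "0", +1),   ("odd", "1"): ("even", "1", +1),
         ("even", "_"): ("acc", "_", +1),  ("odd", "_"): ("rej", "_", +1)}
print(simulate(delta, "even", "acc", "rej", "1011"))  # reject
```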
What about the complement of the acceptance problem, Ā_TM? It turns out that if
Ā_TM were recognizable, then A_TM would be decidable, which we know is not the
case.

Theorem 11.12 The language Ā_TM is not recognizable.
Proof Suppose Ā_TM is recognizable. Let R be a Turing machine that recognizes
this language. Let U be the Universal Turing Machine. We use both R and U to
design an algorithm S for the acceptance problem:

1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Run R and U at the same time on ⟨M, w⟩. (This can be done by alternating
between R and U, one step at a time.)

3. If R accepts, reject. If R rejects, accept. If U accepts, accept. If U rejects,
reject.

If M accepts w, then U is guaranteed to accept, though it is possible that R rejects
first. In either case, S accepts ⟨M, w⟩. Similarly, if M does not accept w, then R
is guaranteed to accept, though it is possible that U rejects first. In either case, S rejects
⟨M, w⟩. Therefore, S decides A_TM.
This contradicts the fact that A_TM is undecidable. Therefore, R cannot exist
and Ā_TM is not recognizable. □


Note that Ā_TM is not exactly the nonacceptance problem. This would be the
language NA_TM of strings of the form ⟨M, w⟩ where M is a Turing machine, w is
a string over the input alphabet of M and M does not accept w. However, Ā_TM is
the union of NA_TM and the set of strings that are not of the form ⟨M, w⟩ where
M is a TM and w is a string over the input alphabet of M. This last language is
decidable and it is not hard to show that the union of a recognizable language
and a decidable language is always recognizable. Therefore, the nonacceptance
problem cannot be recognizable.

Corollary 11.13 The language NA_TM is not recognizable.

The technique that was used to prove that Ā_TM is unrecognizable can be
generalized and used to prove other unrecognizability results.

Theorem 11.14 Let L be any language. If both L and L̄ are recognizable, then L
and L̄ are decidable.
Proof Suppose that L and L̄ are both recognizable. Let R_1 and R_2 be Turing
machines that recognize L and L̄, respectively. We use R_1 and R_2 to design an
algorithm S that decides L:

1. Let w be the input string. Run R_1 and R_2 at the same time on w. (This can
be done by alternating between R_1 and R_2, one step at a time.)

2. If R_1 accepts, accept. If R_1 rejects, reject. If R_2 accepts, reject. If R_2 rejects,
accept.

It is easy to show that S decides L. Since the class of decidable languages is
closed under complementation, we get that both L and L̄ are decidable. □
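The step-alternation in this proof can be modelled in Python with generators that yield one "step" at a time; a machine that loops forever simply yields None forever. The toy recognizers below are ours (L = strings of even length):

```python
def R1(w):
    """Toy recognizer for L = even-length strings: accepts members,
    loops forever (yields None) on non-members."""
    if len(w) % 2 == 0:
        yield "accept"
    while True:
        yield None

def R2(w):
    """Toy recognizer for the complement of L."""
    if len(w) % 2 == 1:
        yield "accept"
    while True:
        yield None

def decide(w):
    """Algorithm S: alternate one step of R1 and one step of R2.
    Exactly one of them must eventually accept, so S always halts."""
    r1, r2 = R1(w), R2(w)
    while True:
        if next(r1) == "accept":
            return True          # w is in L
        if next(r2) == "accept":
            return False         # w is in the complement of L

print(decide("ab"), decide("abc"))  # True False
```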
Corollary 11.15 If L is recognizable but undecidable, then L̄ is unrecognizable.

Let's apply this to the halting problem. It is easy to see that HALT_TM is recognizable:


1. Verify that the input string is of the form ⟨M, w⟩ where M is a Turing machine and w is a string over the input alphabet of M. If not, reject.

2. Simulate M on w.

3. If M halts, accept.

Therefore, by the corollary, the complement of HALT_TM is unrecognizable. Note that, as was the
case for the acceptance problem, this complement is not exactly the non-halting problem,
which is the language NHALT_TM of strings of the form ⟨M, w⟩ where M is a Turing
machine, w is a string over the input alphabet of M and M does not halt on w.
But the same argument we used earlier shows that if NHALT_TM were recognizable,
then the complement of HALT_TM would be too. This gives us the following:

Theorem 11.16 The complement of HALT_TM and the language NHALT_TM are not recognizable.
Let's now consider the emptiness problem. Let NE_TM be the language of
strings of the form ⟨M⟩ where M is a Turing machine and L(M) ≠ ∅.

Theorem 11.17 The language NE_TM is recognizable.

Proof To design a Turing machine that recognizes NE_TM, we will follow this
general strategy: given a Turing machine M, search for a string w that is accepted
by M. But we need to do this carefully, to avoid getting stuck simulating M on
any one of the input strings.
Here's a Turing machine S for NE_TM:

1. Verify that the input string is of the form ⟨M⟩ where M is a Turing machine.
If not, reject.

2. Let t = 1.

3. Simulate M on the first t input strings for t steps each.


4. If M accepts any string, accept. Otherwise, add 1 to t.

5. Repeat Steps 3 and 4 (forever).

Clearly, if S accepts ⟨M⟩, then L(M) ≠ ∅. On the other hand, if L(M) ≠ ∅,
then S will eventually simulate M on some string in L(M) for a number of steps
that is large enough for M to accept the string, which implies that S accepts
⟨M⟩. Therefore, S recognizes NE_TM. □
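This dovetailed search can be sketched in Python. We model a t-step simulation of M on w as a function `run(w, t)`; the bound on the number of rounds and the toy machine are ours, added only so the sketch terminates:

```python
def find_accepted(run, strings, max_rounds=100):
    """Dovetailed search from the proof: in round t, simulate M on the
    first t strings for t steps each.  `run(w, t)` models a t-step
    simulation of M on w.  A true recognizer never gives up; the round
    bound here only keeps the sketch finite."""
    for t in range(1, max_rounds + 1):
        for w in strings[:t]:
            if run(w, t) == "accept":
                return w
    return None

strings = ["", "0", "1", "00", "01", "10", "11"]     # shortlex order
# Toy machine: accepts "01", but only after at least 5 steps.
run = lambda w, steps: "accept" if w == "01" and steps >= 5 else "running"
print(find_accepted(run, strings))  # 01
```

No single simulation can get stuck: every string is eventually tried with every step budget.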
This implies that the complement of NE_TM is unrecognizable. This language is the union of E_TM
and the set of strings that are not of the form ⟨M⟩ where M is a Turing machine.
Therefore,

Corollary 11.18 The language E_TM is not recognizable.

Exercises
11.5.1. Show that the union of a recognizable language and a decidable language is always recognizable. Show that the same holds for intersection.

11.5.2. Let NACCEPTS_ε_TM be the language of strings of the form ⟨M⟩ where
M is a Turing machine that does not accept the empty string. Show that
NACCEPTS_ε_TM is not recognizable.

11.5.3. Show that the class of recognizable languages is closed under union,
intersection, concatenation and the star operation.


Index

ε, 24
ε-transitions, 43
alphabet, 17
ambiguous grammar, 132
Chomsky Normal Form, 156
Church-Turing Thesis, 165
computation tree, 55
concatenation, 36
context-free grammar, 123, 126
context-free language, 126
    deterministic, 157
DCFL, 157
DFA, 17
DPDA, 157
empty string, 21, 24
extension, of a set of states in an NFA, 62
finite automaton, 5, 9
    formal definition, 16
    formal description, 18
inherently ambiguous language, 132
Kleene closure, 75
language, 19
leftmost derivation, 158
model of computation, 5
neutral input symbol, 24
nondeterministic finite automaton, 41
parse tree, 130
PDA, 154
Pumping Lemma, 117
pushdown automaton, 154
    deterministic, 157
reduction, 185
regular expression, 82
regular expressions, 79
regular language, 19
star, 75
string, 18
Turing machine, 7, 159
    formal definition, 162
    formal description, 165
    high-level descriptions, 173
    implementation-level description, 161
    universal, 196