LECTURE 4
Chapter 4
Regular Expressions
IMPORTANT TERMS
Regular Expressions
Regular Languages
Finite Representations
RECURSIVE DEFINITION OF REGULAR EXPRESSIONS
Rule 1: Every letter of ∑ can be made into a
regular expression by writing it in boldface; the null
string ^ is also a regular expression.
Rule 2: If r1 and r2 are regular expressions, then so
are:
(r1)
r1r2
r1 + r2
r1*
Rule 3: Nothing else is a regular expression.
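The three rules can be mirrored as a small data structure. The following is only a sketch: the class names (Letter, Null, Concat, Union, Star) and the show helper are illustrative choices, not part of the lecture. Rule 3 corresponds to the fact that no other constructors exist.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Letter:          # Rule 1: a single letter of the alphabet
    symbol: str

@dataclass(frozen=True)
class Null:            # Rule 1: the null string, written ^ on the slides
    pass

@dataclass(frozen=True)
class Concat:          # Rule 2: r1r2
    left: object
    right: object

@dataclass(frozen=True)
class Union:           # Rule 2: r1 + r2
    left: object
    right: object

@dataclass(frozen=True)
class Star:            # Rule 2: r1*
    inner: object

def show(r):
    """Render an expression in the slide notation, adding parentheses where needed."""
    if isinstance(r, Letter):
        return r.symbol
    if isinstance(r, Null):
        return "^"
    if isinstance(r, Concat):
        return show(r.left) + show(r.right)
    if isinstance(r, Union):
        return "(" + show(r.left) + " + " + show(r.right) + ")"
    if isinstance(r, Star):
        inner = show(r.inner)
        # Letters, ^ and unions are already safe to star; anything else gets (r1).
        if isinstance(r.inner, (Letter, Null, Union)):
            return inner + "*"
        return "(" + inner + ")*"
    raise TypeError("not a regular expression")  # Rule 3: nothing else qualifies

a, b, c = Letter("a"), Letter("b"), Letter("c")
print(show(Concat(a, Star(b))))            # ab*
print(show(Concat(Union(a, c), Star(b))))  # (a + c)b*
print(show(Star(Concat(a, b))))            # (ab)*
```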
IMPORTANT DEFINITIONS
S + T = {w : w ∈ S or w ∈ T} (Union)
ST = {w = w1w2 : w1 ∈ S, w2 ∈ T} (Concatenation
or Product)
S* = S⁰ + S¹ + S² + · · · (Kleene's Closure)
S⁺ = S¹ + S² + · · · (Positive Closure)
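These four definitions translate almost directly into operations on Python sets of strings. The sketch below uses illustrative function names of its own (union, concat, power, closure_up_to). Since S* is infinite whenever S contains a non-null word, the closure is cut off at a chosen maximum power, which is an assumption made only so the result can be printed.

```python
def union(S, T):
    """S + T: every word that is in S or in T."""
    return S | T

def concat(S, T):
    """ST: every word of S followed by every word of T."""
    return {s + t for s in S for t in T}

def power(S, k):
    """S^k: concatenations of exactly k words from S; S^0 holds only the null string."""
    result = {""}
    for _ in range(k):
        result = concat(result, S)
    return result

def closure_up_to(S, max_power, positive=False):
    """Finite slice of S* (or of the positive closure): S^k for k up to max_power."""
    start = 1 if positive else 0
    words = set()
    for k in range(start, max_power + 1):
        words |= power(S, k)
    return words

S = {"a"}
T = {"b", "bb"}
print(sorted(union(S, T)))                       # ['a', 'b', 'bb']
print(sorted(concat(S, T)))                      # ['ab', 'abb']
print(sorted(closure_up_to(S, 3)))               # ['', 'a', 'aa', 'aaa']
print(sorted(closure_up_to(S, 3, positive=True)))  # ['a', 'aa', 'aaa']
```

The only difference between the two closures is whether the power 0, and with it the null string, is included.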
EXAMPLE
Suppose that we wished to describe the language
L over the alphabet ∑ = {a,b} where L = {a ab
abb abbb abbbb ... }
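R.E. = ab*
Note that (ab)* defines a different language:
(ab)* = ^ or ab or abab or ababab ...
xx* = x⁺
Each of the following expressions defines the same
language x⁺:
xx*  x⁺  xx*x*  x*xx*  x⁺x*  x*x⁺  x*x*x*xx*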
EXAMPLE
ab*a
language (ab*a) = {aa aba abba abbba abbbba
... }
EXAMPLE
The language of the expression a*b*
contains all the strings of a's and b's in which all
the a's (if any) come before all the b's (if any).
language (a*b*) = {^ a b aa ab bb aaa aab
abb bbb aaaa . . . }
Notice that ba and aba are not in this language.
Notice also that there need not be the same
number of a's and b's.
Here we should again be very careful to observe
that
a*b* ≠ (ab)*
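The difference between a*b* and (ab)* can be checked on a few words with Python's re module. This is only a sketch: the helper name in_language is illustrative, and fullmatch is used so that the whole word, not just a prefix, must fit the expression.

```python
import re

A_STAR_B_STAR = re.compile(r"a*b*")
AB_STAR = re.compile(r"(ab)*")

def in_language(pattern, word):
    """True if the entire word is described by the pattern."""
    return pattern.fullmatch(word) is not None

# Columns: word, member of language(a*b*), member of language((ab)*)
for w in ["", "ab", "aab", "abab", "ba"]:
    print(repr(w), in_language(A_STAR_B_STAR, w), in_language(AB_STAR, w))
```

The word aab belongs only to language(a*b*), abab belongs only to language((ab)*), and ba belongs to neither.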
EXAMPLE
∑ = {a, b}
language L of all words starting and ending with
b
L = {b, bb, bab, bbb, baab, babb, bbab, bbbb, . . .}
= language(b + b(a + b)*b)
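One way to gain confidence in an expression like b + b(a + b)*b is to compare it with the plain-English description on every short word. The sketch below assumes the slide's + (union) is written | in Python's re syntax; the helper names and the length bound of 6 are arbitrary illustrative choices.

```python
import re
from itertools import product

# b + b(a + b)*b, with union written as "|" for Python's re module
PATTERN = re.compile(r"b|b(a|b)*b")

def all_words(alphabet, max_len):
    """Yield every word over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def starts_and_ends_with_b(word):
    """The plain-English description of L."""
    return word.startswith("b") and word.endswith("b")

mismatches = [w for w in all_words("ab", 6)
              if (PATTERN.fullmatch(w) is not None) != starts_and_ends_with_b(w)]
print(mismatches)  # [] means the expression and the description agree up to length 6
```

The separate b term is needed because b(a + b)*b alone requires at least two letters and would miss the one-letter word b.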
EXAMPLE
∑ = {a, b}
language L of all words with exactly two b’s
L = language(a*ba*ba*)
EXAMPLE:
∑ = {a, b}
language L of all words with at least two b’s
L = language((a + b)*b(a + b)*b(a + b)*)
Note that bbaaba ∈ L since
bbaaba = (^)b(^)b(aaba) = (b)b(aa)b(a)
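The expressions for "exactly two b's" and "at least two b's" can be compared against a simple count of b's. This is a sketch under the assumption that + (union) is written | in Python's re syntax; the names all_words and agrees are illustrative.

```python
import re
from itertools import product

EXACTLY_TWO = re.compile(r"a*ba*ba*")
AT_LEAST_TWO = re.compile(r"(a|b)*b(a|b)*b(a|b)*")

def all_words(alphabet, max_len):
    """Yield every word over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def agrees(pattern, predicate, max_len=6):
    """True if the pattern and the predicate accept exactly the same short words."""
    return all((pattern.fullmatch(w) is not None) == predicate(w)
               for w in all_words("ab", max_len))

print(agrees(EXACTLY_TWO, lambda w: w.count("b") == 2))
print(agrees(AT_LEAST_TWO, lambda w: w.count("b") >= 2))
print(AT_LEAST_TWO.fullmatch("bbaaba") is not None)
print(EXACTLY_TWO.fullmatch("bbaaba") is not None)
```

The word bbaaba has three b's, so it fits the second expression but not the first.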
EXAMPLE
Consider the finite language
L = {aba, abba, bbaab}
Then a regular expression to define L is
aba + abba + bbaab
EXAMPLE
The following expressions both define the
language L2 = {x xxx xxxxx ...} of all strings with an
odd number of x's:
x(xx)* or (xx)*x
but the expression x*xx* does not, since it
includes the word (xx) x (x) = xxxx, which has an
even number of x's.
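A quick check over strings of x's shows which expressions keep only the odd lengths. The variable names below are illustrative, and the range of lengths tested is arbitrary.

```python
import re

ODD_LEFT = re.compile(r"x(xx)*")
ODD_RIGHT = re.compile(r"(xx)*x")
TOO_LOOSE = re.compile(r"x*xx*")

# Columns: length, x(xx)*, (xx)*x, x*xx*
for n in range(6):
    word = "x" * n
    print(n,
          ODD_LEFT.fullmatch(word) is not None,
          ODD_RIGHT.fullmatch(word) is not None,
          TOO_LOOSE.fullmatch(word) is not None)
```

The first two columns are true exactly for odd n, while x*xx* accepts every n of at least 1, including the even lengths.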
ANOTHER USE OF THE PLUS (+) SIGN
By the expression x + y where x and y are strings
of characters from an alphabet, we mean "either
x or y".
EXAMPLE
Consider the language T defined over the
alphabet ∑ = {a, b, c}
T = {a c ab cb abb cbb abbb cbbb abbbb
cbbbb ... }
All the words in T begin with an a or a c and then
are followed by some number of b's. Symbolically,
we may write this as
T = language ((a + c)b*) = language (either a
or c, then some b's)
EXAMPLE
Now let us consider a finite language L that
contains all the strings of a's and b's of length
exactly three.
L = {aaa aab aba abb baa bab bba bbb}
The first letter of each word in L is either an a or
a b. The second letter of each word in L is either
an a or a b. The third letter of each word in L is
either an a or a b. So we may write
L = language ((a + b)(a + b)(a + b))
or for short, L = language ((a + b)³)
EXAMPLE CONT…
If we want to define the set of all seven letter
strings of a's and b's, we could write (a + b)⁷. In
general, if we want to refer to the set of all
possible strings of a's and b's of any length
whatsoever, we could write
(a + b)*
This is the set of all possible strings of letters
from the alphabet ∑ = {a, b}
(a + b)* = {^ a b aa ab ba bb aaa aab ...}
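The language of (a + b) repeated n times is simply every way of choosing n letters in sequence, which itertools.product enumerates directly. The function name words_of_length is an illustrative choice.

```python
from itertools import product

def words_of_length(alphabet, n):
    """All words of length exactly n: one choice from the alphabet per position."""
    return ["".join(letters) for letters in product(alphabet, repeat=n)]

print(words_of_length("ab", 3))
# ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']

print(len(words_of_length("ab", 7)))  # 2**7 = 128 seven-letter words
```

Taking the union of these lists over every n = 0, 1, 2, ... gives (a + b)*, which is why that set is infinite while each fixed-length language is finite.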
EVEN-EVEN LANGUAGE
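EVEN-EVEN is the language of all words with an even
number of a's and an even number of b's.
EVEN-EVEN = {^ aa bb aaaa aabb abab abba
baab baba bbaa bbbb aaaabb aaabab ... }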
E = [aa + bb + (ab+ba)(aa+bb)*(ab+ba)]*
Type 1 = aa
Type 2 = bb
Type 3 = (ab + ba)(aa + bb)*(ab + ba)
E = [Type 1 + Type 2 + Type 3]*
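The expression E can be tested against the direct description "an even number of a's and an even number of b's" on all short words. This sketch assumes + (union) is written | and the square brackets become ordinary parentheses in Python's re syntax; the helper names and the length bound of 8 are illustrative.

```python
import re
from itertools import product

TYPE1 = r"aa"
TYPE2 = r"bb"
TYPE3 = r"(ab|ba)(aa|bb)*(ab|ba)"
EVEN_EVEN = re.compile(r"(" + TYPE1 + r"|" + TYPE2 + r"|" + TYPE3 + r")*")

def all_words(alphabet, max_len):
    """Yield every word over the alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def is_even_even(word):
    """Direct description: both letter counts are even."""
    return word.count("a") % 2 == 0 and word.count("b") % 2 == 0

mismatches = [w for w in all_words("ab", 8)
              if (EVEN_EVEN.fullmatch(w) is not None) != is_even_even(w)]
print(mismatches)  # [] means E and the description agree up to length 8
```

A Type 3 block such as abba or abaaba starts with one unmatched pair (ab or ba), passes through balanced pairs, and ends with a second unmatched pair that restores both counts to even.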
EXAMPLE
Regular expression r = (aa)*(bb)*b
L(r) = {a²ⁿ b²ᵐ b : n, m ≥ 0}
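Reading the expression as (aa)*(bb)*b, its language is an even number of a's followed by an odd number of b's. The sketch below builds words of that shape and checks them against the expression; the helper name word_for and the small ranges of n and m are illustrative.

```python
import re

R = re.compile(r"(aa)*(bb)*b")

def word_for(n, m):
    """The word with 2n a's, then 2m b's, then one final b."""
    return "a" * (2 * n) + "b" * (2 * m) + "b"

# Every word of the set-builder form is matched by r.
print(all(R.fullmatch(word_for(n, m)) is not None
          for n in range(4) for m in range(4)))

# Words outside the set are rejected: odd number of a's, or even number of b's.
for w in ["ab", "aabb", "", "ba"]:
    print(repr(w), R.fullmatch(w) is not None)
```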