Regular
Expression
Lecturer: Jay A. Abaleta
Introduction to
Regular Expressions
A regular expression (regex or RE) is a
sequence of characters that defines a
search pattern.
In automata theory, regular expressions
are used to describe regular languages,
which can be recognized by finite
automata (DFA/NFA).
What is It is also used to match character
combinations in strings. String
Regular searching algorithm used this pattern to
find the operations on string.
Expression?
Relationship
Between Regular Languages: Regular
expressions describe regular
Automata languages, which can be recognized
by deterministic finite automata
and Regular (DFA) and non-deterministic finite
automata (NFA).
Expressions
Equivalence: Every regular
expression has an equivalent finite
automaton and vice versa. They are
two different representations of the
same set of regular languages.
Components of
Regular
Expressions
Basic Symbols
∅ (empty set): Denotes an empty language, i.e.,
a language with no strings.
ε (epsilon): Represents the empty string, i.e., a
string with no characters.
Alphabet symbols: Any individual symbol from
the alphabet (e.g., a, b, 0, 1).
Operators
Union ( + or ∨ ): Combines two regular expressions,
meaning strings can match either expression.
Example: a+ba + ba+b matches either "a" or "b".
Concatenation: Combines two expressions such that
the resulting string is the concatenation of the two.
Example: ababab matches the string "ab".
Kleene Star ( * ): Denotes zero or more repetitions of
the preceding expression.
Example: a∗a^*a∗ matches "", "a", "aa", "aaa", etc.
Precedence of
Operators:
1. Kleene closure: If L is a regular
language, then its kleene closure L1* will
also be a regular language.
Example:
L* = Zero or more occurrence of language
L.
Precedence of
Operators:
2. Concatenation
The concatenation of two regular languages, L1 and
L2, which are represented using L1.L2 is also regular,
and which represents the set of strings that are
formed by taking any string in L1 concatenating it
with any string in L2.
• Example:
• L1 = { 0,1 } and L2 = { 00, 11} then L1.L2 = {000,
011, 100, 111}.
Precedence of
Operators:
3. Union
The union of two regular languages, L1 and L2, which
are represented using L1 ∪ L2, is also regular and which
represents the set of strings that are either in L1 or L2
or both.
Example:
L1 = (1+0).(1+0) = {00 , 10, 11, 01} and
L2 = {∈ , 100}
then L1 ∪ L2 = {∈, 00, 10, 11, 01, 100}.
Understanding the L1
The expression for L1=(1+0).(1+0) describes a concatenation of two sets.
1 + 0 means choice or union of either the symbol 1 or 0.
When you concatenate two sets like this, you take every possible string
formed by concatenating one symbol from the first set with one symbol
from the second set.
1+0={1,0}, which means the set contains both 1 and 0.
When you concatenate these two sets, {1,0} and {1,0}, the result is:
L1={00,10,11,01}
Understanding L2
L2 is explicitly defined as:
L2={ϵ,100}
Where:
ϵ represents the empty string, i.e., a string with
no characters.
"100" is just the binary string "100".
Union of L1 and L2
The union (∪) of two languages means combining all the
elements from both sets, without duplication. So, when we take
the union of L1 and L2, we combine all their elements:
L1∪L2={00,10,11,01}∪{ϵ,100}
The result of this union is:
L1∪L2={ϵ,00,10,11,01,100}
Regular Expression
Examples:
• Example 1: Single symbol
Expression: a
Language: { "a" }
• Example 2: Union
Expression: a + b
Language: { "a", "b" }
Regular Expression
Examples:
• Example 3: Concatenation
Expression: ab
Language: { "ab" }
• Example 4: Kleene Star
Expression: a*
Language: { "", "a", "aa", "aaa", ... }
Regular Expression
Examples:
• Example 5: Complex Expression
Expression: a(b + c)*
Language: { "a", "ab", "ac", "abb", "acc", "abc", ... }
Where: The string starts with an "a" followed by any
number (including zero) of "b"s and "c"s in any
combination.
Equivalences in regular expressions that help
simplify complex expressions:
r+r = r (idempotent law)
r⋅ε = r and ε⋅r = r (identity for concatenation)
Regular
r⋅∅ = ∅ and ∅⋅r = ∅ (zero element for
Expression concatenation)
Equivalenc
r∗ = ε+rr∗ (unfolding definition of Kleene star)
es
1. Language of all
strings containing
“ab”:
• Regular expression: .*ab.*
Building • Where: The expression
Regular describes any string
Expressions (denoted by .*), with "ab"
appearing anywhere in
for Simple the string.
Languages
2. Language of all
strings ending
with “01”:
Building
Regular • Regular
Expressions expression: .*01
• Where: Any string
for Simple
(denoted by .*),
Languages followed by "01".
3. Language of strings
with even number of
0’s:
• Regular expression:
Building (1*01*0)*1*
• Where: The language
Regular describes strings with pairs
Expressions of "0"s, and any number of
for Simple "1"s can occur between or
around them.
Languages