WORKING WITH REGULAR EXPRESSIONS
PROF. MARY GRACE G. VENTURA
Working with Regular Expressions
Regular Expressions are patterns that are
used for matching and manipulating strings
according to specified rules
PHP supports two types of regular expressions:
POSIX Extended (the ereg() family, deprecated in PHP 5.3 and removed in PHP 7)
Perl Compatible Regular Expressions (PCRE, the preg_ family)
PHP Programming with MySQL, 2nd Edition
Working with Regular Expressions (continued): PCRE
Pass the regular expression pattern to
preg_match() as the first argument and the
string you want to search as the second
argument
preg_match(pattern, string);
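A minimal sketch of the call above, assuming a made-up search string. The variable names ($text, $found, $missing) are illustrative only. preg_match() returns 1 when the pattern matches, 0 when it does not, and false if an error occurs.

```php
<?php
// Hypothetical sample text; any string works here.
$text = "PHP Programming with MySQL";

// The pattern is written between delimiters (here, forward slashes).
$found   = preg_match("/MySQL/", $text);  // 1: "MySQL" occurs in $text
$missing = preg_match("/Oracle/", $text); // 0: "Oracle" does not occur

if ($found === 1) {
    echo "Match found\n";
}
```

Comparing against 1 with === keeps a false error result from being mistaken for "no match".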
Writing Regular Expression Patterns
A regular expression pattern is a special text
string that describes a search pattern
Regular expression patterns consist of literal
characters and metacharacters, which are
special characters that define the pattern-
matching rules
Regular expression patterns are enclosed in
opening and closing delimiters
The most common delimiter is the
forward slash (/)
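As a rough illustration of delimiters, the sketch below writes the same pattern with two different delimiter characters. The sample string and variable names are made up. The point is only that a delimiter other than / avoids escaping slashes inside the pattern.

```php
<?php
// Hypothetical sample string containing forward slashes.
$path = "docs/php/regex";

// With / as the delimiter, every literal slash in the pattern must be escaped.
$withSlashes = preg_match("/php\/regex/", $path);

// With # as the delimiter, literal slashes need no escaping.
$withHashes = preg_match("#php/regex#", $path);

echo $withSlashes, " ", $withHashes, "\n"; // both calls find a match
```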
Matching Any Character
A period (.) in a regular expression pattern
matches any single character (except a newline, by default) at
the location of the period
preg_match() returns 1 if the string matches the
pattern and 0 if it does not
$ZIP = "015";
preg_match("/...../", $ZIP); // returns 0
$ZIP = "01562";
preg_match("/...../", $ZIP); // returns 1
Matching Characters at the Beginning or End of a String
An anchor specifies that the pattern must appear
at a particular position in a string
The ^ metacharacter anchors characters to the
beginning of a string
The $ metacharacter anchors characters to the
end of a string
$URL = "http://www.dongosselin.com";
preg_match("/^http/", $URL); // returns 1
Matching Characters at the Beginning or End of a String (continued)
To specify an anchor at the beginning of a string,
the pattern must begin with the ^ metacharacter
$URL = "http://www.dongosselin.com";
preg_match("/^http/i", $URL); // returns 1 (i makes the match case-insensitive, as eregi() did)
To specify an anchor at the end of a string, the
pattern must end with the $ metacharacter
$Identifier = "http://www.dongosselin.com";
preg_match("/com$/i", $Identifier); // returns 1
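The two anchors can also be combined. The sketch below is an illustration, not part of the original examples: it reuses the URL from the slides and adds a hypothetical ZIP code check to show that using ^ and $ together forces the entire string to match.

```php
<?php
$URL = "http://www.dongosselin.com";

$startsWithHttp = preg_match("/^http/", $URL); // anchored at the start
$endsWithCom    = preg_match("/com$/", $URL);  // anchored at the end
$startsWithCom  = preg_match("/^com/", $URL);  // "com" is not at the start

// Using both anchors forces the whole string to match the pattern.
$exactlyFive = preg_match("/^.....$/", "01562");
$tooLong     = preg_match("/^.....$/", "01562-2607");
```

Without the anchors, "/...../" would also accept the ten-character ZIP code, because any five consecutive characters satisfy it.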
Matching Special Characters
To match a metacharacter as a literal value
in a regular expression, escape the character
with a backslash
(in the following example, the last four characters in the
string must be '.com')
$Identifier = "http://www.dongosselin.com";
preg_match("/\.com$/", $Identifier); // returns 1
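To see why the backslash matters, the sketch below compares an unescaped and an escaped period against a deliberately malformed address. It also shows PHP's preg_quote() function, which adds the backslashes automatically when the literal text comes from a variable. The $suffix variable and the malformed sample string are assumptions made for illustration.

```php
<?php
$Identifier = "http://www.dongosselin.com";
$malformed  = "http://www.dongosselinxcom"; // hypothetical bad input

// Unescaped, the period matches any character, so "xcom" also passes.
$loose = preg_match("/.com$/", $malformed);

// Escaped, the period must be a literal period.
$strict = preg_match("/\.com$/", $malformed);

// preg_quote() escapes metacharacters in a literal string;
// the second argument also escapes the delimiter if it appears.
$suffix  = ".com";
$pattern = "/" . preg_quote($suffix, "/") . "$/";
$quoted  = preg_match($pattern, $Identifier);
```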
Indicating Start and End of Strings
^ indicates start of string
$ indicates end of string
Expression      Meaning
"^The"          Matches any string that starts with "The"
"of despair$"   Matches a string that ends with the substring "of despair"
"^abc$"         A string that starts and ends with "abc"; that could only be "abc" itself
"notice"        A string that has the text "notice" in it
Specifying Quantity
Metacharacters that specify the quantity of a
match are called quantifiers
Specifying Quantity (continued)
A question mark (?) quantifier specifies that
the preceding character in the pattern is
optional
(in the following example, the string must begin with
‘http’ or ‘https’)
$URL = "http://www.dongosselin.com";
preg_match("/^https?/", $URL); // returns 1
Specifying Quantity (continued)
The plus sign (+) quantifier specifies that one
or more sequential occurrences of the
preceding character match
(in the following example, the string must have at least
one character)
$Name = "Don";
preg_match("/.+/", $Name); // returns 1
Specifying Quantity (continued)
An asterisk (*) quantifier specifies that zero or
more sequential occurrences of the preceding
character match
(in the following example, the string may begin with zero or
more leading zeros, so any string matches)
$NumberString = "00125";
preg_match("/^0*/", $NumberString); // returns 1
Specifying Quantity (continued)
The { } quantifiers specify the number of times that a
character must repeat sequentially
(in the following example, the string must end with "ZIP: " followed by
exactly five characters)
preg_match("/ZIP: .{5}$/", " ZIP: 01562"); // returns 1
The { } quantifiers can also specify the quantity as a range
(in the following example, the string must end with "ZIP: " followed by
between five and ten characters)
preg_match("/(ZIP: .{5,10})$/", "ZIP: 01562-2607"); // returns 1
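The quantifiers covered so far can be compared side by side. The following is a small sketch with made-up sample strings and variable names. Anchors are added where needed so that a non-matching case is actually rejected.

```php
<?php
// ? : the preceding character is optional
$a = preg_match("/^colou?r$/", "color");  // matches: the "u" is absent
$b = preg_match("/^colou?r$/", "colour"); // matches: the "u" is present

// + : one or more of the preceding character
$c = preg_match("/^0+/", "00125"); // matches: two leading zeros
$d = preg_match("/^0+/", "125");   // no match: no leading zero

// * : zero or more of the preceding character
$e = preg_match("/^0*/", "125");   // matches: zero occurrences are allowed

// {n} : exactly n of the preceding character
$f = preg_match("/^.{5}$/", "01562"); // matches: exactly five characters
$g = preg_match("/^.{5}$/", "0156");  // no match: only four characters
```

The contrast between $d and $e is the practical difference between + and *: use + when at least one occurrence is required.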
Ranged or Bounded Repetition
You can also use bounds, which come inside
braces and indicate ranges in the number of
occurrences
Expression   Meaning
"ab{2}"      Matches a string that has an a followed by exactly two b's ("abb")
"ab{2,}"     Matches a string that has an a followed by at least two b's ("abb", "abbbb", etc.)
"ab{3,5}"    Matches a string that has an a followed by three to five b's ("abbb", "abbbb", or "abbbbb")
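The bounds in the table can be checked with a short loop. This is a sketch under one assumption: the ^ and $ anchors are added to each expression so that the whole string must match, which makes the upper bound observable. The $results array and the test strings are illustrative.

```php
<?php
$patterns = ["/^ab{2}$/", "/^ab{2,}$/", "/^ab{3,5}$/"];
$subjects = ["ab", "abb", "abbb", "abbbbbb"]; // 1, 2, 3, and 6 b's

$results = [];
foreach ($patterns as $pattern) {
    foreach ($subjects as $subject) {
        // Store 1 (match) or 0 (no match) for each combination.
        $results[$pattern][$subject] = preg_match($pattern, $subject);
        echo $pattern, " on ", $subject, ": ", $results[$pattern][$subject], "\n";
    }
}
```

Note that the bounds are written without spaces inside the braces.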
Specifying Subexpressions
When a set of characters enclosed in
parentheses is treated as a group, it is
referred to as a subexpression or subpattern
(in the example below, the 1 and the area code are
optional, but if included must be in the following format:)
1 (707) 555-1234
preg_match("/^(1 )?(\(.{3}\) )?(.{3})(-.{4})$/", "1 (707) 555-1234"); // returns 1
Subexpressions
Subexpressions are enclosed in
parentheses
Expression     Meaning
"a(bc)*"       Matches a string that has an a followed by zero or more copies of the sequence "bc" ("a", "abc", "abcbc", "abcbcbc", etc.)
"a(bc){1,5}"   Matches a string that has an a followed by one through five copies of "bc" ("abc", "abcbc", "abcbcbc", "abcbcbcbc", "abcbcbcbcbc")
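Subexpressions also let you retrieve the text each group matched, through the optional third argument of preg_match(). The sketch below is illustrative: it assumes the telephone format 1 (707) 555-1234 with a hyphen before the last four characters, and the variable names ($parts, $full, $short) are made up.

```php
<?php
$pattern = "/^(1 )?(\(.{3}\) )?(.{3})(-.{4})$/";

// Full format: every subexpression takes part in the match.
$full = preg_match($pattern, "1 (707) 555-1234", $parts);
// $parts[0] is the whole match; $parts[1] onward hold each subexpression.
echo $parts[2], "\n"; // the area code group, including its trailing space

// The optional groups may be left out entirely.
$short = preg_match($pattern, "555-1234");

// A quantifier after a subexpression repeats the whole group.
$repeat = preg_match("/^a(bc)*$/", "abcbc"); // matches: two copies of "bc"
$broken = preg_match("/^a(bc)*$/", "abcb");  // no match: incomplete final "bc"
```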
Defining Character Classes
Character classes in regular expressions treat
multiple characters as a single item
Characters enclosed with the ([])
metacharacters represent alternate characters
that are allowed in a pattern match
preg_match("/analy[sz]e/", "analyse"); // returns 1
preg_match("/analy[sz]e/", "analyze"); // returns 1
preg_match("/analy[sz]e/", "analyce"); // returns 0
Defining Character Classes (continued)
The hyphen metacharacter (-) specifies a
range of values in a character class
(the following example ensures that A, B, C, D, or F are
the only values assigned to the $LetterGrade variable)
$LetterGrade = "A";
echo preg_match("/^[A-DF]$/", $LetterGrade); // displays 1
Defining Character Classes (continued)
The ^ metacharacter (placed immediately
after the opening bracket of a character class)
specifies characters to exclude in a
pattern match
(the following example excludes the letter E and G-Z from
an acceptable pattern match in the $LetterGrade
variable)
$LetterGrade = "A";
echo preg_match("/[^EG-Z]/", $LetterGrade); // displays 1
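A brief sketch of ranges and negation in character classes, using made-up test values. It also shows why anchors matter when a class is used for validation: without them, a single allowed character anywhere in the string is enough to produce a match.

```php
<?php
// A range inside brackets: A through D, plus F.
$valid   = preg_match("/^[A-DF]$/", "B"); // matches
$invalid = preg_match("/^[A-DF]$/", "E"); // no match: E is not in the class

// A leading ^ inside the brackets negates the class.
$notDigit = preg_match("/[^0-9]/", "12a4"); // matches: "a" is not a digit
$allDigit = preg_match("/[^0-9]/", "1234"); // no match: every character is a digit

// Without anchors, one allowed character anywhere is enough to match.
$unanchored = preg_match("/[A-DF]/", "ZEBRA"); // matches because of "B" and "A"
```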
Matching Multiple Pattern Choices
The | metacharacter is used to specify an
alternate set of patterns
The | metacharacter is essentially the same as
using the OR operator to perform multiple
evaluations in a conditional expression
orange|apple
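The orange|apple pattern can be tried as follows. This is a sketch with made-up sample strings. It also illustrates, as an assumption about typical usage, that parentheses limit how far the alternation reaches.

```php
<?php
$either  = preg_match("/orange|apple/", "apple pie");    // matches "apple"
$neither = preg_match("/orange|apple/", "banana split"); // no match

// Parentheses limit the alternation to the two fruit names.
$grouped = preg_match("/^(orange|apple) juice$/", "orange juice");

// Without parentheses the alternatives are "^orange" and "apple juice$",
// so any string that merely starts with "orange" matches.
$ungrouped = preg_match("/^orange|apple juice$/", "orange soda");
```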
Pattern Modifiers
Pattern modifiers are letters placed after the
closing delimiter that change the default rules
for interpreting matches
The pattern modifier, i, indicates that the case of
the letter does not matter when searching
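The pattern modifier, m, makes the ^ and $ anchors match
at the start and end of each line in a multiline string
The pattern modifier, s, makes the . (period)
metacharacter match newline characters as well
The pattern modifier, o, evaluates the expression
only once in Perl; PHP's PCRE functions do not support it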
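The effect of the i, m, and s modifiers can be seen by running the same pattern with and without each one. The sample strings below are made up for illustration.

```php
<?php
// i: ignore case
$noI   = preg_match("/^http/", "HTTP://www.dongosselin.com");  // no match
$iFlag = preg_match("/^http/i", "HTTP://www.dongosselin.com"); // matches

$lines = "first line\nsecond line";

// m: ^ and $ also match at the start and end of each line
$noM   = preg_match("/^second/", $lines);  // no match
$mFlag = preg_match("/^second/m", $lines); // matches

// s: the period also matches a newline character
$noS   = preg_match("/first.*second/", $lines);  // no match
$sFlag = preg_match("/first.*second/s", $lines); // matches
```

Modifiers can be combined by writing them one after another, for example "/^second/im".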
Predefined Character Ranges
Also known as character classes. Character
classes specify an entire range of characters,
for example, the alphabet or an integer set
EXPRESSION    DESCRIPTION
[[:alpha:]]   Matches any alphabetic character, a through z and A through Z
[[:digit:]]   Matches any numerical digit, 0 through 9
[[:alnum:]]   Matches any alphanumeric character, a through z, A through Z, and 0 through 9
[[:space:]]   Matches any whitespace character, such as a space or tab
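These predefined classes also work inside PCRE patterns passed to preg_match(). The sketch below uses made-up sample values. The + quantifier and the anchors are added so that every character in the string must belong to the class.

```php
<?php
$digits = preg_match("/^[[:digit:]]+$/", "01562");  // matches: digits only
$mixed  = preg_match("/^[[:digit:]]+$/", "01562a"); // no match: "a" is not a digit
$alpha  = preg_match("/^[[:alpha:]]+$/", "Don");    // matches: letters only
$alnum  = preg_match("/^[[:alnum:]]+$/", "Don42");  // matches: letters and digits
$space  = preg_match("/[[:space:]]/", "Don Gosselin"); // matches: contains a space
```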