Commit 9385ab1

Updated chapter 11 text for clarity. Updated two Ch11-related code files because the regex did not match the chapter text/explanation.
1 parent a365114 commit 9385ab1

3 files changed: 9 additions and 8 deletions

3 files changed

+9
-8
lines changed

book3/11-regex.mkd

Lines changed: 7 additions & 6 deletions
@@ -421,7 +421,7 @@ in.
 
 While this worked, it actually results in pretty brittle code that is
 assuming the lines are nicely formatted. If you were to add enough error
-checking (or a big try/except block) to insure that your program never
+checking (or a big try/except block) to ensure that your program never
 failed when presented with incorrectly formatted lines, the code would
 balloon to 10-15 lines of code that was pretty hard to read.
 
@@ -465,10 +465,11 @@ When the program runs, it produces the following output:
 Escape character
 ----------------
 
-Since we use special characters in regular expressions to match the
-beginning or end of a line or specify wild cards, we need a way to
-indicate that these characters are "normal" and we want to match the
-actual character such as a dollar sign or caret.
+Regular expressions utilize special characters like `^` to match the
+beginning of a line, `$` for the end of a line, and `.` as a wildcard;
+however, sometimes we want to match those characters literally. We
+need a way to indicate that we want to match the actual character such
+as a caret symbol, dollar sign, or period.
 
 We can indicate that we want to simply match a character by prefixing
 that character with a backslash. For example, we can find money amounts
@@ -483,7 +484,7 @@ y = re.findall('\$[0-9.]+',x)
 Since we prefix the dollar sign with a backslash, it actually matches
 the dollar sign in the input string instead of matching the "end of
 line", and the rest of the regular expression matches one or more digits
-or the period character. *Note:* Inside square brackets,
+or the period character. Remember, as we saw above, inside square brackets,
 characters are not "special". So when we say `[0-9.]`, it really means
 digits or a period. Outside of square brackets, a period is the
 "wild-card" character and matches any character. Inside square brackets,

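The escaping behavior described in the chapter text above can be checked with a short sketch. The pattern `\$[0-9.]+` comes from the hunk header; the sample strings are assumptions chosen for illustration and are not part of this commit.

```python
import re

# Sample input chosen for illustration (an assumption, not taken from the diff).
x = 'We just received $10.00 for cookies.'

# Escaped dollar sign: a literal '$' followed by one or more digits or periods.
escaped = re.findall(r'\$[0-9.]+', x)

# Unescaped '$' is the end-of-line anchor, so nothing can follow it and match.
unescaped = re.findall(r'$[0-9.]+', x)

# Outside square brackets, '.' matches any character ...
any_char = re.findall(r'1.', 'a1b 1.5')

# ... while an escaped '\.' matches only a literal period.
literal_dot = re.findall(r'1\.', 'a1b 1.5')

print(escaped)      # ['$10.00']
print(unescaped)    # []
print(any_char)     # ['1b', '1.']
print(literal_dot)  # ['1.']
```

Inside `[0-9.]` the period needs no backslash, which is why the first pattern picks up the decimal point in the amount.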
code3/re10.py

Lines changed: 1 addition & 1 deletion
@@ -6,5 +6,5 @@
 hand = open('mbox-short.txt')
 for line in hand:
     line = line.rstrip()
-    if re.search(r'^X\S*: [0-9.]+', line):
+    if re.search(r'^X-.*: [0-9.]+', line):
         print(line)
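To see how the old and new patterns in this file differ, the following sketch runs both against a few lines. Only the first sample resembles the `mbox-short.txt` header format; the other two sample lines, and the helper name `matches`, are hypothetical and exist only to expose the difference.

```python
import re

old_pattern = r'^X\S*: [0-9.]+'   # pattern removed by this commit
new_pattern = r'^X-.*: [0-9.]+'   # pattern added by this commit

# Illustrative lines; the second and third are invented for this comparison.
samples = [
    'X-DSPAM-Confidence: 0.8475',
    'Xylophone: 3',
    'X-Plane is behind: 3',
]

def matches(pattern, lines):
    """Return the lines in which the pattern is found."""
    return [line for line in lines if re.search(pattern, line)]

old_hits = matches(old_pattern, samples)
new_hits = matches(new_pattern, samples)

print(old_hits)  # ['X-DSPAM-Confidence: 0.8475', 'Xylophone: 3']
print(new_hits)  # ['X-DSPAM-Confidence: 0.8475', 'X-Plane is behind: 3']
```

The new pattern requires a literal `X-` prefix and, because `.*` also matches spaces, accepts whitespace before the colon; the old pattern accepted any non-whitespace run after `X`.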

code3/re11.py

Lines changed: 1 addition & 1 deletion
@@ -6,6 +6,6 @@
 hand = open('mbox-short.txt')
 for line in hand:
     line = line.rstrip()
-    x = re.findall(r'^X\S*: ([0-9.]+)', line)
+    x = re.findall(r'^X-.*: ([0-9.]+)', line)
     if len(x) > 0:
         print(x)
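The only difference between this pattern and the one in `re10.py` is the pair of parentheses. A minimal sketch of their effect on `re.findall`, assuming a sample line in the `X-DSPAM-Confidence` header format (the line itself is an illustration, not read from the data file):

```python
import re

# Sample line assumed for illustration.
line = 'X-DSPAM-Confidence: 0.8475'

# Without parentheses, findall returns the entire matched text.
whole = re.findall(r'^X-.*: [0-9.]+', line)

# With parentheses, findall returns only the captured group.
number = re.findall(r'^X-.*: ([0-9.]+)', line)

# The captured text is still a string and can be converted when needed.
value = float(number[0])

print(whole)   # ['X-DSPAM-Confidence: 0.8475']
print(number)  # ['0.8475']
print(value)   # 0.8475
```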
