@@ -113,8 +113,9 @@ \subsection{Regular Expression Syntax}
113113sequence isn't recognized by Python's parser, the backslash and
114114subsequent character are included in the resulting string. However,
115115if Python would recognize the resulting sequence, the backslash should
116- be repeated twice. This is complicated and hard to understand, so
117- it's highly recommended that you use raw strings for all but the simplest expressions.
116+ be repeated twice. This is complicated and hard to understand, so
117+ it's highly recommended that you use raw strings for all but the
118+ simplest expressions.
118119%
119120\item [\code {[]}] Used to indicate a set of characters. Characters can
120121be listed individually, or a range of characters can be indicated by
@@ -149,12 +150,13 @@ \subsection{Regular Expression Syntax}
149150determines what the meaning and further syntax of the construct is.
150151Following are the currently supported extensions.
151152%
152- \item [\code {(?iLmsx)}] (One or more letters from the set 'i' , 'L' , 'm' , 's' ,
153- 'x' .) The group matches the empty string; the letters set the
154- corresponding flags (re.I, re.L, re.M, re.S, re.X) for the entire regular
155- expression. This is useful if you wish include the flags as part of
156- the regular expression, instead of passing a \var {flag} argument to
157- the \code {compile} function.
153+ \item [\code {(?iLmsx)}] (One or more letters from the set '\code{i}' ,
154+ '\code{L}' , '\code{m}' , '\code{s}' , '\code{x}' .) The group matches
155+ the empty string; the letters set the corresponding flags
156+ (\code {re.I}, \code {re.L}, \code {re.M}, \code {re.S}, \code {re.X}) for
157+ the entire regular expression. This is useful if you wish to include the
158+ flags as part of the regular expression, instead of passing a
159+ \var {flag} argument to the \code {compile()} function.
158160%
159161\item [\code {(?:...)}] A non-grouping version of regular parentheses.
160162Matches whatever's inside the parentheses, but the text matched by the
@@ -171,19 +173,24 @@ \subsection{Regular Expression Syntax}
171173For example, if the pattern is
172174\code {(?P<id>[a-zA-Z_]\e w*)}, the group can be referenced by its
173175name in arguments to methods of match objects, such as \code {m.group('id')}
174- or \code {m.end('id')}, and also by name in pattern text (e.g. \code {(?P=id)}) and
175- replacement text (e.g. \code {\e g<id>}).
176+ or \code {m.end('id')}, and also by name in pattern text
177+ (e.g. \code {(?P=id)}) and replacement text (e.g. \code {\e g<id>}).
176178%
177- \item [\code {(?P=\var {name})}] Matches whatever text was matched by the earlier group named \var {name}.
179+ \item [\code {(?P=\var {name})}] Matches whatever text was matched by the
180+ earlier group named \var {name}.
178181%
179- \item [\code {(?\# ...)}] A comment; the contents of the parentheses are simply ignored.
182+ \item [\code {(?\# ...)}] A comment; the contents of the parentheses are
183+ simply ignored.
180184%
181- \item [\code {(?=...)}] Matches if \code {...} matches next, but doesn't consume any of the string. This is called a lookahead assertion. For example,
182- \code {Isaac (?=Asimov)} will match 'Isaac~' only if it's followed by 'Asimov' .
185+ \item [\code {(?=...)}] Matches if \code {...} matches next, but doesn't
186+ consume any of the string. This is called a lookahead assertion. For
187+ example, \code {Isaac (?=Asimov)} will match 'Isaac~' only if it's
188+ followed by 'Asimov' .
183189%
184- \item [\code {(?!...)}] Matches if \code {...} doesn't match next. This is a negative lookahead assertion. For example,
185- For example,
186- \code {Isaac (?!Asimov)} will match 'Isaac~' only if it's \emph {not } followed by 'Asimov' .
190+ \item [\code {(?!...)}] Matches if \code {...} doesn't match next. This
191+ is a negative lookahead assertion. For example,
192+ \code {Isaac (?!Asimov)} will match 'Isaac~' only if it's \emph {not }
193+ followed by 'Asimov' .
187194
188195\end {itemize }
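
The extension constructs listed above can be tried out with a short sketch like the following. The sample strings and variable names are made up for illustration, and the calls assume a current Python interpreter rather than the release this text was written for.

```python
import re

# Named group: capture an identifier and retrieve it by name.
m = re.match(r"(?P<id>[a-zA-Z_]\w*)", "spam_1 = 42")
ident = m.group("id")          # same text as m.group(1)
ident_end = m.end("id")        # index just past the identifier

# (?P=name) backreference: the same word repeated after whitespace.
doubled = re.search(r"(?P<word>\w+)\s+(?P=word)", "it is is here")

# \g<name> in replacement text: wrap each identifier in brackets.
wrapped = re.sub(r"(?P<id>[a-zA-Z_]\w*)", r"[\g<id>]", "x+y")

# Lookahead assertions test what follows without consuming it.
pos = re.match(r"Isaac (?=Asimov)", "Isaac Asimov")
neg = re.match(r"Isaac (?!Asimov)", "Isaac Asimov")
neg_ok = re.match(r"Isaac (?!Asimov)", "Isaac Newton")

# (?i) at the start of the pattern sets the IGNORECASE flag.
ci = re.match(r"(?i)isaac", "ISAAC")
```

In both lookahead cases the matched text is only \code{'Isaac '}; the surname is inspected but is not part of the match.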
189196
@@ -227,15 +234,16 @@ \subsection{Regular Expression Syntax}
227234\item [\code {\e S}]Matches any non-whitespace character; this is
228235equivalent to the set \code {[\^ \e t\e n\e r\e f\e v]}.
229236%
230- \item [\code {\e w}]When the LOCALE flag is not specified, matches any alphanumeric character; this is
231- equivalent to the set \code {[a-zA-Z0-9_]}. With LOCALE, it will match
232- the set \code {[0-9_]} plus whatever characters are defined as letters
233- for the current locale.
237+ \item [\code {\e w}]When the \code {LOCALE} flag is not specified,
238+ matches any alphanumeric character; this is equivalent to the set
239+ \code {[a-zA-Z0-9_]}. With \code {LOCALE}, it will match the set
240+ \code {[0-9_]} plus whatever characters are defined as letters for the
241+ current locale.
234242%
235- \item [\code {\e W}]When the LOCALE flag is not specified, matches any
236- non-alphanumeric character; this is equivalent to the set
237- \code {[{\^ }a-zA-Z0-9_]}. With LOCALE, it will match any character
238- not in the set \code {[0-9_]}, and not defined as a letter
243+ \item [\code {\e W}]When the \code { LOCALE} flag is not specified,
244+ matches any non-alphanumeric character; this is equivalent to the set
245+ \code {[{\^ }a-zA-Z0-9_]}. With \code { LOCALE} , it will match any
246+ character not in the set \code {[0-9_]}, and not defined as a letter
239247for the current locale.
240248
241249\item [\code {\e Z}]Matches only at the end of the string.
@@ -254,8 +262,8 @@ \subsection{Module Contents}
254262
255263\begin {funcdesc }{compile}{pattern\optional {\, flags}}
256264 Compile a regular expression pattern into a regular expression
257- object, which can be used for matching using its \code {match} and
258- \code {search} methods, described below.
265+ object, which can be used for matching using its \code {match() } and
266+ \code {search() } methods, described below.
259267
260268 The expression's behaviour can be modified by specifying a
261269 \var {flags} value. Values can be any of the following variables,
@@ -266,34 +274,34 @@ \subsection{Module Contents}
266274% The use of \quad in the item labels is ugly but adds enough space
267275% to the label that it doesn't get visually run-in with the text.
268276
269- \item [I or IGNORECASE or \code {(?i)}\quad ]
277+ \item [\code {I} or \code { IGNORECASE} or \code {(?i)}\quad ]
270278
271279Perform case-insensitive matching; expressions like \code {[A-Z]} will match
272280lowercase letters, too. This is not affected by the current locale.
273281
274- \item [L or LOCALE or \code {(?L)}\quad ]
282+ \item [\code {L} or \code { LOCALE} or \code {(?L)}\quad ]
275283
276284Make \code {\e w}, \code {\e W}, \code {\e b}, and
277285\code {\e B} dependent on the current locale.
278286
279- \item [M or MULTILINE or \code {(?m)}\quad ]
287+ \item [\code {M} or \code { MULTILINE} or \code {(?m)}\quad ]
280288
281289When specified, the pattern character \code {\^ } matches at the
282- beginning of the string and at the beginning of each line
283- (immediately following each newline); and the pattern character
290+ beginning of the string and at the beginning of each line
291+ (immediately following each newline); and the pattern character
284292\code {\$ } matches at the end of the string and at the end of each line
285293(immediately preceding each newline).
286294By default, \code {\^ } matches only at the beginning of the string, and
287295\code {\$ } only at the end of the string and immediately before the
288296newline (if any) at the end of the string.
289297
290- \item [S or DOTALL or \code {(?s)}\quad ]
298+ \item [\code {S} or \code { DOTALL} or \code {(?s)}\quad ]
291299
292300Make the \code {.} special character match any character at all, including a
293301newline; without this flag, \code {.} will match anything \emph {except }
294302a newline.
295303
296- \item [X or VERBOSE or \code {(?x)}\quad ]
304+ \item [\code {X} or \code { VERBOSE} or \code {(?x)}\quad ]
297305
298306Ignore whitespace within the pattern
299307except when in a character class or preceded by an unescaped
@@ -311,11 +319,11 @@ \subsection{Module Contents}
311319\end {verbatim }\ecode
312320%
313321is equivalent to
314- %
315- \bcode \ begin {verbatim }
322+
323+ \begin {verbatim }
316324result = re.match(pat, str)
317- \end {verbatim }\ecode
318- %
325+ \end {verbatim }
326+
319327but the version using \code {compile()} is more efficient when the
320328expression will be used several times in a single program.
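
As a rough illustration of the equivalence just described, and of passing flags either as an argument or embedded in the pattern, here is a small sketch. The pattern and sample strings are invented for the example, and it assumes a current Python interpreter.

```python
import re

pat = r"(\w+)@(\w+)"
text = "user@example"

# Two-step form: compile once, then call the object's match() method.
prog = re.compile(pat)
result_compiled = prog.match(text)

# One-step form: the module-level function takes the pattern directly.
result_direct = re.match(pat, text)

# A flag can be passed to compile() or embedded in the pattern.
lines = "first\nsecond"
no_flag = re.search(r"^second", lines)           # ^ only at string start
multiline = re.compile(r"^second", re.M).search(lines)
embedded = re.compile(r"(?m)^second").search(lines)

# DOTALL lets '.' match a newline as well.
dot_plain = re.match(r"a.b", "a\nb")
dot_all = re.match(r"a.b", "a\nb", re.S)
```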
321329% (The compiled version of the last pattern passed to \code{regex.match()} or
@@ -340,7 +348,8 @@ \subsection{Module Contents}
340348
341349\begin {funcdesc }{search}{pattern\, string\optional {\, flags}}
342350 Scan through \var {string} looking for a location where the regular
343- expression \var {pattern} produces a match, and return a corresponding \code {MatchObject} instance.
351+ expression \var {pattern} produces a match, and return a
352+ corresponding \code {MatchObject} instance.
344353 Return \code {None} if no
345354 position in the string matches the pattern; note that this is
346355 different from finding a zero-length match at some point in the string.
@@ -390,11 +399,11 @@ \subsection{Module Contents}
390399regex object; if you need to specify
391400regular expression flags, you must use a regex object, or use
392401embedded modifiers in a pattern; e.g.
393- %
394- \bcode \ begin {verbatim }
402+
403+ \begin {verbatim }
395404sub("(?i)b+", "x", "bbbb BBBB")   # returns 'x x'
396- \end {verbatim }\ecode
397- %
405+ \end {verbatim }
406+
398407The optional argument \var {count} is the maximum number of pattern
399408occurrences to be replaced; count must be a non-negative integer, and
400409the default value of 0 means to replace all occurrences.
@@ -405,7 +414,7 @@ \subsection{Module Contents}
405414
406415\begin {funcdesc }{subn}{pattern\, repl\, string\optional {, count=0}}
407416Perform the same operation as \code {sub()}, but return a tuple
408- \code {(new_string, number_of_subs_made)}.
417+ \code {(\var { new_string}, \var { number_of_subs_made} )}.
409418\end {funcdesc }
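
The behaviour of \var{count} and the tuple returned by \code{subn()} can be checked with a sketch along these lines. It reuses the \code{"(?i)b+"} pattern from the text; the variable names are only illustrative.

```python
import re

# The embedded (?i) modifier makes a string pattern case-insensitive.
all_replaced = re.sub("(?i)b+", "x", "bbbb BBBB")

# count limits the number of replacements; 0 (the default) means all.
first_only = re.sub("(?i)b+", "x", "bbbb BBBB", count=1)

# subn() returns the new string together with the substitution count.
new_string, number_of_subs_made = re.subn("(?i)b+", "x", "bbbb BBBB")
```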
410419
411420\begin {excdesc }{error}
@@ -445,19 +454,19 @@ \subsection{Regular Expression Objects}
445454 different from finding a zero-length match at some point in the string.
446455
447456 The optional \var {pos} and \var {endpos} parameters have the same
448- meaning as for the \code {match} method.
457+ meaning as for the \code {match() } method.
449458\end {funcdesc }
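
A small sketch of how \var{pos} and \var{endpos} restrict the part of the string that a compiled pattern examines. The pattern and sample string are made up, and the positional call form assumes a current Python interpreter.

```python
import re

prog = re.compile("b+")
s = "abbb cbb"

# No limits: the first run of b's in the whole string is found.
m_all = prog.search(s)

# pos=4: the search starts at index 4, so the first run is skipped.
m_from = prog.search(s, 4)

# pos=0, endpos=2: only the slice "ab" is visible to the engine.
m_window = prog.search(s, 0, 2)
```

The values passed in are also recorded on the resulting match objects as their \code{pos} and \code{endpos} attributes.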
450459
451460\begin {funcdesc }{split}{string\optional {, maxsplit=0}}
452- Identical to the \code {split} function, using the compiled pattern.
461+ Identical to the \code {split() } function, using the compiled pattern.
453462\end {funcdesc }
454463
455464\begin {funcdesc }{sub}{repl\, string\optional {, count=0}}
456- Identical to the \code {sub} function, using the compiled pattern.
465+ Identical to the \code {sub() } function, using the compiled pattern.
457466\end {funcdesc }
458467
459468\begin {funcdesc }{subn}{repl\, string\optional {, count=0}}
460- Identical to the \code {subn} function, using the compiled pattern.
469+ Identical to the \code {subn() } function, using the compiled pattern.
461470\end {funcdesc }
462471
463472\renewcommand {\indexsubitem }{(regex attribute)}
@@ -477,8 +486,9 @@ \subsection{Regular Expression Objects}
477486The pattern string from which the regex object was compiled.
478487\end {datadesc }
479488
480- \subsection {MatchObjects }
481- \code {Matchobject} instances support the following methods and attributes:
489+ \subsection {Match Objects }
490+
491+ \code {MatchObject} instances support the following methods and attributes:
482492
483493\begin {funcdesc }{group}{\optional {g1, g2, ...}}
484494Returns one or more groups of the match. If there is a single
@@ -495,12 +505,13 @@ \subsection{MatchObjects}
495505their group name.
496506
497507A moderately complicated example:
498- \bcode \begin {verbatim }
508+
509+ \begin {verbatim }
499510m = re.match(r"(?P<int>\d+)\.(\d*)", '3.14')
500- \end {verbatim }\ecode
501- %
502- After performing this match, \code {m.group(1)} is \code {'3'}, as is \code {m.group('int')}.
503- \code {m.group(2)} is \code {'14'}.
511+ \end {verbatim }
512+
513+ After performing this match, \code {m.group(1)} is \code {'3'}, as is
514+ \code {m.group('int')}. \code {m.group( 2)} is \code {'14'}.
504515\end {funcdesc }
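
The different ways of calling \code{group()} on the match from the example above can be sketched as follows; the variable names are only illustrative.

```python
import re

m = re.match(r"(?P<int>\d+)\.(\d*)", "3.14")

whole = m.group(0)            # the entire matched text
int_part = m.group("int")     # a group looked up by name
frac_part = m.group(2)        # a group looked up by number
pair = m.group(1, 2)          # several groups at once give a tuple
```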
505516
506517\begin {funcdesc }{groups}{}
@@ -519,37 +530,41 @@ \subsection{MatchObjects}
519530Return the indices of the start and end of the substring
520531matched by \var {group}. Return \code {None} if \var {group} exists but
521532did not contribute to the match. For a match object
522- \code {m}, and a group \code {g} that did contribute to the match, the
523- substring matched by group \code {g} (equivalent to \code {m.group(g)}) is
524- \bcode \begin {verbatim }
525- m.string[m.start(g):m.end(g)]
526- \end {verbatim }\ecode
527- %
533+ \var {m}, and a group \var {g} that did contribute to the match, the
534+ substring matched by group \var {g} (equivalent to
535+ \code {\var {m}.group(\var {g})}) is
536+
537+ \begin {verbatim }
538+ m.string[m.start(g):m.end(g)]
539+ \end {verbatim }
540+
528541Note that
529542\code {m.start(\var {group})} will equal \code {m.end(\var {group})} if
530- \var {group} matched a null string. For example, after \code {m =
531- re.search('b(c?)', 'cba' )}, \code {m.start(0)} is 1, \code {m.end(0)} is
532- 2, \code {m.start(1)} and \code {m.end(1)} are both 2, and
533- \code {m.start(2)} raises an \code {IndexError} exception.
543+ \var {group} matched a null string. For example, after \code {\var {m} =
544+ re.search('b(c?)', 'cba' )}, \code {\var {m}.start(0)} is 1,
545+ \code {\var {m}.end(0)} is 2, \code {\var {m}.start(1)} and
546+ \code {\var {m}.end(1)} are both 2, and \code {\var {m}.start(2)} raises
547+ an \code {IndexError} exception.
534548
535549\end {funcdesc }
536550
537551\begin {funcdesc }{span}{group}
538- Return the 2-tuple \code {(start(\var {group}), end(\var {group}))}.
552+ For \code {MatchObject} \var {m}, return the 2-tuple
553+ \code {(\var {m}.start(\var {group}), \var {m}.end(\var {group}))}.
539554Note that if \var {group} did not contribute to the match, this is
540555\code {(None, None)}.
541556\end {funcdesc }
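
The \code{start()}, \code{end()} and \code{span()} values quoted above for \code{re.search('b(c?)', 'cba')} can be reproduced with a sketch like this one. It only exercises groups that took part in the match, plus one group number that does not exist.

```python
import re

m = re.search("b(c?)", "cba")

whole_span = (m.start(0), m.end(0))       # position of the whole match
empty_span = m.span(1)                    # group 1 matched a null string
sliced = m.string[m.start(0):m.end(0)]    # same text as m.group(0)

# Asking for a group that the pattern does not define is an error.
try:
    m.start(2)
    bad_group = "no error"
except IndexError:
    bad_group = "IndexError"
```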
542557
543558\begin {datadesc }{pos}
544559The value of \var {pos} which was passed to the
545- \code {search} or \code {match} function. This is the index into the
546- string at which the regex engine started looking for a match.
560+ \code {search() } or \code {match() } function. This is the index into
561+ the string at which the regex engine started looking for a match.
547562\end {datadesc }
548563
549564\begin {datadesc }{endpos}
550565The value of \var {endpos} which was passed to the
551- \code {search} or \code {match} function. This is the index into the
552- string beyond which the regex engine will not go.
566+ \code {search() } or \code {match() } function. This is the index into
567+ the string beyond which the regex engine will not go.
553568\end {datadesc }
554569
555570\begin {datadesc }{re}
@@ -563,9 +578,7 @@ \subsection{MatchObjects}
563578
564579\begin {seealso }
565580\seetext {Jeffrey Friedl, \emph {Mastering Regular Expressions },
566- O'Reilly. The Python material in this book dates from before the re
567- module, but it covers writing good regular expression patterns in
568- great detail.}
581+ O'Reilly. The Python material in this book dates from before the
582+ \code {re} module, but it covers writing good regular expression
583+ patterns in great detail.}
569584\end {seealso }
570-
571-