11\section {Built-in Module \sectcode {regex} }
22\label {module-regex }
3-
43\bimodindex {regex}
4+
55This module provides regular expression matching operations similar to
66those found in Emacs.
77
88\strong {Obsolescence note:}
99This module is obsolete as of Python version 1.5; it is still being
1010maintained because much existing code still uses it. All new code in
11- need of regular expressions should use the new \code {re} module, which
12- supports the more powerful and regular Perl-style regular expressions.
13- Existing code should be converted. The standard library module
14- \code {reconvert} helps in converting \code {regex} style regular
15- expressions to \code {re} style regular expressions. (For more
16- conversion help, see the URL
11+ need of regular expressions should use the new
12+ \code {re}\refstmodindex {re} module, which supports the more powerful
13+ and regular Perl-style regular expressions. Existing code should be
14+ converted. The standard library module
15+ \code {reconvert}\refstmodindex {reconvert} helps in converting
16+ \code {regex} style regular expressions to \code {re}\refstmodindex {re}
17+ style regular expressions. (For more conversion help, see the URL
1718\file {http://starship.skyport.net/crew/amk/regex/regex-to-re.html}.)
1819
1920By default the patterns are Emacs-style regular expressions
@@ -154,7 +155,8 @@ \subsection{Regular Expressions}
154155beginning or end of a word.
155156%
156157\item [\code {\e v}] Must be followed by a two digit decimal number, and
157- matches the contents of the group of the same number. The group number must be between 1 and 99, inclusive.
158+ matches the contents of the group of the same number. The group
159+ number must be between 1 and 99, inclusive.
158160%
159161\item [\code {\e w}]Matches any alphanumeric character; this is
160162equivalent to the set \code {[a-zA-Z0-9]}.
@@ -174,8 +176,8 @@ \subsection{Regular Expressions}
174176% Python they seem to be synonyms for ^$.
175177\item [\code {\e `}] Like \code {\^ }, this only matches at the start of the
176178string.
177- \item [\code {\e \e '}] Like \code {\$ }, this only matches at the end of the
178- string.
179+ \item [\code {\e \e '}] Like \code {\$ }, this only matches at the end of
180+ the string.
179181% end of buffer
180182\end {itemize }
181183
@@ -201,13 +203,13 @@ \subsection{Module Contents}
201203
202204\begin {funcdesc }{compile}{pattern\optional {\, translate}}
203205 Compile a regular expression pattern into a regular expression
204- object, which can be used for matching using its \code {match} and
205- \code {search} methods, described below. The optional argument
206+ object, which can be used for matching using its \code {match() } and
207+ \code {search() } methods, described below. The optional argument
206208 \var {translate}, if present, must be a 256-character string
207209 indicating how characters (both of the pattern and of the strings to
208- be matched) are translated before comparing them; the \code {i}-th
210+ be matched) are translated before comparing them; the \var {i}-th
209211 element of the string gives the translation for the character with
210- \ASCII {} code \code {i}. This can be used to implement
212+ \ASCII {} code \var {i}. This can be used to implement
211213 case-insensitive matching; see the \code {casefold} data item below.
212214
213215 The sequence
@@ -222,7 +224,7 @@ \subsection{Module Contents}
222224\bcode \begin {verbatim }
223225result = regex.match(pat, str)
224226\end {verbatim }\ecode
225- %
227+
226228but the version using \code {compile()} is more efficient when multiple
227229regular expressions are used concurrently in a single program. (The
228230compiled version of the last pattern passed to \code {regex.match()} or
@@ -232,24 +234,24 @@ \subsection{Module Contents}
232234\end {funcdesc }
233235
234236\begin {funcdesc }{set_syntax}{flags}
235- Set the syntax to be used by future calls to \code {compile},
236- \code {match} and \code {search}. (Already compiled expression objects
237- are not affected.) The argument is an integer which is the OR of
238- several flag bits. The return value is the previous value of
239- the syntax flags. Names for the flags are defined in the standard
240- module \code {regex_syntax}; read the file \file {regex_syntax.py} for
241- more information.
237+ Set the syntax to be used by future calls to \code {compile() },
238+ \code {match() } and \code {search() }. (Already compiled expression
239+ objects are not affected.) The argument is an integer which is the
240+ OR of several flag bits. The return value is the previous value of
241+ the syntax flags. Names for the flags are defined in the standard
242+ module \code {regex_syntax}\refstmodindex {regex_syntax}; read the
243+ file \file {regex_syntax.py} for more information.
242244\end {funcdesc }
243245
244246\begin {funcdesc }{get_syntax}{}
245247 Returns the current value of the syntax flags as an integer.
246248\end {funcdesc }
247249
248250\begin {funcdesc }{symcomp}{pattern\optional {\, translate}}
249- This is like \code {compile}, but supports symbolic group names: if a
251+ This is like \code {compile() }, but supports symbolic group names: if a
250252parenthesis-enclosed group begins with a group name in angle
251253brackets, e.g. \code {'\e (<id>[a-z][a-z0-9]*\e )'}, the group can
252- be referenced by its name in arguments to the \code {group} method of
254+ be referenced by its name in arguments to the \code {group() } method of
253255the resulting compiled regular expression object, like this:
254256\code {p.group('id')}. Group names may contain alphanumeric characters
255257and \code {'_'} only.
@@ -263,8 +265,8 @@ \subsection{Module Contents}
263265\end {excdesc }
264266
265267\begin {datadesc }{casefold}
266- A string suitable to pass as \var {translate} argument to
267- \code {compile} to map all upper case characters to their lowercase
268+ A string suitable to pass as the \var {translate} argument to
269+ \code {compile() } to map all uppercase characters to their lowercase
268270equivalents.
269271\end {datadesc }
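
As a rough illustration of what a 256-character \var{translate} string looks like, the sketch below builds a case-folding table by hand and applies it with the built-in \code{str.translate()}. The helper names \code{build_casefold_table} and \code{translated_equal} are hypothetical and are not part of \code{regex}; the code only mimics the idea of translating both sides before comparing and does not call the obsolete module.

```python
def build_casefold_table():
    """Return a 256-character string; entry i is the translation of code i."""
    return "".join(
        # Map 'A'-'Z' onto 'a'-'z'; every other character maps to itself.
        chr(code + 32) if ord("A") <= code <= ord("Z") else chr(code)
        for code in range(256)
    )


def translated_equal(left, right, table):
    """Compare two strings after translating both (hypothetical helper)."""
    return left.translate(table) == right.translate(table)


CASEFOLD = build_casefold_table()
same = translated_equal("Hello", "hELLO", CASEFOLD)
```

The table is indexed by character code, which is the same layout the \var{translate} argument is described as using.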
270272
@@ -278,7 +280,7 @@ \subsection{Module Contents}
278280 does not match the pattern (this is different from a zero-length
279281 match!).
280282
281- The optional second parameter \var {pos} gives an index in the string
283+ The optional second parameter, \var {pos}, gives an index in the string
282284 where the search is to start; it defaults to \code {0}. This is not
283285 completely equivalent to slicing the string; the \code {'\^ '} pattern
284286 character matches at the real beginning of the string and at positions
@@ -293,12 +295,12 @@ \subsection{Module Contents}
293295 match anywhere!).
294296
295297 The optional second parameter has the same meaning as for the
296- \code {match} method.
298+ \code {match() } method.
297299\end {funcdesc }
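
The difference between passing \var{pos} and slicing the string can be seen with the newer \code{re} module, whose compiled patterns accept a similar start index. This is a sketch using \code{re}, not \code{regex}, on the assumption that the behaviour described above carries over; the pattern and sample text are made up for illustration.

```python
import re

text = "ab"
caret_b = re.compile(r"^b")

# Slicing creates a new string, so '^' matches at its first character.
sliced_match = caret_b.search(text[1:])

# With a start index, '^' still refers to the real beginning of the
# string, so nothing matches when the search starts at index 1.
offset_match = caret_b.search(text, 1)

# match() itself is anchored at the given start index.
anchored = re.compile(r"b").match(text, 1)
```

Here \code{sliced_match} is a match object while \code{offset_match} is \code{None}, which is the distinction the paragraph on \var{pos} draws.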
298300
299301\begin {funcdesc }{group}{index\, index\, ...}
300- This method is only valid when the last call to the \code {match}
301- or \code {search} method found a match. It returns one or more
302+ This method is only valid when the last call to the \code {match() }
303+ or \code {search() } method found a match. It returns one or more
302304groups of the match. If there is a single \var {index} argument,
303305the result is a single string; if there are multiple arguments, the
304306result is a tuple with one item per argument. If the \var {index} is
@@ -308,8 +310,8 @@ \subsection{Module Contents}
308310groups are parenthesized using \code {{\e }(} and \code {{\e })}). If no
309311such group exists, the corresponding result is \code {None}.
310312
311- If the regular expression was compiled by \code {symcomp} instead of
312- \code {compile}, the \var {index} arguments may also be strings
313+ If the regular expression was compiled by \code {symcomp() } instead of
314+ \code {compile() }, the \var {index} arguments may also be strings
313315identifying groups by their group name.
314316\end {funcdesc }
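
For code being converted to \code{re}, group access looks broadly similar. The sketch below uses \code{re} syntax, where a named group is written \code{(?P<id>...)} rather than the \code{symcomp()} form, and where \code{group()} is called on the match object rather than on the compiled expression. The pattern and input string are invented for the example.

```python
import re

pattern = re.compile(r"(?P<id>[a-z][a-z0-9]*)=(\d+)")
m = pattern.match("x1=42")

whole = m.group(0)       # the entire matched text
single = m.group(1)      # one index gives a single string
pair = m.group(1, 2)     # several indexes give a tuple
by_name = m.group("id")  # a named group can be looked up by name
```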
315317
@@ -319,41 +321,41 @@ \subsection{Module Contents}
319321\renewcommand {\indexsubitem }{(regex attribute)}
320322
321323\begin {datadesc }{regs}
322- When the last call to the \code {match} or \code {search} method found a
323- match, this is a tuple of pairs of indices corresponding to the
324+ When the last call to the \code {match() } or \code {search() } method found a
325+ match, this is a tuple of pairs of indexes corresponding to the
324326beginning and end of all parenthesized groups in the pattern. Indices
325- are relative to the string argument passed to \code {match} or
326- \code {search}. The 0-th tuple gives the beginning and end or the
327+ are relative to the string argument passed to \code {match() } or
328+ \code {search() }. The 0-th tuple gives the beginning and end of the
327329whole pattern. When the last match or search failed, this is
328330\code {None}.
329331\end {datadesc }
330332
331333\begin {datadesc }{last}
332- When the last call to the \code {match} or \code {search} method found a
334+ When the last call to the \code {match() } or \code {search() } method found a
333335match, this is the string argument passed to that method. When the
334336last match or search failed, this is \code {None}.
335337\end {datadesc }
336338
337339\begin {datadesc }{translate}
338340This is the value of the \var {translate} argument to
339- \code {regex.compile} that created this regular expression object. If
340- the \var {translate} argument was omitted in the \code {regex.compile}
341+ \code {regex.compile() } that created this regular expression object. If
342+ the \var {translate} argument was omitted in the \code {regex.compile() }
341343call, this is \code {None}.
342344\end {datadesc }
343345
344346\begin {datadesc }{givenpat}
345- The regular expression pattern as passed to \code {compile} or
346- \code {symcomp}.
347+ The regular expression pattern as passed to \code {compile() } or
348+ \code {symcomp() }.
347349\end {datadesc }
348350
349351\begin {datadesc }{realpat}
350352The regular expression after stripping the group names for regular
351- expressions compiled with \code {symcomp}. Same as \code {givenpat}
353+ expressions compiled with \code {symcomp() }. Same as \code {givenpat}
352354otherwise.
353355\end {datadesc }
354356
355357\begin {datadesc }{groupindex}
356358A dictionary giving the mapping from symbolic group names to numerical
357- group indices for regular expressions compiled with \code {symcomp}.
359+ group indexes for regular expressions compiled with \code {symcomp() }.
358360\code {None} otherwise.
359361\end {datadesc }
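
Several of these attributes have approximate counterparts in \code{re}, which may help when converting existing code. The following is a minimal sketch using \code{re} only; the mapping to \code{regs}, \code{givenpat} and \code{groupindex} is an informal analogy, not a statement about how \code{regex} is implemented, and the pattern and text are made up.

```python
import re

source = r"(?P<key>[a-z]+)=(?P<value>\d+)"
pattern = re.compile(source)

# Roughly analogous to givenpat: the pattern text as it was passed in.
given = pattern.pattern

# Roughly analogous to groupindex: symbolic names mapped to group numbers.
name_to_index = dict(pattern.groupindex)

m = pattern.search("set depth=12 now")

# Roughly analogous to regs: (start, end) pairs, with entry 0 covering
# the whole match and the remaining entries covering each group.
spans = tuple(m.span(i) for i in range(pattern.groups + 1))
```

One difference to keep in mind: in \code{re} the match state lives on the returned match object, and a failed search returns \code{None} instead of setting an attribute on the compiled expression.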