Commit 4b73a06

Fred Drake's parser module
1 parent c1822a4 commit 4b73a06

2 files changed

Lines changed: 500 additions & 0 deletions

File tree

Doc/lib/libparser.tex

Lines changed: 250 additions & 0 deletions
@@ -0,0 +1,250 @@
% libparser.tex
%
% Introductory documentation for the new parser built-in module.
%
% Copyright 1995 Virginia Polytechnic Institute and State University
% and Fred L. Drake, Jr. This copyright notice must be distributed on
% all copies, but this document otherwise may be distributed as part
% of the Python distribution. No fee may be charged for this document
% in any representation, either on paper or electronically. This
% restriction does not affect other elements in a distributed package
% in any way.
%

\section{Built-in Module \sectcode{parser}}
\bimodindex{parser}


% ==== 2. ====
% Give a short overview of what the module does.
% If it is platform specific, mention this.
% Mention other important restrictions or general operating principles.

The \code{parser} module provides an interface to Python's internal
parser and byte-code compiler. The primary purpose for this interface
is to allow Python code to edit the parse tree of a Python expression
and create executable code from it. This is better than trying to
parse and modify an arbitrary Python code fragment as a string, because
parsing is performed in a manner identical to that used for the code
forming the application. It is also faster.

There are a few things to note about this module which are important
to making use of the data structures created. This is not a tutorial
on editing the parse trees for Python code.

Most importantly, a good understanding of the Python grammar processed
by the internal parser is required. For full information on the
language syntax, refer to the Language Reference. The parser itself
is created from a grammar specification defined in the file
\code{Grammar/Grammar} in the standard Python distribution. The parse
trees stored in the ``AST objects'' created by this module are the
actual output from the internal parser when created by the
\code{expr()} or \code{suite()} functions, described below. The AST
objects created by \code{tuple2ast()} faithfully simulate those
structures.

Each element of the tuples returned by \code{ast2tuple()} has a simple
form. Tuples representing non-terminal elements in the grammar always
have a length greater than one. The first element is an integer which
identifies a production in the grammar. These integers are given
symbolic names in the C header file \code{Include/graminit.h} and the
Python module \code{Lib/symbol.py}. Each additional element of the
tuple represents a component of the production as recognized in the
input string: these are always tuples which have the same form as the
parent. Note that keywords used to identify the parent node type, such
as the keyword \code{if} in an \emph{if\_stmt}, are included in the
node tree without any special treatment. For example, the \code{if}
keyword is represented by the tuple \code{(1, 'if')}, where \code{1}
is the numeric value associated with all \code{NAME} elements,
including variable and function names defined by the user.

Terminal elements are represented in much the same way, but without
any child elements and with the addition of the source text which was
identified. The example of the \code{if} keyword above is
representative. The various types of terminal symbols are defined in
the C header file \code{Include/token.h} and the Python module
\code{Lib/token.py}.

The AST objects are not actually required to support the functionality
of this module, but are provided for three purposes: to allow an
application to amortize the cost of processing complex parse trees, to
provide a parse tree representation which conserves memory space when
compared to the Python tuple representation, and to ease the creation
of additional modules in C which manipulate parse trees. A simple
``wrapper'' module may be created in Python if desired to hide the use
of AST objects.
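
The tuple layout described above can be sketched in plain Python. This
is only an illustration and not output produced by the module: the
tree is written by hand, the production numbers \code{300} and
\code{301} are hypothetical placeholders (the real values are those in
\code{Include/graminit.h} and \code{Lib/symbol.py}), and the helper
names are invented for this example. It assumes a current Python 3
interpreter and uses only the standard \code{token} module.

```python
import token

# Hypothetical production numbers, chosen only for illustration.
EXPR_NODE = 300
TERM_NODE = 301

# Hand-written tree in the documented shape for the input 'a + 5'.
# Non-terminal: (production number, child, child, ...)
# Terminal:     (token number, source text)
tree = (EXPR_NODE,
        (TERM_NODE, (token.NAME, 'a')),
        (token.PLUS, '+'),
        (TERM_NODE, (token.NUMBER, '5')))


def is_terminal(node):
    """A terminal is a pair whose second element is the source text."""
    return len(node) == 2 and isinstance(node[1], str)


def terminals(node):
    """Return the terminal tuples of a tree in left-to-right order."""
    if is_terminal(node):
        return [node]
    found = []
    for child in node[1:]:
        found.extend(terminals(child))
    return found


# Joining the source text of the terminals recovers the token sequence.
source_text = ' '.join(text for _, text in terminals(tree))
print(source_text)
```

The check in \code{is\_terminal()} relies only on the two properties
stated above: terminals carry their source text, and every child of a
non-terminal is itself a tuple.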


% ==== 3. ====
% List the public functions defined by the module. Begin with a
% standard phrase. You may also list the exceptions and other data
% items defined in the module, insofar as they are important for the
% user.

The \code{parser} module defines the following functions:

% ---- 3.1. ----
% Redefine the ``indexsubitem'' macro to point to this module
% (alternatively, you can put this at the top of the file):

\renewcommand{\indexsubitem}{(in module parser)}

% ---- 3.2. ----
% For each function, use a ``funcdesc'' block. This has exactly two
% parameters (each parameter is contained in a set of curly braces):
% the first parameter is the function name (this automatically
% generates an index entry); the second parameter is the function's
% argument list. If there are no arguments, use an empty pair of
% curly braces. If there is more than one argument, separate the
% arguments with backslash-comma. Optional parts of the parameter
% list are contained in \optional{...} (this generates a set of square
% brackets around its parameter). Arguments are automatically set in
% italics in the parameter list. Each argument should be mentioned at
% least once in the description; each usage (even inside \code{...})
% should be enclosed in \var{...}.

\begin{funcdesc}{ast2tuple}{ast}
This function accepts an AST object from the caller in
\code{\var{ast}} and returns a Python tuple representing the
equivalent parse tree. The resulting tuple representation can be used
for inspection or the creation of a new parse tree in tuple form.
This function does not fail so long as memory is available to build
the tuple representation.
\end{funcdesc}
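
As a rough sketch of ``the creation of a new parse tree in tuple
form'', the following builds an edited copy of a tuple tree by
replacing one terminal. The input tree is hand-written rather than
obtained from \code{ast2tuple()}, the production number \code{300} is
a hypothetical placeholder, and \code{replace\_terminal()} is a helper
invented for this example. It assumes a current Python 3 interpreter.

```python
import token

ARITH_NODE = 300  # hypothetical production number, for illustration only


def replace_terminal(node, old, new):
    """Return a copy of node in which each terminal equal to old becomes new."""
    if len(node) == 2 and isinstance(node[1], str):
        # Terminal element: (token number, source text).
        return new if node == old else node
    # Non-terminal element: keep the production number, rebuild the children.
    children = tuple(replace_terminal(child, old, new) for child in node[1:])
    return (node[0],) + children


original = (ARITH_NODE,
            (token.NAME, 'a'),
            (token.PLUS, '+'),
            (token.NUMBER, '5'))

# Tuples are immutable, so editing always produces a new tree.
edited = replace_terminal(original, (token.NUMBER, '5'), (token.NUMBER, '7'))
print(edited)
```

Because the original tuple is left untouched, both versions of the
tree remain available for comparison.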


\begin{funcdesc}{compileast}{ast\optional{\, filename \code{= '<ast>'}}}
The Python byte compiler can be invoked on an AST object to produce
code objects which can be used as part of an \code{exec} statement or
a call to the built-in \code{eval()} function. This function provides
the interface to the compiler, passing it the internal parse tree from
\code{\var{ast}} and the source file name specified by the
\code{\var{filename}} parameter. The default value supplied
for \code{\var{filename}} indicates that the source was an AST object.
\end{funcdesc}


\begin{funcdesc}{expr}{string}
The \code{expr()} function parses the parameter \code{\var{string}}
as if it were passed to the built-in \code{compile()} function with
\code{'eval'} as the final argument. If the parse succeeds, an AST
object is created to hold the internal parse tree representation;
otherwise an appropriate exception is raised.
\end{funcdesc}


\begin{funcdesc}{isexpr}{ast}
When \code{\var{ast}} represents an \code{'eval'} form, this function
returns a true value (\code{1}); otherwise it returns false
(\code{0}). This is useful, since code objects normally cannot be
queried for this information using existing built-in functions. Note
that the code objects created by \code{compileast()} cannot be queried
like this either, and are identical to those created by the built-in
\code{compile()} function.
\end{funcdesc}


\begin{funcdesc}{issuite}{ast}
This function mirrors \code{isexpr()} in that it reports whether an
AST object represents a suite of statements. It is not safe to assume
that this function is equivalent to \code{not isexpr(\var{ast})}, as
additional syntactic fragments may be supported in the future.
\end{funcdesc}


\begin{funcdesc}{suite}{string}
The \code{suite()} function parses the parameter \code{\var{string}}
as if it were passed to the built-in \code{compile()} function with
\code{'exec'} as the final argument. If the parse succeeds, an AST
object is created to hold the internal parse tree representation;
otherwise an appropriate exception is raised.
\end{funcdesc}
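
The difference between the \code{'eval'} and \code{'exec'} forms that
\code{expr()} and \code{suite()} mirror can be seen with the built-in
\code{compile()} function alone. This sketch does not use the
\code{parser} module; it assumes a current Python 3 interpreter, where
\code{exec} is a function rather than a statement, and the file name
\code{'<example>'} is an arbitrary label chosen for the example.

```python
# The 'eval' form accepts a single expression; eval() returns its value.
expr_code = compile('a + 5', '<example>', 'eval')
result = eval(expr_code, {'a': 5})

# The 'exec' form accepts a suite of statements; running it binds names.
suite_code = compile('b = a + 5\nc = b * 2\n', '<example>', 'exec')
namespace = {'a': 5}
exec(suite_code, namespace)

# An assignment is a statement, not an expression, so 'eval' rejects it.
try:
    compile('b = a + 5', '<example>', 'eval')
    rejected = False
except SyntaxError:
    rejected = True

print(result, namespace['b'], namespace['c'], rejected)
```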


\begin{funcdesc}{tuple2ast}{tuple}
This function accepts a parse tree represented as a tuple and builds
an internal representation if possible. If it can validate that the
tree conforms to the Python syntax and all nodes are valid node types
in the host version of Python, an AST object is created from the
internal representation and returned to the caller. If there is a
problem creating the internal representation, or if the tree cannot be
validated, a \code{ParserError} exception is raised. An AST object
created this way should not be assumed to compile correctly; normal
exceptions raised by compilation may still occur when the AST
object is passed to \code{compileast()}. This will normally indicate
problems not related to syntax (such as a \code{MemoryError}
exception).
\end{funcdesc}


% --- 3.4. ---
% Exceptions are described using an ``excdesc'' block. This has only
% one parameter: the exception name.

\subsection{Exceptions and Error Handling}

The parser module defines a single exception, but may also pass along
built-in exceptions raised by other portions of the Python runtime
environment. See each function for information about the exceptions
it can raise.

\begin{excdesc}{ParserError}
Exception raised when a failure occurs within the parser module. This
is generally produced for validation failures rather than the built-in
\code{SyntaxError} raised during normal parsing.
The exception argument is either a string describing the reason for
the failure or a tuple containing the tuple that caused the failure,
taken from a parse tree passed to \code{tuple2ast()}, and an
explanatory string. Calls to \code{tuple2ast()} need to be able to
handle either form of the argument, while calls to other functions in
the module will only need to be aware of the simple string values.
\end{excdesc}
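
One way to cope with the two argument forms is to normalize them
before reporting the failure. The sketch below is self-contained and
does not import the \code{parser} module: the \code{ParserError} class
is a local stand-in, \code{fake\_tuple2ast()} is a hypothetical
function that only imitates the two documented argument shapes, and it
performs none of the real validation. It assumes a current Python 3
interpreter.

```python
class ParserError(Exception):
    """Local stand-in for the module's exception, so the sketch runs alone."""


def split_error_argument(arg):
    """Return (message, offending_tree) for either documented argument form."""
    if isinstance(arg, tuple):
        offending_tree, message = arg
        return message, offending_tree
    return arg, None


def fake_tuple2ast(tree):
    """Hypothetical stand-in that only demonstrates the two argument forms."""
    if not isinstance(tree, tuple):
        # Simple string form.
        raise ParserError('parse tree must be a tuple')
    for child in tree[1:]:
        if not isinstance(child, tuple):
            # Tuple form: (tuple causing the failure, explanatory string).
            raise ParserError((tree, 'child elements must be tuples'))
    return tree


def describe_failure(tree):
    """Return (message, offending_tree), or None if no error was raised."""
    try:
        fake_tuple2ast(tree)
    except ParserError as err:
        return split_error_argument(err.args[0])
    return None


print(describe_failure('a + 5'))
print(describe_failure((300, 'oops')))
```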

Note that the functions \code{compileast()}, \code{expr()}, and
\code{suite()} may raise exceptions which are normally raised by the
parsing and compilation process. These include the built-in
exceptions \code{MemoryError}, \code{OverflowError},
\code{SyntaxError}, and \code{SystemError}. In these cases, the
exceptions carry all the meaning normally associated with them. Refer
to the descriptions of each function for detailed information.

% ---- 3.5. ----
% There is no standard block type for classes. I generally use
% ``funcdesc'' blocks, since class instantiation looks very much like
% a function call.


% ==== 4. ====
% Now is probably a good time for a complete example. (Alternatively,
% an example giving the flavor of the module may be given before the
% detailed list of functions.)

\subsection{Example}

A simple example:

\begin{verbatim}
>>> import parser
>>> ast = parser.expr('a + 5')
>>> code = parser.compileast(ast)
>>> a = 5
>>> eval(code)
10
\end{verbatim}


\subsection{AST Objects}

AST objects (returned by \code{expr()}, \code{suite()}, and
\code{tuple2ast()}, described above) have no methods of their own.
Some of the functions defined by this module which accept an AST
object as their first argument may change to object methods in the
future.

Ordered and equality comparisons are supported between AST objects.

\renewcommand{\indexsubitem}{(ast method)}

%\begin{funcdesc}{empty}{}
%Empty the can into the trash.
%\end{funcdesc}
