 \section{Standard Module \sectcode{htmllib}}
 \label{module-htmllib}
 \stmodindex{htmllib}
-\rfcindex{1866}
 \index{HTML}
 \index{hypertext}
 
@@ -12,32 +11,33 @@ \section{Standard Module \sectcode{htmllib}}
 is not directly concerned with I/O --- it must be provided with input
 in string form via a method, and makes calls to methods of a
 ``formatter'' object in order to produce output. The
-\code{HTMLParser} class is designed to be used as a base class for
+\class{HTMLParser} class is designed to be used as a base class for
 other classes in order to add functionality, and allows most of its
 methods to be extended or overridden. In turn, this class is derived
-from and extends the \code{SGMLParser} class defined in module
-\code{sgmllib}. Two implementations of formatter objects are
-provided in the \code{formatter} module; refer to the documentation
-for that module for information on the formatter interface.
+from and extends the \class{SGMLParser} class defined in module
+\module{sgmllib}\refstmodindex{sgmllib}. The \class{HTMLParser}
+implementation supports the HTML 2.0 language as described in
+\rfc{1866}. Two implementations of formatter objects are provided in
+the \module{formatter}\refstmodindex{formatter} module; refer to the
+documentation for that module for information on the formatter
+interface.
 \index{SGML}
-\refstmodindex{sgmllib}
 \ttindex{SGMLParser}
 \index{formatter}
-\refstmodindex{formatter}
 
 The following is a summary of the interface defined by
-\code{sgmllib.SGMLParser}:
+\class{sgmllib.SGMLParser}:
 
 \begin{itemize}
 
 \item
-The interface to feed data to an instance is through the \code{feed()}
+The interface to feed data to an instance is through the \method{feed()}
 method, which takes a string argument. This can be called with as
-little or as much text at a time as desired; \code{p.feed(a);
-p.feed(b)} has the same effect as \code{p.feed(a+b)}. When the data
+little or as much text at a time as desired; \samp{p.feed(a);
+p.feed(b)} has the same effect as \samp{p.feed(a+b)}. When the data
 contains complete HTML tags, these are processed immediately;
 incomplete elements are saved in a buffer. To force processing of all
-unprocessed data, call the \code{close()} method.
+unprocessed data, call the \method{close()} method.
 
 For example, to parse the entire contents of a file, use:
 \bcode\begin{verbatim}
@@ -50,13 +50,13 @@ \section{Standard Module \sectcode{htmllib}}
 a class and define methods called \code{start_\var{tag}()},
 \code{end_\var{tag}()}, or \code{do_\var{tag}()}. The parser will
 call these at appropriate moments: \code{start_\var{tag}} or
-\code{do_\var{tag}} is called when an opening tag of the form
-\code{<\var{tag} ...>} is encountered; \code{end_\var{tag}} is called
+\code{do_\var{tag}()} is called when an opening tag of the form
+\code{<\var{tag} ...>} is encountered; \code{end_\var{tag}()} is called
 when a closing tag of the form \code{</\var{tag}>} is encountered. If
 an opening tag requires a corresponding closing tag, like \code{<H1>}
-... \code{</H1>}, the class should define the \code{start_\var{tag}}
+... \code{</H1>}, the class should define the \code{start_\var{tag}()}
 method; if a tag requires no closing tag, like \code{<P>}, the class
-should define the \code{do_\var{tag}} method.
+should define the \code{do_\var{tag}()} method.
 
 \end{itemize}
 
@@ -68,10 +68,10 @@ \section{Standard Module \sectcode{htmllib}}
 handlers for all HTML 2.0 and many HTML 3.0 and 3.2 elements.
 \end{funcdesc}
 
-In addition to tag methods, the \code{HTMLParser} class provides some
+In addition to tag methods, the \class{HTMLParser} class provides some
 additional methods and instance variables for use within tag methods.
 
-\renewcommand{\indexsubitem}{(HTMLParser method)}
+\renewcommand{\indexsubitem}{(HTMLParser attribute)}
 
 \begin{datadesc}{formatter}
 This is the formatter instance associated with the parser.
@@ -82,40 +82,42 @@ \section{Standard Module \sectcode{htmllib}}
 collapsed, or false when it should be. In general, this should only
 be true when character data is to be treated as ``preformatted'' text,
 as within a \code{<PRE>} element. The default value is false. This
-affects the operation of \code{handle_data()} and \code{save_end()}.
+affects the operation of \method{handle_data()} and \method{save_end()}.
 \end{datadesc}
 
+\renewcommand{\indexsubitem}{(HTMLParser method)}
+
 \begin{funcdesc}{anchor_bgn}{href\, name\, type}
 This method is called at the start of an anchor region. The arguments
 correspond to the attributes of the \code{<A>} tag with the same
 names. The default implementation maintains a list of hyperlinks
-(defined by the \code{href} argument) within the document. The list
+(defined by the \code{href} attribute) within the document. The list
 of hyperlinks is available as the data attribute \code{anchorlist}.
 \end{funcdesc}
 
 \begin{funcdesc}{anchor_end}{}
 This method is called at the end of an anchor region. The default
 implementation adds a textual footnote marker using an index into the
-list of hyperlinks created by \code{anchor_bgn()}.
+list of hyperlinks created by \method{anchor_bgn()}.
 \end{funcdesc}
 
 \begin{funcdesc}{handle_image}{source\, alt\optional{\, ismap\optional{\, align\optional{\, width\optional{\, height}}}}}
 This method is called to handle images. The default implementation
-simply passes the \code{alt} value to the \code{handle_data()}
+simply passes the \var{alt} value to the \method{handle_data()}
 method.
 \end{funcdesc}
 
 \begin{funcdesc}{save_bgn}{}
 Begins saving character data in a buffer instead of sending it to the
-formatter object. Retrieve the stored data via \code{save_end()}
-Use of the \code{save_bgn()} / \code{save_end()} pair may not be
+formatter object. Retrieve the stored data via \method{save_end()}.
+Use of the \method{save_bgn()} / \method{save_end()} pair may not be
 nested.
 \end{funcdesc}
 
 \begin{funcdesc}{save_end}{}
 Ends buffering character data and returns all data saved since the
-preceeding call to \code{save_bgn()}. If \code{nofill} flag is false,
-whitespace is collapsed to single spaces. A call to this method
-without a preceeding call to \code{save_bgn()} will raise a
-\code{TypeError} exception.
+preceding call to \method{save_bgn()}. If the \code{nofill} flag is
+false, whitespace is collapsed to single spaces. A call to this
+method without a preceding call to \method{save_bgn()} will raise a
+\exception{TypeError} exception.
 \end{funcdesc}
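
The chunked-feed behaviour described in the documentation above (`p.feed(a); p.feed(b)` having the same effect as `p.feed(a+b)`, with incomplete tags held in a buffer until more data or `close()` arrives) can be sketched in a few lines. Note the assumptions: `htmllib` and `sgmllib` were removed in Python 3, so this sketch substitutes the standard `html.parser.HTMLParser` class, which has a different handler interface (`handle_starttag()` / `handle_endtag()` rather than `start_tag()` / `do_tag()` methods) but a comparable `feed()` / `close()` contract. The names `TagLogger` and `parse_in_chunks` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records tag events and accumulates text."""

    def __init__(self):
        super().__init__()
        self.tags = []   # ("start" | "end", tag name) in document order
        self.text = []   # pieces of character data, in document order

    def handle_starttag(self, tag, attrs):
        self.tags.append(("start", tag))

    def handle_endtag(self, tag):
        self.tags.append(("end", tag))

    def handle_data(self, data):
        # Character data may arrive in several pieces, depending on how
        # the input was split, so only the joined text is compared later.
        self.text.append(data)


def parse_in_chunks(chunks):
    """Feed each chunk in turn, then close() to flush buffered input."""
    parser = TagLogger()
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return parser.tags, "".join(parser.text)


if __name__ == "__main__":
    # One call with the whole document, versus three calls that split
    # the input inside the text and inside the closing </h1> tag.
    whole = parse_in_chunks(["<h1>Title</h1><p>Body</p>"])
    parts = parse_in_chunks(["<h1>Ti", "tle</h", "1><p>Body</p>"])
    print(whole == parts)
```

Comparing the joined text rather than the individual `handle_data()` calls is a deliberate choice: how a parser slices character data between calls is an implementation detail, while the sequence of tags and the overall text should not depend on where the input was split.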