Commit 25211f5

Added more information on the differences between the htmllib and HTMLParser
modules.
1 parent 5fe2c13 commit 25211f5

3 files changed

Lines changed: 16 additions & 3 deletions

Doc/lib/libhtmllib.tex

Lines changed: 6 additions & 0 deletions
@@ -70,6 +70,12 @@ \section{\module{htmllib} ---
 
 
 \begin{seealso}
+\seemodule{HTMLParser}{Alternate HTML parser that offers a slightly
+lower-level view of the input, but is
+designed to work with XHTML, and does not
+implement some of the SGML syntax not used in
+``HTML as deployed'' and which isn't legal
+for XHTML.}
 \seemodule{htmlentitydefs}{Definition of replacement text for HTML
 2.0 entities.}
 \seemodule{sgmllib}{Base class for \class{HTMLParser}.}

Doc/lib/libhtmlparser.tex

Lines changed: 7 additions & 1 deletion
@@ -6,7 +6,9 @@ \section{\module{HTMLParser} ---
 
 This module defines a class \class{HTMLParser} which serves as the
 basis for parsing text files formatted in HTML\index{HTML} (HyperText
-Mark-up Language) and XHTML.\index{XHTML}
+Mark-up Language) and XHTML.\index{XHTML} Unlike the parser in
+\refmodule{htmllib}, this parser is not based on the SGML parser in
+\refmodule{sgmllib}.
 
 
 \begin{classdesc}{HTMLParser}{}
@@ -15,6 +17,10 @@ \section{\module{HTMLParser} ---
 An HTMLParser instance is fed HTML data and calls handler functions
 when tags begin and end. The \class{HTMLParser} class is meant to be
 overridden by the user to provide a desired behavior.
+
+Unlike the parser in \refmodule{htmllib}, this parser does not check
+that end tags match start tags or call the end-tag handler for
+elements which are closed implicitly by closing an outer element.
 \end{classdesc}
 
 
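
The paragraph added to the \class{HTMLParser} description can be illustrated with a short sketch. It is not part of the commit. It assumes Python 3, where the parser documented here is available as `html.parser.HTMLParser`; the names `EventRecorder` and `record` are hypothetical helpers introduced only for this illustration.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records start-tag and end-tag handler calls in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def record(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# The <li> elements are never closed explicitly; closing <ul> does not
# trigger an end-tag handler call for them.
implicit = record("<ul><li>one<li>two</ul>")

# The end tags arrive in the wrong order; the parser reports them as
# written and does not complain about the mismatch.
mismatched = record("<b><i>text</b></i>")

print(implicit)
print(mismatched)
```

In this sketch the first list contains no `("end", "li")` entry, and the second simply mirrors the order of the tags in the input. Code that needs balanced elements has to track the open-tag stack itself.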

Doc/lib/libsgmllib.tex

Lines changed: 3 additions & 2 deletions
@@ -10,8 +10,9 @@ \section{\module{sgmllib} ---
 basis for parsing text files formatted in SGML (Standard Generalized
 Mark-up Language). In fact, it does not provide a full SGML parser
 --- it only parses SGML insofar as it is used by HTML, and the module
-only exists as a base for the \refmodule{htmllib}\refstmodindex{htmllib}
-module.
+only exists as a base for the \refmodule{htmllib} module. Another
+HTML parser which supports XHTML and offers a somewhat different
+interface is available in the \refmodule{HTMLParser} module.
 
 
 \begin{classdesc}{SGMLParser}{}
