@@ -492,7 +492,27 @@ \section{PEP 285: The \class{bool} Type\label{section-bool}}
492492% ======================================================================
493493\section{PEP 293: Codec Error Handling Callbacks}
494494
495- XXX write this section
495+ When encoding a Unicode string into a byte string, unencodable
496+ characters may be encountered. Until now, Python has allowed the
497+ error handling to be specified as ``strict'' (raise \code{UnicodeError},
498+ the default), ``ignore'' (skip the character), or ``replace'' (substitute
499+ a question mark). It can be desirable to handle the error in some
500+ other way, e.g. by inserting an XML character reference
501+ or HTML entity reference into the converted string.
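
As a minimal sketch of the three existing strategies, the snippet below encodes a sample string to ASCII under each of them. It assumes a current Python 3 interpreter, where the Unicode string is \code{str} and the byte string is \code{bytes}; the sample text and the helper name \code{try_strict} are illustrative only.

```python
# Sample text with two characters that ASCII cannot represent:
# U+00E9 (e with acute accent) and U+20AC (euro sign).
text = "caf\u00e9 \u20ac5"


def try_strict(s):
    """Encode with the default 'strict' handler; report the exception name on failure."""
    try:
        return s.encode("ascii", "strict")
    except UnicodeError as exc:
        # UnicodeEncodeError is a subclass of UnicodeError.
        return type(exc).__name__


strict_result = try_strict(text)                   # raises, so we get the class name
ignore_result = text.encode("ascii", "ignore")     # unencodable characters are dropped
replace_result = text.encode("ascii", "replace")   # each one becomes a question mark

print(strict_result)
print(ignore_result)
print(replace_result)
```

With ``ignore'' and ``replace'' the information about which characters were lost cannot be recovered from the output, which is the gap the new framework is meant to close.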
502+
503+ Python now has a flexible framework for adding further processing
504+ strategies: new error handlers can be registered with
505+ \function{codecs.register_error}. Codecs can then look up an error
506+ handler with \function{codecs.lookup_error}. An equivalent C API has been
507+ added for codecs written in C. The error handler receives various state
508+ information, such as the string being converted, the position in the
509+ string where the error was detected, and the target encoding. It can
510+ then either raise an exception or return a replacement string.
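
The following sketch shows one way a custom handler might be registered and used, again assuming a current Python 3 interpreter. The handler name ``hexmarker'' and the function \code{hex_marker} are hypothetical. The state information arrives as attributes of the exception instance passed to the handler, and the replacement string is returned together with the position at which encoding should resume.

```python
import codecs


def hex_marker(exc):
    """Hypothetical handler: write each unencodable character as <U+XXXX>."""
    if not isinstance(exc, UnicodeEncodeError):
        # This sketch only handles encoding errors; re-raise anything else.
        raise exc
    # exc.object is the string being converted; exc.start and exc.end
    # delimit the unencodable run; exc.encoding names the target encoding.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad)
    # Return the replacement text and the index where encoding continues.
    return replacement, exc.end


# Make the handler available under a name of our choosing.
codecs.register_error("hexmarker", hex_marker)

encoded = "caf\u00e9".encode("ascii", "hexmarker")
looked_up = codecs.lookup_error("hexmarker")

print(encoded)
print(looked_up is hex_marker)
```

Once registered, the name can be passed as the errors argument wherever ``strict'', ``ignore'', or ``replace'' would be accepted.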
511+
512+ Two additional error handlers have been implemented using this
513+ framework: ``backslashreplace'' uses Python backslash quoting to
514+ represent the unencodable character, and ``xmlcharrefreplace'' emits
515+ XML character references.
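
A short illustration of the two built-in handlers, assuming a current Python 3 interpreter and an illustrative sample string:

```python
# U+00E9 and U+20AC cannot be encoded as ASCII.
text = "caf\u00e9 \u20ac"

# Python backslash escapes: \xNN below U+0100, \uNNNN up to U+FFFF.
backslash_result = text.encode("ascii", "backslashreplace")

# XML numeric character references using the decimal code point.
xml_result = text.encode("ascii", "xmlcharrefreplace")

print(backslash_result)
print(xml_result)
```

Both outputs stay pure ASCII while preserving the identity of every original character.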
496516
497517\begin{seealso}
498518