| 1 | +\section{Standard Module \sectcode{formatter}} |
| 2 | +\stmodindex{formatter} |
| 3 | + |
| 4 | +\renewcommand{\indexsubitem}{(in module formatter)} |
| 5 | + |
| 6 | +This module supports two interface definitions, each with multiple |
| 7 | +implementations. The \emph{formatter} interface is used by the |
| 8 | +\code{HTMLParser} class of the \code{htmllib} module, and the |
| 9 | +\emph{writer} interface is required by the formatter interface. |
| 10 | + |
| 11 | +Formatter objects transform an abstract flow of formatting events into |
| 12 | +specific output events on writer objects. Formatters manage several |
| 13 | +stack structures to allow various properties of a writer object to be |
| 14 | +changed and restored; writers need not be able to handle relative |
| 15 | +changes nor any sort of ``change back'' operation. Specific writer |
| 16 | +properties which may be controlled via formatter objects are |
| 17 | +horizontal alignment, font, and left margin indentations. A mechanism |
| 18 | +is also provided for passing arbitrary, non-exclusive style |
| 19 | +settings to a writer. Additional interfaces facilitate |
| 20 | +formatting events which are not reversible, such as paragraph |
| 21 | +separation. |
| 22 | + |
| 23 | +Writer objects encapsulate device interfaces. Abstract devices, such |
| 24 | +as file formats, are supported as well as physical devices. The |
| 25 | +provided implementations all work with abstract devices. The |
| 26 | +interface makes available mechanisms for setting the properties which |
| 27 | +formatter objects manage and inserting data into the output. |
| 28 | + |
| 29 | + |
| 30 | +\subsection{The Formatter Interface} |
| 31 | + |
| 32 | +Interfaces to create formatters are dependent on the specific |
| 33 | +formatter class being instantiated. The interfaces described below |
| 34 | +are the required interfaces which all formatters must support once |
| 35 | +initialized. |
| 36 | + |
| 37 | +One data element is defined at the module level: |
| 38 | + |
| 39 | +\begin{datadesc}{AS_IS} |
| 40 | +Value which can be used in the font specification passed to the |
| 41 | +\code{push_font()} method described below, or as the new value to any |
| 42 | +other \code{push_\var{property}()} method. Pushing the \code{AS_IS} |
| 43 | +value allows the corresponding \code{pop_\var{property}()} method to |
| 44 | +be called without having to track whether the property was changed. |
| 45 | +\end{datadesc} |
| 46 | + |
| 47 | +The following attributes are defined for formatter instance objects: |
| 48 | + |
| 49 | +\begin{datadesc}{writer} |
| 50 | +The writer instance with which the formatter interacts. |
| 51 | +\end{datadesc} |
| 52 | + |
| 53 | + |
| 54 | +\begin{funcdesc}{end_paragraph}{blanklines} |
| 55 | +Close any open paragraphs and insert at least \code{blanklines} blank |
| 56 | +lines before the next paragraph. |
| 57 | +\end{funcdesc} |
| 58 | + |
| 59 | +\begin{funcdesc}{add_line_break}{} |
| 60 | +Add a hard line break if one does not already exist. This does not |
| 61 | +break the logical paragraph. |
| 62 | +\end{funcdesc} |
| 63 | + |
| 64 | +\begin{funcdesc}{add_hor_rule}{*args\, **kw} |
| 65 | +Insert a horizontal rule in the output. A hard break is inserted if |
| 66 | +there is data in the current paragraph, but the logical paragraph is |
| 67 | +not broken. The arguments and keywords are passed on to the writer's |
| 68 | +\code{send_hor_rule()} method. |
| 69 | +\end{funcdesc} |
| 70 | + |
| 71 | +\begin{funcdesc}{add_flowing_data}{data} |
| 72 | +Provide data which should be formatted with collapsed whitespace. |
| 73 | +Whitespace from preceding and successive calls to |
| 74 | +\code{add_flowing_data()} is considered as well when the whitespace |
| 75 | +collapse is performed. The data which is passed to this method is |
| 76 | +expected to be word-wrapped by the output device. Note that any |
| 77 | +word-wrapping still must be performed by the writer object due to the |
| 78 | +need to rely on device and font information. |
| 79 | +\end{funcdesc} |
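The whitespace handling described for \code{add_flowing_data()} can be illustrated with a small stand-alone sketch. It does not use the \code{formatter} module; the class \code{FlowCollapser} and its attributes \code{chunks}, \code{softspace}, and \code{at_start} are hypothetical names chosen for this example, and an actual formatter may track more state than is shown here.

```python
class FlowCollapser:
    """Hypothetical helper showing whitespace collapse across calls."""

    def __init__(self):
        self.chunks = []        # what a writer would receive, in order
        self.softspace = False  # a space is pending from an earlier call
        self.at_start = True    # nothing has been emitted yet

    def add_flowing_data(self, data):
        if not data:
            return
        leading = data[:1].isspace()
        trailing = data[-1:].isspace()
        # Collapse every run of whitespace inside the data to one space.
        text = " ".join(data.split())
        if not text:
            # Whitespace-only data: remember it, unless nothing precedes it.
            if not self.at_start:
                self.softspace = True
            return
        # Whitespace from this call or a previous one becomes a single space.
        if (leading or self.softspace) and not self.at_start:
            text = " " + text
        self.chunks.append(text)
        self.softspace = trailing
        self.at_start = False

    def flush_softspace(self):
        # Emit the pending space, if any, before touching the writer directly.
        if self.softspace:
            self.chunks.append(" ")
            self.softspace = False


collapser = FlowCollapser()
collapser.add_flowing_data("  Hello   \n world  ")
collapser.add_flowing_data("\t")
collapser.add_flowing_data("again")
print("".join(collapser.chunks))
```

Trailing whitespace is held back rather than emitted at once, which is why a separate flush step is needed before the writer is manipulated directly.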
| 80 | + |
| 81 | +\begin{funcdesc}{add_literal_data}{data} |
| 82 | +Provide data which should be passed to the writer unchanged. |
| 83 | +Whitespace, including newline and tab characters, is considered legal |
| 84 | +in the value of \code{data}. |
| 85 | +\end{funcdesc} |
| 86 | + |
| 87 | +\begin{funcdesc}{add_label_data}{format, counter} |
| 88 | +Insert a label which should be placed to the left of the current left |
| 89 | +margin. This should be used for constructing bulleted or numbered |
| 90 | +lists. If the \code{format} value is a string, it is interpreted as a |
| 91 | +format specification for \code{counter}, which should be an integer. |
| 92 | +The result of this formatting becomes the value of the label; if |
| 93 | +\code{format} is not a string it is used as the label value directly. |
| 94 | +The label value is passed as the only argument to the writer's |
| 95 | +\code{send_label_data()} method. Interpretation of non-string label |
| 96 | +values is dependent on the associated writer. |
| 97 | + |
| 98 | +Format specifications are strings which, in combination with a counter |
| 99 | +value, are used to compute label values. Each character in the format |
| 100 | +string is copied to the label value, with some characters recognized |
| 101 | +to indicate a transform on the counter value. Specifically, the |
| 102 | +character ``\code{1}'' represents the counter value formatted as an |
| 103 | +Arabic number, the characters ``\code{A}'' and ``\code{a}'' represent |
| 104 | +alphabetic representations of the counter value in upper and lower |
| 105 | +case, respectively, and ``\code{I}'' and ``\code{i}'' represent the |
| 106 | +counter value in Roman numerals, in upper and lower case. Note that |
| 107 | +the alphabetic and Roman transforms require that the counter value be |
| 108 | +greater than zero. |
| 109 | +\end{funcdesc} |
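The format specification rules above can be sketched as a stand-alone function. This is an illustration only, not the module's implementation: the names \code{format_label()}, \code{to_letters()}, and \code{to_roman()} are hypothetical, and raising \code{ValueError} for non-positive counters is a choice made for this sketch (the text only states that such counters are not supported by the alphabetic and Roman transforms).

```python
ROMAN_NUMERALS = [
    (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
    (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
    (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
]


def to_letters(counter, upper):
    # Alphabetic numbering: 1 -> a, 26 -> z, 27 -> aa, and so on.
    base = ord("A") if upper else ord("a")
    letters = ""
    while counter > 0:
        counter, digit = divmod(counter - 1, 26)
        letters = chr(base + digit) + letters
    return letters


def to_roman(counter, upper):
    # Greedy conversion using the subtractive pairs listed above.
    numeral = ""
    for value, symbol in ROMAN_NUMERALS:
        while counter >= value:
            numeral += symbol
            counter -= value
    return numeral.upper() if upper else numeral


def format_label(fmt, counter):
    # Non-string formats are used as the label value directly.
    if not isinstance(fmt, str):
        return fmt
    parts = []
    for ch in fmt:
        if ch == "1":
            parts.append(str(counter))
        elif ch in "AaIi":
            if counter <= 0:
                # Assumption of this sketch: reject unsupported counters.
                raise ValueError("counter must be greater than zero")
            convert = to_letters if ch in "Aa" else to_roman
            parts.append(convert(counter, ch.isupper()))
        else:
            parts.append(ch)  # every other character is copied unchanged
    return "".join(parts)


for fmt, counter in [("1.", 3), ("(a)", 27), ("I.", 14), ("*", 5)]:
    print(repr(fmt), counter, "->", format_label(fmt, counter))
```

Because every occurrence of a transform character is replaced, a format such as \code{'Item 1'} would also transform its leading ``\code{I}''; formats are best kept to punctuation plus a single transform character.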
| 110 | + |
| 111 | +\begin{funcdesc}{flush_softspace}{} |
| 112 | +Send any pending whitespace buffered from a previous call to |
| 113 | +\code{add_flowing_data()} to the associated writer object. This |
| 114 | +should be called before any direct manipulation of the writer object. |
| 115 | +\end{funcdesc} |
| 116 | + |
| 117 | +\begin{funcdesc}{push_alignment}{align} |
| 118 | +Push a new alignment setting onto the alignment stack. This may be |
| 119 | +\code{AS_IS} if no change is desired. If the alignment value is |
| 120 | +changed from the previous setting, the writer's \code{new_alignment()} |
| 121 | +method is called with the \code{align} value. |
| 122 | +\end{funcdesc} |
| 123 | + |
| 124 | +\begin{funcdesc}{pop_alignment}{} |
| 125 | +Restore the previous alignment. |
| 126 | +\end{funcdesc} |
| 127 | + |
| 128 | +\begin{funcdesc}{push_font}{(size, italic, bold, teletype)} |
| 129 | +Change some or all font properties of the writer object. Properties |
| 130 | +which are not set to \code{AS_IS} are set to the values passed in |
| 131 | +while others are maintained at their current settings. The writer's |
| 132 | +\code{new_font()} method is called with the fully resolved font |
| 133 | +specification. |
| 134 | +\end{funcdesc} |
| 135 | + |
| 136 | +\begin{funcdesc}{pop_font}{} |
| 137 | +Restore the previous font. |
| 138 | +\end{funcdesc} |
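How \code{AS_IS} entries in a font specification are resolved against the current font can be shown with a stand-alone sketch. The class \code{FontStack}, its \code{sent} list, and the sentinel object used for \code{AS_IS} are assumptions made for this example; the list merely records what a writer's \code{new_font()} method would receive.

```python
AS_IS = object()  # stand-in sentinel for this sketch only


class FontStack:
    """Hypothetical model of the push_font()/pop_font() pair."""

    def __init__(self):
        self.stack = []  # fully resolved font tuples
        self.sent = []   # values a writer's new_font() would be given

    def push_font(self, font):
        if self.stack:
            current = self.stack[-1]
            # Fields given as AS_IS keep their value from the current font.
            font = tuple(
                cur if new is AS_IS else new
                for new, cur in zip(font, current)
            )
        else:
            font = tuple(font)
        self.stack.append(font)
        self.sent.append(font)

    def pop_font(self):
        if self.stack:
            self.stack.pop()
        # With an empty stack the writer falls back to its default font.
        self.sent.append(self.stack[-1] if self.stack else None)


fonts = FontStack()
fonts.push_font(("h1", 0, 1, 0))             # size, italic, bold, teletype
fonts.push_font((AS_IS, 1, AS_IS, AS_IS))    # only switch italics on
fonts.pop_font()
fonts.pop_font()
print(fonts.sent)
```

Since every push adds exactly one entry, a caller can always pair a push with a pop without recording whether the font actually changed.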
| 139 | + |
| 140 | +\begin{funcdesc}{push_margin}{margin} |
| 141 | +Increase the number of left margin indentations by one, associating |
| 142 | +the logical tag \code{margin} with the new indentation. The initial |
| 143 | +margin level is \code{0}. Changed values of the logical tag must be |
| 144 | +true values; false values other than \code{AS_IS} are not sufficient |
| 145 | +to change the margin. |
| 146 | +\end{funcdesc} |
| 147 | + |
| 148 | +\begin{funcdesc}{pop_margin}{} |
| 149 | +Restore the previous margin. |
| 150 | +\end{funcdesc} |
| 151 | + |
| 152 | +\begin{funcdesc}{push_style}{*styles} |
| 153 | +Push any number of arbitrary style specifications. All styles are |
| 154 | +pushed onto the styles stack in order. A tuple representing the |
| 155 | +entire stack, including \code{AS_IS} values, is passed to the writer's |
| 156 | +\code{new_styles()} method. |
| 157 | +\end{funcdesc} |
| 158 | + |
| 159 | +\begin{funcdesc}{pop_style}{\optional{n\code{ = 1}}} |
| 160 | +Pop the last \code{n} style specifications passed to |
| 161 | +\code{push_style()}. A tuple representing the revised stack, |
| 162 | +including \code{AS_IS} values, is passed to the writer's |
| 163 | +\code{new_styles()} method. |
| 164 | +\end{funcdesc} |
| 165 | + |
| 166 | +\begin{funcdesc}{set_spacing}{spacing} |
| 167 | +Set the spacing style for the writer. |
| 168 | +\end{funcdesc} |
| 169 | + |
| 170 | +\begin{funcdesc}{assert_line_data}{\optional{flag\code{ = 1}}} |
| 171 | +Inform the formatter that data has been added to the current paragraph |
| 172 | +out-of-band. This should be used when the writer has been manipulated |
| 173 | +directly. The optional \code{flag} argument can be set to false if |
| 174 | +the writer manipulations produced a hard line break at the end of the |
| 175 | +output. |
| 176 | +\end{funcdesc} |
| 177 | + |
| 178 | + |
| 179 | +\subsection{Formatter Implementations} |
| 180 | + |
| 181 | +\begin{funcdesc}{NullFormatter}{\optional{writer\code{ = None}}} |
| 182 | +A formatter which does nothing. If \code{writer} is omitted, a |
| 183 | +\code{NullWriter} instance is created. No methods of the writer are |
| 184 | +called by \code{NullFormatter} instances. |
| 185 | +\end{funcdesc} |
| 186 | + |
| 187 | +\begin{funcdesc}{AbstractFormatter}{writer} |
| 188 | +The standard formatter. This implementation has demonstrated wide |
| 189 | +applicability to many writers, and may be used directly in most |
| 190 | +circumstances. |
| 191 | +\end{funcdesc} |
| 192 | + |
| 193 | + |
| 194 | + |
| 195 | +\subsection{The Writer Interface} |
| 196 | + |
| 197 | +Interfaces to create writers are dependent on the specific writer |
| 198 | +class being instantiated. The interfaces described below are the |
| 199 | +required interfaces which all writers must support once initialized. |
| 200 | +Note that while most applications can use the \code{AbstractFormatter} |
| 201 | +class as a formatter, the writer must typically be provided by the |
| 202 | +application. |
| 203 | + |
| 204 | +\begin{funcdesc}{new_alignment}{align} |
| 205 | +Set the alignment style. The \code{align} value can be any object, |
| 206 | +but by convention is a string or \code{None}, where \code{None} |
| 207 | +indicates that the writer's ``preferred'' alignment should be used. |
| 208 | +Conventional \code{align} values are \code{'left'}, \code{'center'}, |
| 209 | +\code{'right'}, and \code{'justify'}. |
| 210 | +\end{funcdesc} |
| 211 | + |
| 212 | +\begin{funcdesc}{new_font}{font} |
| 213 | +Set the font style. The value of \code{font} will be \code{None}, |
| 214 | +indicating that the device's default font should be used, or a tuple |
| 215 | +of the form (\var{size}, \var{italic}, \var{bold}, \var{teletype}). |
| 216 | +Size will be a string indicating the size of font that should be used; |
| 217 | +specific strings and their interpretation must be defined by the |
| 218 | +application. The \var{italic}, \var{bold}, and \var{teletype} values |
| 219 | +are boolean indicators specifying which of those font attributes |
| 220 | +should be used. |
| 221 | +\end{funcdesc} |
| 222 | + |
| 223 | +\begin{funcdesc}{new_margin}{margin, level} |
| 224 | +Set the margin level to the integer \code{level} and the logical tag |
| 225 | +to \code{margin}. Interpretation of the logical tag is at the |
| 226 | +writer's discretion; the only restriction on the value of the logical |
| 227 | +tag is that it not be a false value for non-zero values of |
| 228 | +\code{level}. |
| 229 | +\end{funcdesc} |
| 230 | + |
| 231 | +\begin{funcdesc}{new_spacing}{spacing} |
| 232 | +Set the spacing style to \code{spacing}. |
| 233 | +\end{funcdesc} |
| 234 | + |
| 235 | +\begin{funcdesc}{new_styles}{styles} |
| 236 | +Set additional styles. The \code{styles} value is a tuple of |
| 237 | +arbitrary values; the value \code{AS_IS} should be ignored. The |
| 238 | +\code{styles} tuple may be interpreted either as a set or as a stack |
| 239 | +depending on the requirements of the application and writer |
| 240 | +implementation. |
| 241 | +\end{funcdesc} |
| 242 | + |
| 243 | +\begin{funcdesc}{send_line_break}{} |
| 244 | +Break the current line. |
| 245 | +\end{funcdesc} |
| 246 | + |
| 247 | +\begin{funcdesc}{send_paragraph}{blankline} |
| 248 | +Produce a paragraph separation of at least \code{blankline} blank |
| 249 | +lines, or the equivalent. The \code{blankline} value will be an |
| 250 | +integer. |
| 251 | +\end{funcdesc} |
| 252 | + |
| 253 | +\begin{funcdesc}{send_hor_rule}{*args\, **kw} |
| 254 | +Display a horizontal rule on the output device. The arguments to this |
| 255 | +method are entirely application- and writer-specific, and should be |
| 256 | +interpreted with care. The method implementation may assume that a |
| 257 | +line break has already been issued via \code{send_line_break()}. |
| 258 | +\end{funcdesc} |
| 259 | + |
| 260 | +\begin{funcdesc}{send_flowing_data}{data} |
| 261 | +Output character data which may be word-wrapped and re-flowed as |
| 262 | +needed. Within any sequence of calls to this method, the writer may |
| 263 | +assume that spans of multiple whitespace characters have been |
| 264 | +collapsed to single space characters. |
| 265 | +\end{funcdesc} |
| 266 | + |
| 267 | +\begin{funcdesc}{send_literal_data}{data} |
| 268 | +Output character data which has already been formatted |
| 269 | +for display. Generally, this should be interpreted to mean that line |
| 270 | +breaks indicated by newline characters should be preserved and no new |
| 271 | +line breaks should be introduced. The data may contain embedded |
| 272 | +newline and tab characters, unlike data provided to the |
| 273 | +\code{send_flowing_data()} interface. |
| 274 | +\end{funcdesc} |
| 275 | + |
| 276 | +\begin{funcdesc}{send_label_data}{data} |
| 277 | +Set \code{data} to the left of the current left margin, if possible. |
| 278 | +The value of \code{data} is not restricted; treatment of non-string |
| 279 | +values is entirely application- and writer-dependent. This method |
| 280 | +will only be called at the beginning of a line. |
| 281 | +\end{funcdesc} |
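Since applications usually supply their own writer, a skeleton covering the methods listed above may be a useful starting point. The following is a minimal sketch, assuming nothing beyond the method names and arguments documented in this section; the class name \code{RecordingWriter} and its \code{events} list are hypothetical, and the methods only record their arguments rather than drive a device.

```python
class RecordingWriter:
    """Hypothetical writer skeleton that records each call it receives."""

    def __init__(self):
        self.events = []  # (method name, arguments...) tuples, in call order

    # Property changes managed by the formatter.
    def new_alignment(self, align):
        self.events.append(("new_alignment", align))

    def new_font(self, font):
        self.events.append(("new_font", font))

    def new_margin(self, margin, level):
        self.events.append(("new_margin", margin, level))

    def new_spacing(self, spacing):
        self.events.append(("new_spacing", spacing))

    def new_styles(self, styles):
        self.events.append(("new_styles", styles))

    # Output events.
    def send_line_break(self):
        self.events.append(("send_line_break",))

    def send_paragraph(self, blankline):
        self.events.append(("send_paragraph", blankline))

    def send_hor_rule(self, *args, **kw):
        self.events.append(("send_hor_rule", args, kw))

    def send_flowing_data(self, data):
        self.events.append(("send_flowing_data", data))

    def send_literal_data(self, data):
        self.events.append(("send_literal_data", data))

    def send_label_data(self, data):
        self.events.append(("send_label_data", data))


writer = RecordingWriter()
writer.new_font(("h1", 0, 1, 0))
writer.send_flowing_data("Heading text")
writer.send_paragraph(1)
for event in writer.events:
    print(event)
```

A real writer would replace the bodies with device-specific output; the recorded event list is convenient for checking what a formatter sends.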
| 282 | + |
| 283 | + |
| 284 | +\subsection{Writer Implementations} |
| 285 | + |
| 286 | +\begin{funcdesc}{NullWriter}{} |
| 287 | +A writer which only provides the interface definition; no actions are |
| 288 | +taken on any methods. This should be the base class for all writers |
| 289 | +which do not need to inherit any implementation methods. |
| 290 | +\end{funcdesc} |
| 291 | + |
| 292 | +\begin{funcdesc}{AbstractWriter}{} |
| 293 | +A writer which can be used in debugging formatters, but not much |
| 294 | +else. Each method simply announces itself by printing its name and |
| 295 | +arguments on standard output. |
| 296 | +\end{funcdesc} |
| 297 | + |
| 298 | +\begin{funcdesc}{DumbWriter}{\optional{file\code{ = None}\optional{\, maxcol\code{ = 72}}}} |
| 299 | +Simple writer class which writes output on the file object passed in |
| 300 | +as \code{file} or, if \code{file} is omitted, on standard output. The |
| 301 | +output is simply word-wrapped to the number of columns specified by |
| 302 | +\code{maxcol}. This class is suitable for reflowing a sequence of |
| 303 | +paragraphs. |
| 304 | +\end{funcdesc} |