@@ -306,3 +306,326 @@ added matters. To illustrate::
306306 ``7bit``, non-ascii binary data is CTE encoded using the ``unknown-8bit``
307307 charset. Otherwise the original source header is used, with its existing
308308 line breaks and any (RFC invalid) binary data it may contain.
309+
310+
311+ .. note::
312+
313+ The remainder of the classes documented below are included in the standard
314+ library on a :term:`provisional basis <provisional package>`. Backwards
315+ incompatible changes (up to and including removal of the feature) may occur
316+ if deemed necessary by the core developers.
317+
318+
319+ .. class:: EmailPolicy(**kw)
320+
321+ This concrete :class:`Policy` provides behavior that is intended to be fully
322+ compliant with the current email RFCs. These include (but are not limited
323+ to) :rfc:`5322`, :rfc:`2047`, and the current MIME RFCs.
324+
325+ This policy adds new header parsing and folding algorithms. Instead of
326+ simple strings, headers are custom objects with custom attributes depending
327+ on the type of the field. The parsing and folding algorithms fully implement
328+ :rfc:`2047` and :rfc:`5322`.
329+
330+ In addition to the settable attributes listed above that apply to all
331+ policies, this policy adds the following attributes:
332+
333+ .. attribute:: refold_source
334+
335+ If the value for a header in the ``Message`` object originated from a
336+ :mod:`~email.parser` (as opposed to being set by a program), this
337+ attribute indicates whether or not a generator should refold that value
338+ when transforming the message back into stream form. The possible values
339+ are:
340+
341+ ======== ===============================================================
342+ ``none`` all source values use original folding
343+
344+ ``long`` source values that have any line that is longer than
345+          ``max_line_length`` will be refolded
346+
347+ ``all``  all values are refolded.
348+ ======== ===============================================================
349+
350+ The default is ``long``.
351+
352+ .. attribute:: header_factory
353+
354+ A callable that takes two arguments, ``name`` and ``value``, where
355+ ``name`` is a header field name and ``value`` is an unfolded header field
356+ value, and returns a string-like object that represents that header. A
357+ default ``header_factory`` is provided that understands some of the
358+ :rfc:`5322` header field types. (Currently address fields and date
359+ fields have special treatment, while all other fields are treated as
360+ unstructured. This list will be completed before the extension is marked
361+ stable.)
362+
363+ The class provides the following concrete implementations of the abstract
364+ methods of :class:`Policy`:
365+
366+ .. method:: header_source_parse(sourcelines)
367+
368+ The implementation of this method is the same as that for the
369+ :class:`Compat32` policy.
370+
371+ .. method:: header_store_parse(name, value)
372+
373+ The name is returned unchanged. If the input value has a ``name``
374+ attribute and it matches *name* ignoring case, the value is returned
375+ unchanged. Otherwise the *name* and *value* are passed to
376+ ``header_factory``, and the resulting custom header object is returned as
377+ the value. In this case a ``ValueError`` is raised if the input value
378+ contains CR or LF characters.
379+
380+ .. method:: header_fetch_parse(name, value)
381+
382+ If the value has a ``name`` attribute, it is returned unmodified.
383+ Otherwise the *name*, and the *value* with any CR or LF characters
384+ removed, are passed to the ``header_factory``, and the resulting custom
385+ header object is returned. Any surrogateescaped bytes get turned into
386+ the unicode unknown-character glyph.
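
As a rough illustration of the two parsing hooks described above, the following sketch calls them directly on ``policy.default``. It assumes a current Python release, where ``header_store_parse`` returns a ``(name, value)`` tuple; the header values are made up for the example, and applications would normally reach these hooks through the ``Message`` API rather than calling them by hand.

```python
from email import policy

p = policy.default

# Storing: a plain string is turned into a custom header object.
name, stored = p.header_store_parse("Subject", "Hello")

# Storing a value that contains CR or LF characters is rejected.
try:
    p.header_store_parse("Subject", "Hello\r\nBcc: x@example.com")
    rejected = False
except ValueError:
    rejected = True

# Fetching: CR and LF are removed from a source value before it is parsed.
fetched = p.header_fetch_parse("Subject", "Hello\r\n world")
```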
387+
388+ .. method:: fold(name, value)
389+
390+ Header folding is controlled by the :attr:`refold_source` policy setting.
391+ A value is considered to be a 'source value' if and only if it does not
392+ have a ``name`` attribute (having a ``name`` attribute means it is a
393+ header object of some sort). If a source value needs to be refolded
394+ according to the policy, it is converted into a custom header object by
395+ passing the *name* and the *value* with any CR and LF characters removed
396+ to the ``header_factory``. Folding of a custom header object is done by
397+ calling its ``fold`` method with the current policy.
398+
399+ Source values are split into lines using :meth:`~str.splitlines`. If
400+ the value is not to be refolded, the lines are rejoined using the
401+ ``linesep`` from the policy and returned. The exception is lines
402+ containing non-ascii binary data. In that case the value is refolded
403+ regardless of the ``refold_source`` setting, which causes the binary data
404+ to be CTE encoded using the ``unknown-8bit`` charset.
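
The effect of ``refold_source`` on a source value can be sketched as follows. This is only an illustration under assumptions: the header value and the ``max_line_length`` of 40 are arbitrary choices for the example, and ``fold`` is called directly here even though a generator would normally do so.

```python
from email import policy

# One unfolded source line, far longer than the limit chosen below.
long_value = " ".join(["word"] * 30)

# refold_source="none": the source value keeps its original single line.
keep = policy.default.clone(refold_source="none", max_line_length=40)
kept = keep.fold("Subject", long_value)

# Default refold_source="long": the over-long line is refolded.
refold = policy.default.clone(max_line_length=40)
folded = refold.fold("Subject", long_value)
```

Because plain strings have no ``name`` attribute, both calls treat ``long_value`` as a source value; only the second policy converts it into a header object and folds it.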
405+
406+ .. method:: fold_binary(name, value)
407+
408+ The same as :meth:`fold` if :attr:`cte_type` is ``7bit``, except that
409+ the returned value is bytes.
410+
411+ If :attr:`cte_type` is ``8bit``, non-ASCII binary data is converted back
412+ into bytes. Headers with binary data are not refolded, regardless of the
413+ ``refold_source`` setting, since there is no way to know whether the
414+ binary data consists of single byte characters or multibyte characters.
415+
416+ The following instances of :class:`EmailPolicy` provide defaults suitable for
417+ specific application domains. Note that in the future the behavior of these
418+ instances (in particular the ``HTTP`` instance) may be adjusted to conform even
419+ more closely to the RFCs relevant to their domains.
420+
421+ .. data:: default
422+
423+ An instance of ``EmailPolicy`` with all defaults unchanged. This policy
424+ uses the standard Python ``\n`` line endings rather than the RFC-correct
425+ ``\r\n``.
426+
427+ .. data:: SMTP
428+
429+ Suitable for serializing messages in conformance with the email RFCs.
430+ Like ``default``, but with ``linesep`` set to ``\r\n``, which is RFC
431+ compliant.
432+
433+ .. data:: HTTP
434+
435+ Suitable for serializing headers for use in HTTP traffic. Like
436+ ``SMTP`` except that ``max_line_length`` is set to ``None`` (unlimited).
437+
438+ .. data:: strict
439+
440+ Convenience instance. The same as ``default`` except that
441+ ``raise_on_defect`` is set to ``True``. This allows any policy to be made
442+ strict by writing::
443+
444+ somepolicy + policy.strict
445+
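The differences between the predefined instances can be checked directly. The sketch below is only illustrative and assumes a current Python release; the ``settings`` dictionary and the ``strict_smtp`` name are introduced here for the example.

```python
from email import policy

# Line separator and maximum line length of each predefined instance.
settings = {
    "default": (policy.default.linesep, policy.default.max_line_length),
    "SMTP": (policy.SMTP.linesep, policy.SMTP.max_line_length),
    "HTTP": (policy.HTTP.linesep, policy.HTTP.max_line_length),
}

# Adding policy.strict turns on raise_on_defect while keeping the other
# settings of the left-hand policy.
strict_smtp = policy.SMTP + policy.strict
```
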
446+ With all of these :class:`EmailPolicies <.EmailPolicy>`, the effective API of
447+ the email package is changed from the Python 3.2 API in the following ways:
448+
449+ * Setting a header on a :class:`~email.message.Message` results in that
450+ header being parsed and a custom header object created.
451+
452+ * Fetching a header value from a :class:`~email.message.Message` results
453+ in that header being parsed and a custom header object created and
454+ returned.
455+
456+ * Any custom header object, or any header that is refolded due to the
457+ policy settings, is folded using an algorithm that fully implements the
458+ RFC folding algorithms, including knowing where encoded words are required
459+ and allowed.
460+
461+ From the application view, this means that any header obtained through the
462+ :class:`~email.message.Message` is a custom header object with custom
463+ attributes, whose string value is the fully decoded unicode value of the
464+ header. Likewise, a header may be assigned a new value, or a new header
465+ created, using a unicode string, and the policy will take care of converting
466+ the unicode string into the correct RFC encoded form.
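
From the application's point of view, the round trip described above might look like the following sketch. The subject text is made up for the example, and the snippet assumes a current Python release in which ``Message`` accepts a ``policy`` keyword.

```python
from email.message import Message
from email import policy

msg = Message(policy=policy.default)
msg["Subject"] = "Café menu"    # assigned as ordinary unicode text

subject = msg["Subject"]        # a custom header object (a str subclass)
decoded = str(subject)          # the fully decoded unicode value

wire = msg.as_string()          # serialized form, with the header RFC encoded
```

The non-ASCII character is visible in ``decoded`` but does not appear literally in ``wire``, since the policy encodes it when the message is serialized.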
467+
468+ The custom header objects and their attributes are described below. All custom
469+ header objects are string subclasses, and their string value is the fully
470+ decoded value of the header field (the part of the field after the ``:``).
471+
472+
473+ .. class:: BaseHeader
474+
475+ This is the base class for all custom header objects. It provides the
476+ following attributes:
477+
478+ .. attribute:: name
479+
480+ The header field name (the portion of the field before the ':').
481+
482+ .. attribute:: defects
483+
484+ A possibly empty list of :class:`~email.errors.MessageDefect` objects
485+ that record any RFC violations found while parsing the header field.
486+
487+ .. method:: fold(*, policy)
488+
489+ Return a string containing :attr:`~email.policy.Policy.linesep`
490+ characters as required to correctly fold the header according
491+ to *policy*. A :attr:`~email.policy.Policy.cte_type` of
492+ ``8bit`` will be treated as if it were ``7bit``, since strings
493+ may not contain binary data.
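
A minimal sketch of these attributes, assuming a current Python release and using the default ``header_factory`` to build a header object directly (the variable names are chosen for the example):

```python
from email import policy

header = policy.default.header_factory("Subject", "Hello")

field_name = header.name                      # the field name
problems = list(header.defects)               # empty for this well-formed value
folded = header.fold(policy=policy.default)   # complete header line with linesep
```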
494+
495+
496+ .. class:: UnstructuredHeader
497+
498+ The class used for any header that does not have a more specific
499+ type. (The :mailheader:`Subject` header is an example of an
500+ unstructured header.) It does not have any additional attributes.
501+
502+
503+ .. class:: DateHeader
504+
505+ The value of this type of header is a single date and time value. The
506+ primary example of this type of header is the :mailheader:`Date` header.
507+
508+ .. attribute:: datetime
509+
510+ A :class:`~datetime.datetime` encoding the date and time from the
511+ header value.
512+
513+ The ``datetime`` will be a naive ``datetime`` if the value either does
514+ not have a specified timezone (which would be a violation of the RFC) or
515+ if the timezone is specified as ``-0000``. This timezone value indicates
516+ that the date and time is to be considered to be in UTC, but with no
517+ indication of the local timezone in which it was generated. (This
518+ contrasts with ``+0000``, which indicates a date and time that really is in
519+ the UTC ``0000`` timezone.)
520+
521+ If the header value contains a valid timezone that is not ``-0000``, the
522+ ``datetime`` will be an aware ``datetime`` having a
523+ :class:`~datetime.tzinfo` set to the :class:`~datetime.timezone`
524+ indicated by the header value.
525+
526+ A ``datetime`` may also be assigned to a :mailheader:`Date` type header.
527+ The resulting string value will use a timezone of ``-0000`` if the
528+ ``datetime`` is naive, and the appropriate UTC offset if the ``datetime`` is
529+ aware.
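
The naive and aware cases can be sketched as follows. The dates are arbitrary, ``make_header`` is just a local alias introduced for the example, and the snippet assumes a current Python release.

```python
import datetime
from email import policy

make_header = policy.default.header_factory

# "-0000" carries no information about the local zone: the result is naive.
naive = make_header("Date", "Fri, 09 Nov 2001 01:08:47 -0000").datetime

# Any other valid offset yields an aware datetime with that UTC offset.
aware = make_header("Date", "Fri, 09 Nov 2001 01:08:47 +0200").datetime

# A naive datetime assigned to a Date header is written out with "-0000".
from_naive = make_header("Date", datetime.datetime(2001, 11, 9, 1, 8, 47))
```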
530+
531+
532+ .. class:: AddressHeader
533+
534+ This class is used for all headers that can contain addresses, whether they
535+ are supposed to be singleton addresses or a list.
536+
537+ .. attribute:: addresses
538+
539+ A list of :class:`.Address` objects listing all of the addresses that
540+ could be parsed out of the field value.
541+
542+ .. attribute:: groups
543+
544+ A list of :class:`.Group` objects. Every address in :attr:`.addresses`
545+ appears in one of the group objects in the list. Addresses that are not
546+ syntactically part of a group are represented by ``Group`` objects whose
547+ ``name`` is ``None``.
548+
549+ In addition to addresses in string form, any combination of
550+ :class:`.Address` and :class:`.Group` objects, singly or in a list, may be
551+ assigned to an address header.
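
The relationship between ``addresses`` and ``groups`` might be illustrated like this. The addresses are invented, and importing ``Address`` from ``email.headerregistry`` is an assumption about current Python releases rather than something this section states.

```python
from email import policy
from email.headerregistry import Address  # assumed import location

to = policy.default.header_factory(
    "To", "Ann <ann@example.com>, Team: bo@example.com;"
)

# Every parsed address, whether or not it sits inside a group.
addr_specs = [a.addr_spec for a in to.addresses]

# The ungrouped address is wrapped in a group whose display name is None.
group_names = [g.display_name for g in to.groups]

# Address objects may be assigned instead of a string.
cc = policy.default.header_factory("Cc", [Address("Cy", "cy", "example.com")])
```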
552+
553+
554+ .. class:: Address(display_name='', username='', domain='', addr_spec=None)
555+
556+ The class used to represent an email address. The general form of an
557+ address is::
558+
559+ [display_name] <username@domain>
560+
561+ or::
562+
563+ username@domain
564+
565+ where each part must conform to specific syntax rules spelled out in
566+ :rfc:`5322`.
567+
568+ As a convenience *addr_spec* can be specified instead of *username* and
569+ *domain*, in which case *username* and *domain* will be parsed from the
570+ *addr_spec*. An *addr_spec* must be a properly RFC quoted string; if it is
571+ not, ``Address`` will raise an error. Unicode characters are allowed and
572+ will be properly encoded when serialized. However, per the RFCs, unicode is
573+ *not* allowed in the username portion of the address.
574+
575+ .. attribute:: display_name
576+
577+ The display name portion of the address, if any, with all quoting
578+ removed. If the address does not have a display name, this attribute
579+ will be an empty string.
580+
581+ .. attribute:: username
582+
583+ The ``username`` portion of the address, with all quoting removed.
584+
585+ .. attribute:: domain
586+
587+ The ``domain`` portion of the address.
588+
589+ .. attribute:: addr_spec
590+
591+ The ``username@domain`` portion of the address, correctly quoted
592+ for use as a bare address (the second form shown above). This
593+ attribute is not mutable.
594+
595+ .. method:: __str__()
596+
597+ The ``str`` value of the object is the address quoted according to
598+ :rfc:`5322` rules, but with no Content Transfer Encoding of any non-ASCII
599+ characters.
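
Both ways of constructing an ``Address`` can be sketched as follows. The names and domains are placeholders, and the import from ``email.headerregistry`` is an assumption about where current Python releases expose the class.

```python
from email.headerregistry import Address  # assumed import location

# Built from separate parts: serializes in the "display_name <addr>" form.
full = Address(display_name="Jane Doe", username="jane", domain="example.com")

# Built from an addr_spec: username and domain are parsed out of it.
bare = Address(addr_spec="bob@example.org")
```

Here ``str(full)`` uses the first general form shown above, while ``str(bare)`` uses the bare ``username@domain`` form because no display name was given.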
600+
601+
602+ .. class:: Group(display_name=None, addresses=None)
603+
604+ The class used to represent an address group. The general form of an
605+ address group is::
606+
607+ display_name: [address-list];
608+
609+ As a convenience for processing lists of addresses that consist of a mixture
610+ of groups and single addresses, a ``Group`` may also be used to represent
611+ single addresses that are not part of a group by setting *display_name* to
612+ ``None`` and providing a list of the single address as *addresses*.
613+
614+ .. attribute:: display_name
615+
616+ The ``display_name`` of the group. If it is ``None`` and there is
617+ exactly one ``Address`` in ``addresses``, then the ``Group`` represents a
618+ single address that is not in a group.
619+
620+ .. attribute:: addresses
621+
622+ A possibly empty tuple of :class:`.Address` objects representing the
623+ addresses in the group.
624+
625+ .. method:: __str__()
626+
627+ The ``str`` value of a ``Group`` is formatted according to :rfc:`5322`,
628+ but with no Content Transfer Encoding of any non-ASCII characters. If
629+ ``display_name`` is ``None`` and there is a single ``Address`` in the
630+ ``addresses`` list, the ``str`` value will be the same as the ``str`` of
631+ that single ``Address``.
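
A short sketch of both uses of ``Group``, with made-up addresses and assuming the classes are importable from ``email.headerregistry`` as in current Python releases:

```python
from email.headerregistry import Address, Group  # assumed import location

ann = Address("Ann", "ann", "example.com")
bo = Address("Bo", "bo", "example.com")

# A named group serializes as "display_name: address-list;".
team = Group("Team", [ann, bo])

# With display_name None and one address, the group stands for that
# single, ungrouped address.
single = Group(None, [ann])
```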