Commit 7ec754b

#1078919: make add_header automatically do RFC2231 encoding when needed.
Also document the use of three-tuples if control of the charset and language is desired.
1 parent 796343b commit 7ec754b
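
A minimal sketch of the behavior the commit message describes, using only the standard library's `email.message.Message` and `email.utils.encode_rfc2231`. The helper name `content_disposition_for` is hypothetical and exists only for illustration. Whether the encoded value is wrapped in double quotes may differ between Python versions, so the sketch relies only on the encoded value itself, not on the full header string.

```python
from email.message import Message
from email.utils import encode_rfc2231


def content_disposition_for(filename):
    """Hypothetical helper: return the Content-Disposition header value."""
    msg = Message()
    # A plain string is passed; add_header decides whether RFC 2231
    # encoding is needed based on whether the value is pure ASCII.
    msg.add_header('Content-Disposition', 'attachment', filename=filename)
    return msg['Content-Disposition']


ascii_header = content_disposition_for('bud.gif')
non_ascii_header = content_disposition_for('Fußballer.ppt')

# The encoded form can also be produced directly with encode_rfc2231,
# which takes (value, charset, language).
encoded_utf8 = encode_rfc2231('Fußballer.ppt', 'utf-8', '')
encoded_latin1 = encode_rfc2231('Fußballer.ppt', 'iso-8859-1', '')

print(ascii_header)
print(non_ascii_header)
print(encoded_utf8)
print(encoded_latin1)
```

Passing a three-tuple instead of a plain string is the way to pick a charset other than utf-8, as the `iso-8859-1` call to `encode_rfc2231` suggests.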

4 files changed

Lines changed: 66 additions & 5 deletions


Doc/library/email.message.rst

Lines changed: 19 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -270,7 +270,15 @@ Here are the methods of the :class:`Message` class:
270270
taken as the parameter name, with underscores converted to dashes (since
271271
dashes are illegal in Python identifiers). Normally, the parameter will
272272
be added as ``key="value"`` unless the value is ``None``, in which case
273-
only the key will be added.
273+
only the key will be added. If the value contains non-ASCII characters,
274+
it can be specified as a three tuple in the format
275+
``(CHARSET, LANGUAGE, VALUE)``, where ``CHARSET`` is a string naming the
276+
charset to be used to encode the value, ``LANGUAGE`` can usually be set
277+
to ``None`` or the empty string (see :RFC:`2231` for other possibilities),
278+
and ``VALUE`` is the string value containing non-ASCII code points. If
279+
a three tuple is not passed and the value contains non-ASCII characters,
280+
it is automatically encoded in :RFC:`2231` format using a ``CHARSET``
281+
of ``utf-8`` and a ``LANGUAGE`` of ``None``.
274282

275283
Here's an example::
276284

@@ -280,6 +288,15 @@ Here are the methods of the :class:`Message` class:
280288

281289
Content-Disposition: attachment; filename="bud.gif"
282290

291+
An example with non-ASCII characters::
292+
293+
msg.add_header('Content-Disposition', 'attachment',
294+
filename=('iso-8859-1', '', 'Fußballer.ppt'))
295+
296+
Which produces ::
297+
298+
Content-Disposition: attachment; filename*="iso-8859-1''Fu%DFballer.ppt"
299+
283300

284301
.. method:: replace_header(_name, _value)
285302

@@ -369,7 +386,7 @@ Here are the methods of the :class:`Message` class:
369386
:rfc:`2231`, you can collapse the parameter value by calling
370387
:func:`email.utils.collapse_rfc2231_value`, passing in the return value
371388
from :meth:`get_param`. This will return a suitably decoded Unicode
372-
string whn the value is a tuple, or the original string unquoted if it
389+
string when the value is a tuple, or the original string unquoted if it
373390
isn't. For example::
374391

375392
rawparam = msg.get_param('foo')

Lib/email/message.py

Lines changed: 21 additions & 3 deletions
@@ -57,7 +57,11 @@ def _splitparam(param):
5757
def _formatparam(param, value=None, quote=True):
5858
"""Convenience function to format and return a key=value pair.
5959
60-
This will quote the value if needed or if quote is true.
60+
This will quote the value if needed or if quote is true. If value is a
61+
three tuple (charset, language, value), it will be encoded according
62+
to RFC2231 rules. If it contains non-ascii characters it will likewise
63+
be encoded according to RFC2231 rules, using the utf-8 charset and
64+
a null language.
6165
"""
6266
if value is not None and len(value) > 0:
6367
# A tuple is used for RFC 2231 encoded parameter values where items
@@ -67,6 +71,12 @@ def _formatparam(param, value=None, quote=True):
6771
# Encode as per RFC 2231
6872
param += '*'
6973
value = utils.encode_rfc2231(value[2], value[0], value[1])
74+
else:
75+
try:
76+
value.encode('ascii')
77+
except UnicodeEncodeError:
78+
param += '*'
79+
value = utils.encode_rfc2231(value, 'utf-8', '')
7080
# BAW: Please check this. I think that if quote is set it should
7181
# force quoting even if not necessary.
7282
if quote or tspecials.search(value):
@@ -438,11 +448,19 @@ def add_header(self, _name, _value, **_params):
438448
name is the header field to add. keyword arguments can be used to set
439449
additional parameters for the header field, with underscores converted
440450
to dashes. Normally the parameter will be added as key="value" unless
441-
value is None, in which case only the key will be added.
451+
value is None, in which case only the key will be added. If a
452+
parameter value contains non-ASCII characters it can be specified as a
453+
three-tuple of (charset, language, value), in which case it will be
454+
encoded according to RFC2231 rules. Otherwise it will be encoded using
455+
the utf-8 charset and a language of ''.
442456
443-
Example:
457+
Examples:
444458
445459
msg.add_header('content-disposition', 'attachment', filename='bud.gif')
460+
msg.add_header('content-disposition', 'attachment',
461+
filename=('utf-8', '', 'Fußballer.ppt'))
462+
msg.add_header('content-disposition', 'attachment',
463+
filename='Fußballer.ppt')
446464
"""
447465
parts = []
448466
for k, v in _params.items():

Lib/email/test/test_email.py

Lines changed: 23 additions & 0 deletions
@@ -510,6 +510,29 @@ def test_broken_base64_payload(self):
510510
self.assertEqual(msg.get_payload(decode=True),
511511
bytes(x, 'raw-unicode-escape'))
512512

513+
# Issue 1078919
514+
def test_ascii_add_header(self):
515+
msg = Message()
516+
msg.add_header('Content-Disposition', 'attachment',
517+
filename='bud.gif')
518+
self.assertEqual('attachment; filename="bud.gif"',
519+
msg['Content-Disposition'])
520+
521+
def test_noascii_add_header(self):
522+
msg = Message()
523+
msg.add_header('Content-Disposition', 'attachment',
524+
filename="Fußballer.ppt")
525+
self.assertEqual(
526+
'attachment; filename*="utf-8\'\'Fu%C3%9Fballer.ppt"',
527+
msg['Content-Disposition'])
528+
529+
def test_nonascii_add_header_via_triple(self):
530+
msg = Message()
531+
msg.add_header('Content-Disposition', 'attachment',
532+
filename=('iso-8859-1', '', 'Fußballer.ppt'))
533+
self.assertEqual(
534+
'attachment; filename*="iso-8859-1\'\'Fu%DFballer.ppt"',
535+
msg['Content-Disposition'])
513536

514537

515538
# Test the email.encoders module

Misc/NEWS

Lines changed: 3 additions & 0 deletions
@@ -11,6 +11,9 @@ What's New in Python 3.2 Beta 2?
1111
Library
1212
-------
1313

14+
- Issue #1078919: add_header now automatically RFC2231 encodes parameters
15+
that contain non-ascii values.
16+
1417
- Issue #10188 (partial resolution): tempfile.TemporaryDirectory emits
1518
a warning on sys.stderr rather than throwing a misleading exception
1619
if cleanup fails due to nulling out of modules during shutdown.
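
The documentation hunk for `get_param` mentions collapsing an RFC 2231 value with `email.utils.collapse_rfc2231_value`. The following is a rough sketch of that round trip for a header written by `add_header`, assuming the three-tuple form from the documentation example. The variable names are illustrative and not part of the commit.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value

msg = Message()
# Explicit (charset, language, value) three-tuple, as in the doc example.
msg.add_header('Content-Disposition', 'attachment',
               filename=('iso-8859-1', '', 'Fußballer.ppt'))

# For an RFC 2231 encoded parameter, get_param returns a three-tuple
# rather than a plain string.
rawparam = msg.get_param('filename', header='content-disposition')

# collapse_rfc2231_value turns that tuple back into a decoded string.
decoded = collapse_rfc2231_value(rawparam)

# get_filename offers a shortcut for this particular parameter.
filename = msg.get_filename()

print(rawparam[0], decoded, filename)
```

Reading the parameter back this way is a quick check that the charset chosen at write time is the one a consumer will use to decode the value.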
