Commit 2270d58

Make an entry for the os module's bytes accessors.
Split codecs into a separate section. Rewrite the Unicode section.
1 parent 03ca1a9 commit 2270d58

1 file changed

Lines changed: 49 additions & 37 deletions

Doc/whatsnew/3.2.rst

@@ -459,9 +459,9 @@ Some smaller changes made to the core Python language are:
 exceptions pass through::

 >>> class A:
-@property
-def f(self):
-return 1 // 0
+@property
+def f(self):
+return 1 // 0

 >>> a = A()
 >>> hasattr(a, 'f')
@@ -1135,6 +1135,28 @@ wrong results.

 (Patch submitted by Nir Aides in :issue:`7610`.)

+os
+--
+
+Different operating systems use various encodings for filenames and environment
+variables. The :mod:`os` module provides two new functions,
+:func:`~os.fsencode` and :func:`~os.fsdecode`, for encoding and decoding
+filenames:
+
+>>> filename = 'словарь'
+>>> os.fsencode(filename)
+b'\xd1\x81\xd0\xbb\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd1\x8c'
+>>> open(os.fsencode(filename))
+
+Some operating systems allow direct access to the unencoded bytes in the
+environment. If so, the :attr:`os.supports_bytes_environ` constant will be
+true.
+
+For direct access to unencoded environment variables (if available),
+use the new :func:`os.getenvb` function or use :data:`os.environb`
+which is a bytes version of :data:`os.environ`.
+
+
 shutil
 ------
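
Review note on the new ``os`` entry: the sketch below illustrates how the bytes accessors described in this hunk might be used together. It is a minimal example under stated assumptions, not part of the commit. The helper name ``getenv_bytes`` and the variable ``GR_DEMO_VAR`` are hypothetical and exist only for illustration.

```python
import os

# The bytes shown in the doctest above (UTF-8 for the Russian word "словарь").
raw = b'\xd1\x81\xd0\xbb\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x80\xd1\x8c'

# fsdecode() turns filesystem bytes into str and fsencode() reverses it,
# so a decode/encode round trip is expected to return the original bytes.
as_text = os.fsdecode(raw)
round_tripped = os.fsencode(as_text)


def getenv_bytes(name, default=None):
    """Hypothetical helper: read an environment variable as bytes.

    Uses os.getenvb() where the platform exposes a bytes environment,
    and falls back to re-encoding the str value everywhere else.
    """
    if os.supports_bytes_environ:
        return os.getenvb(name, default)
    value = os.getenv(os.fsdecode(name))
    return default if value is None else os.fsencode(value)


# Illustrative usage with a made-up variable name.
os.environ['GR_DEMO_VAR'] = 'abc'
demo_value = getenv_bytes(b'GR_DEMO_VAR')
missing_value = getenv_bytes(b'GR_NO_SUCH_VAR_12345', b'fallback')
```

Checking ``os.supports_bytes_environ`` first matters because ``os.getenvb`` and ``os.environb`` are only defined on platforms where that constant is true.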

@@ -1728,49 +1750,39 @@ multi-line arguments a bit faster (:issue:`7113` by Łukasz Langa).
 Unicode
 =======

-Python has been updated to Unicode 6.0.0. The new features of the
-Unicode Standard that will affect Python users include:
-
-* addition of 2,088 characters, including over 1,000 additional
-symbols—chief among them the additional emoji symbols, which are
-especially important for mobile phones;
+Python has been updated to `Unicode 6.0.0
+<http://unicode.org/versions/Unicode6.0.0/>`_. The update to the standard adds
+over 2,000 new characters including `emoji <http://en.wikipedia.org/wiki/Emoji>`_
+symbols which are important for mobile phones.

-* changes to character properties for existing characters including
+In addition, the updated standard has altered the character properties for two
+Kannada characters (U+0CF1, U+0CF2) and one New Tai Lue numeric character
+(U+19DA), making the former eligible for use in identifiers while disqualifying
+the latter. For more information, see `Unicode Character Database Changes
+<http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes>`_.

-- a general category change to two Kannada characters (U+0CF1,
-U+0CF2), which has the effect of making them newly eligible for
-inclusion in identifiers;

-- a general category change to one New Tai Lue numeric character
-(U+19DA), which has the effect of disqualifying it from
-inclusion in identifiers.
+Codecs
+======

-For more information, see `Unicode Character Database Changes
-<http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes>`_
-at the `Unicode Consortium <http://www.unicode.org/>`_ web site.
+Support was added for *cp720* Arabic DOS encoding (:issue:`1616979`).

-The :mod:`os` module has two new functions: :func:`~os.fsencode` and
-:func:`~os.fsdecode`. Add :data:`os.environb`: bytes version of
-:data:`os.environ`, :func:`os.getenvb` function and
-:data:`os.supports_bytes_environ` constant.
+MBCS encoding no longer ignores the error handler argument. In the default
+strict mode, it raises an :exc:`UnicodeDecodeError` when it encounters an
+undecodable byte sequence and an :exc:`UnicodeEncodeError` for an unencodable
+character.

-MBCS encoding doesn't ignore the error handler argument any more. By
-default (strict mode), it raises an UnicodeDecodeError on undecodable byte
-sequence and UnicodeEncodeError on unencodable character. To get the MBCS
-encoding of Python 3.1, use ``'ignore'`` error handler to decode and
-``'replace'`` error handler to encode. The MBCS codec supports ``'strict'`` and
-``'ignore'`` error handlers for decoding, and ``'strict'`` and ``'replace'``
-for encoding.
+The MBCS codec supports ``'strict'`` and ``'ignore'`` error handlers for
+decoding, and ``'strict'`` and ``'replace'`` for encoding.

-On Mac OS X, Python uses ``'utf-8'`` to decode the command line arguments,
-instead of the locale encoding (which is ISO-8859-1 if the ``LANG`` environment
-variable is not set).
+To emulate Python3.1 MBCS encoding, select the ``'ignore'`` handler for decoding
+and the ``'replace'`` handler for encoding.

-By default, tarfile uses ``'utf-8'`` encoding on Windows (instead of
-``'mbcs'``), and the ``'surrogateescape'`` error handler on all operating
-systems.
+On Mac OS/X, Python decodes command line arguments with ``'utf-8'`` rather than
+the locale encoding.

-Also, support was added for *cp720* Arabic DOS encoding (:issue:`1616979`).
+By default, tarfile uses ``'utf-8'`` encoding on Windows (instead of ``'mbcs'``)
+and the ``'surrogateescape'`` error handler on all operating systems.


 Documentation
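
Review note on the new Codecs section: the error-handler behaviour described for MBCS can be sketched as below. The ``mbcs`` codec exists only on Windows, so this sketch assumes the portable ``ascii`` codec as a stand-in. It illustrates the ``'strict'``, ``'ignore'`` and ``'replace'`` handlers in general and is not a demonstration of the MBCS codec itself.

```python
# Stand-in codec: 'ascii' is an assumption made for portability;
# on Windows the same handler names apply to 'mbcs'.
CODEC = 'ascii'

data = b'caf\xe9'   # the byte 0xE9 cannot be decoded as ASCII
text = 'caf\xe9'    # the character U+00E9 cannot be encoded as ASCII

# Default strict mode: an undecodable byte raises UnicodeDecodeError.
try:
    data.decode(CODEC)
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

# Default strict mode: an unencodable character raises UnicodeEncodeError.
try:
    text.encode(CODEC)
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# 'ignore' drops the offending byte while decoding.
decoded_ignore = data.decode(CODEC, 'ignore')

# 'replace' substitutes '?' for the offending character while encoding.
encoded_replace = text.encode(CODEC, 'replace')
```

The pairing of ``'ignore'`` for decoding with ``'replace'`` for encoding mirrors the combination the hunk recommends for emulating the Python 3.1 behaviour.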
