|
19 | 19 | |
20 | 20 | * The maintainer will go through Misc/NEWS periodically and add |
21 | 21 | changes; it's therefore more important to add your changes to |
22 | | - Misc/NEWS than to this file. |
| 22 | + Misc/NEWS than to this file. (Note: I didn't get to this for 3.0. |
| 23 | + GvR.) |
23 | 24 | |
24 | 25 | * This is not a complete list of every single change; completeness |
25 | 26 | is the purpose of Misc/NEWS. Some changes I consider too small |
|
40 | 41 | necessary (especially when a final release is some months away). |
41 | 42 | |
42 | 43 | * Credit the author of a patch or bugfix. Just the name is |
43 | | - sufficient; the e-mail address isn't necessary. |
| 44 | + sufficient; the e-mail address isn't necessary. (Due to time |
| 45 | + constraints I haven't managed to do this for 3.0. GvR.) |
44 | 46 | |
45 | 47 | * It's helpful to add the bug/patch number as a comment: |
46 | 48 | |
|
50 | 52 | (Contributed by P.Y. Developer.) |
51 | 53 | |
52 | 54 | This saves the maintainer the effort of going through the SVN log |
53 | | - when researching a change. |
| 55 | + when researching a change. (Again, I didn't get to this for 3.0. |
| 56 | + GvR.) |
54 | 57 |
|
55 | 58 | This article explains the new features in Python 3.0, compared to 2.6. |
56 | 59 | Python 3.0, also known as "Python 3000" or "Py3K", is the first ever |
@@ -157,20 +160,38 @@ XXX HIRO |
157 | 160 | always use an encoding to map between strings (in memory) and bytes |
158 | 161 | (on disk). Binary files (opened with a ``b`` in the mode argument) |
159 | 162 | always use bytes in memory. This means that if a file is opened |
160 | | - using an incorrect mode or encoding, I/O will likely fail. There is |
161 | | - a platform-dependent default encoding, which on Unixy platforms can |
162 | | - be set with the ``LANG`` environment variable (and sometimes also |
163 | | - with some other platform-specific locale-related environment |
164 | | - variables). In many cases, but not all, the system default is |
165 | | - UTF-8; you should never count on this default. Any application |
166 | | - reading or writing more than pure ASCII text should probably have a |
167 | | - way to override the encoding. |
| 163 | + using an incorrect mode or encoding, I/O will likely fail. It also |
| 164 | + means that even Unix users will have to specify the correct mode |
| 165 | + (text or binary) when opening a file. There is a platform-dependent |
| 166 | + default encoding, which on Unixy platforms can be set with the |
| 167 | + ``LANG`` environment variable (and sometimes also with some other |
| 168 | + platform-specific locale-related environment variables). In many |
| 169 | + cases, but not all, the system default is UTF-8; you should never |
| 170 | + count on this default. Any application reading or writing more than |
| 171 | + pure ASCII text should probably have a way to override the encoding. |
168 | 172 |
|
169 | 173 | * The builtin :class:`basestring` abstract type was removed. Use |
170 | 174 | :class:`str` instead. The :class:`str` and :class:`bytes` types |
171 | 175 | don't have enough functionality in common to warrant a shared base |
172 | 176 | class. |
173 | 177 |
|
| 178 | +* Filenames are passed to and returned from APIs as (Unicode) strings. |
| 179 | + This can present platform-specific problems because on some |
| 180 | + platforms filenames are arbitrary byte strings. (On the other hand, |
| 181 | + on Windows filenames are natively stored as Unicode.) As a |
| 182 | + work-around, most APIs (e.g. :func:`open` and many functions in the |
| 183 | + :mod:`os` module) that take filenames accept :class:`bytes` objects |
| 184 | + as well as strings, and a few APIs have a way to ask for a |
| 185 | + :class:`bytes` return value: :func:`os.listdir` returns a |
| 186 | + :class:`bytes` instance if the argument is a :class:`bytes` |
| 187 | + instance, and :func:`os.getcwdb` returns the current working |
| 188 | + directory as a :class:`bytes` instance. |
| 189 | + |
| 190 | +* Some system APIs like :data:`os.environ` and :data:`sys.argv` can |
| 191 | + also present problems when the bytes made available by the system are |
| 192 | + not interpretable using the default encoding. Setting the ``LANG`` |
| 193 | + variable and rerunning the program is probably the best approach. |
| 194 | + |
174 | 195 | * All backslashes in raw strings are interpreted literally. This |
175 | 196 | means that ``'\U'`` and ``'\u'`` escapes in raw strings are not |
176 | 197 | treated specially. |
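The text/binary and filename points in this hunk can be illustrated with a minimal sketch. It is not part of the patch; the temporary directory and the file name ``example.txt`` are hypothetical, and :func:`os.fsencode` assumes a later 3.x release (3.2 or newer) than the one this document describes.

```python
import os
import tempfile

# Hypothetical scratch directory and file name, used only for this sketch.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.txt")

    # Text mode: pass the encoding explicitly rather than relying on the
    # platform-dependent default.
    with open(path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9")

    # Binary mode: the same file is read as bytes, with no decoding applied.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode again: the bytes are decoded back into a str.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # A str argument yields str names; a bytes argument yields bytes names.
    str_names = os.listdir(tmpdir)
    bytes_names = os.listdir(os.fsencode(tmpdir))
```

Here ``raw`` holds the UTF-8 encoded bytes while ``text`` holds the decoded string, and the two :func:`os.listdir` calls differ only in the type of their argument and result.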
@@ -439,7 +460,7 @@ consulted for longer descriptions. |
439 | 460 | start deprecating the ``%`` operator in Python 3.1. |
440 | 461 |
|
441 | 462 | * :ref:`pep-3105`. This is now a standard feature and no longer needs |
442 | | - to be imported from :mod:`__future__`. |
| 463 | + to be imported from :mod:`__future__`. More details were given above. |
443 | 464 |
|
444 | 465 | * :ref:`pep-3110`. The :keyword:`except` *exc* :keyword:`as` *var* |
445 | 466 | syntax is now standard and :keyword:`except` *exc*, *var* is no |
|