Commit 080a2c0

#16127: merge with 3.3.
2 parents b176203 + e7f9037 commit 080a2c0

4 files changed

Lines changed: 6 additions & 17 deletions


Doc/c-api/unicode.rst

Lines changed: 0 additions & 2 deletions
@@ -1083,8 +1083,6 @@ These are the UTF-32 codec APIs:
  After completion, *\*byteorder* is set to the current byte order at the end
  of input data.

- In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
-
  If *byteorder* is *NULL*, the codec starts in native order mode.

  Return *NULL* if an exception was raised by the codec.
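
The effect of dropping the narrow-build note can be observed from Python. This is a minimal sketch, assuming Python 3.3 or later; the helper name `decode_utf32` is hypothetical and not part of the commit.

```python
def decode_utf32(data: bytes) -> str:
    """Hypothetical helper: decode UTF-32 bytes; a leading BOM selects the byte order."""
    return data.decode("utf-32")


# U+1F600 lies outside the BMP. The generic "utf-32" codec prepends a BOM
# in native byte order when encoding.
encoded = "\U0001F600".encode("utf-32")
decoded = decode_utf32(encoded)

# Assumption: Python 3.3+, where the result is a single code point
# rather than a surrogate pair.
print(len(decoded), hex(ord(decoded)))
```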

Doc/reference/lexical_analysis.rst

Lines changed: 1 addition & 3 deletions
@@ -538,9 +538,7 @@ Notes:
  this escape sequence. Exactly four hex digits are required.

  (6)
- Any Unicode character can be encoded this way, but characters outside the Basic
- Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is
- compiled to use 16-bit code units (the default). Exactly eight hex digits
+ Any Unicode character can be encoded this way. Exactly eight hex digits
  are required.


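
As a rough illustration of the two escape forms this note covers, the sketch below compares the four-digit and eight-digit spellings. It assumes Python 3.3 or later; the variable names are illustrative only.

```python
# \u takes exactly four hex digits, \U exactly eight.
bmp_short = "\u20ac"       # EURO SIGN via the four-digit form
bmp_long = "\U000020ac"    # the same character via the eight-digit form
astral = "\U0001F600"      # outside the BMP, so a single \u escape cannot express it

# Both spellings produce the same one-character string.
same_character = bmp_short == bmp_long

# Assumption: Python 3.3+, where the non-BMP character is one code point.
astral_length = len(astral)
```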

Include/unicodeobject.h

Lines changed: 1 addition & 2 deletions
@@ -1022,8 +1022,7 @@ PyAPI_FUNC(void*) _PyUnicode_AsKind(PyObject *s, unsigned int kind);

  /* Create a Unicode Object from the given Unicode code point ordinal.

- The ordinal must be in range(0x10000) on narrow Python builds
- (UCS2), and range(0x110000) on wide builds (UCS4). A ValueError is
+ The ordinal must be in range(0x110000). A ValueError is
  raised in case it is not.

  */
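
The single range described in the updated comment can be checked at the Python level. The sketch below uses the built-in `chr()` as a stand-in for the C function documented here; `from_ordinal` is a hypothetical wrapper, not an API from the commit.

```python
def from_ordinal(ordinal: int) -> str:
    """Hypothetical helper: Python-level analogue built on the built-in chr()."""
    return chr(ordinal)


largest = from_ordinal(0x10FFFF)    # last code point inside range(0x110000)

try:
    from_ordinal(0x110000)          # one past the valid range
except ValueError as exc:
    error = exc                     # chr() rejects it with ValueError
else:
    error = None
```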

Objects/unicodeobject.c

Lines changed: 4 additions & 10 deletions
@@ -5800,18 +5800,12 @@ PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
  void *data;
  Py_ssize_t expandsize = 0;

- /* Initial allocation is based on the longest-possible unichr
+ /* Initial allocation is based on the longest-possible character
  escape.

- In wide (UTF-32) builds '\U00xxxxxx' is 10 chars per source
- unichr, so in this case it's the longest unichr escape. In
- narrow (UTF-16) builds this is five chars per source unichr
- since there are two unichrs in the surrogate pair, so in narrow
- (UTF-16) builds it's not the longest unichr escape.
-
- In wide or narrow builds '\uxxxx' is 6 chars per source unichr,
- so in the narrow (UTF-16) build case it's the longest unichr
- escape.
+ For UCS1 strings it's '\xxx', 4 bytes per source character.
+ For UCS2 strings it's '\uxxxx', 6 bytes per source character.
+ For UCS4 strings it's '\U00xxxxxx', 10 bytes per source character.
  */

  if (!PyUnicode_Check(unicode)) {
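
The per-character bounds in the new comment can be sanity-checked from Python with the `unicode_escape` codec. This is a sketch under assumptions, not the C implementation: `worst_case_escape_len` is a hypothetical helper that picks 4, 6 or 10 bytes per character from the widest code point in the string.

```python
def worst_case_escape_len(text: str) -> int:
    """Hypothetical helper: upper bound on the unicode_escape output size.

    Follows the comment in the hunk: 4, 6 or 10 bytes per source
    character, chosen by the widest code point in the string.
    """
    widest = max(map(ord, text), default=0)
    if widest < 0x100:
        per_char = 4     # '\xhh'
    elif widest < 0x10000:
        per_char = 6     # '\uxxxx'
    else:
        per_char = 10    # '\Uxxxxxxxx'
    return per_char * len(text)


# Compare the bound with what the codec actually produces.
samples = ["\x80", "\u20ac", "\U0001F600", "abc"]
for sample in samples:
    actual = len(sample.encode("unicode_escape"))
    assert actual <= worst_case_escape_len(sample)
```

Plain ASCII text such as `"abc"` is emitted unescaped, so the bound is loose there; it only has to be large enough for the initial allocation.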
