Commit d3faf43
Issue #23181: More "codepoint" -> "code point".
1 parent b2653b3 commit d3faf43

24 files changed

Lines changed: 46 additions & 46 deletions

Doc/c-api/unicode.rst

Lines changed: 1 addition & 1 deletion
@@ -1134,7 +1134,7 @@ These are the UTF-32 codec APIs:
 mark (U+FEFF). In the other two modes, no BOM mark is prepended.
 
 If *Py_UNICODE_WIDE* is not defined, surrogate pairs will be output
-as a single codepoint.
+as a single code point.
 
 Return *NULL* if an exception was raised by the codec.
 

Doc/library/codecs.rst

Lines changed: 6 additions & 6 deletions
@@ -827,7 +827,7 @@ methods and attributes from the underlying stream.
 Encodings and Unicode
 ---------------------
 
-Strings are stored internally as sequences of codepoints in
+Strings are stored internally as sequences of code points in
 range ``0x0``-``0x10FFFF``. (See :pep:`393` for
 more details about the implementation.)
 Once a string object is used outside of CPU and memory, endianness
@@ -838,23 +838,23 @@ There are a variety of different text serialisation codecs, which are
 collectivity referred to as :term:`text encodings <text encoding>`.
 
 The simplest text encoding (called ``'latin-1'`` or ``'iso-8859-1'``) maps
-the codepoints 0-255 to the bytes ``0x0``-``0xff``, which means that a string
-object that contains codepoints above ``U+00FF`` can't be encoded with this
+the code points 0-255 to the bytes ``0x0``-``0xff``, which means that a string
+object that contains code points above ``U+00FF`` can't be encoded with this
 codec. Doing so will raise a :exc:`UnicodeEncodeError` that looks
 like the following (although the details of the error message may differ):
 ``UnicodeEncodeError: 'latin-1' codec can't encode character '\u1234' in
 position 3: ordinal not in range(256)``.
 
 There's another group of encodings (the so called charmap encodings) that choose
-a different subset of all Unicode code points and how these codepoints are
+a different subset of all Unicode code points and how these code points are
 mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
 e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on
 Windows). There's a string constant with 256 characters that shows you which
 character is mapped to which byte value.
 
-All of these encodings can only encode 256 of the 1114112 codepoints
+All of these encodings can only encode 256 of the 1114112 code points
 defined in Unicode. A simple and straightforward way that can store each Unicode
-code point, is to store each codepoint as four consecutive bytes. There are two
+code point, is to store each code point as four consecutive bytes. There are two
 possibilities: store the bytes in big endian or in little endian order. These
 two encodings are called ``UTF-32-BE`` and ``UTF-32-LE`` respectively. Their
 disadvantage is that if e.g. you use ``UTF-32-BE`` on a little endian machine you
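
Reviewer note (not part of the commit): the behaviour described in this hunk can be checked with a short sketch. It assumes a Python 3 interpreter; the variable names are illustrative only.

```python
# Latin-1 covers only code points 0-255, so U+1234 cannot be encoded.
text = "abc\u1234"
try:
    text.encode("latin-1")
except UnicodeEncodeError as exc:
    failed_at = exc.start   # index of the first character that failed
    reason = exc.reason     # the codec's explanation

# UTF-32 stores every code point in four bytes; only the byte order differs.
big = "A".encode("utf-32-be")
little = "A".encode("utf-32-le")

print(failed_at, reason)
print(big, little)
```

The two UTF-32 results contain the same four bytes in opposite order, which is the endianness issue the surrounding documentation text goes on to discuss.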

Doc/library/email.mime.rst

Lines changed: 1 addition & 1 deletion
@@ -194,7 +194,7 @@ Here are the classes:
 minor type and defaults to :mimetype:`plain`. *_charset* is the character
 set of the text and is passed as an argument to the
 :class:`~email.mime.nonmultipart.MIMENonMultipart` constructor; it defaults
-to ``us-ascii`` if the string contains only ``ascii`` codepoints, and
+to ``us-ascii`` if the string contains only ``ascii`` code points, and
 ``utf-8`` otherwise.
 
 Unless the *_charset* argument is explicitly set to ``None``, the
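
Reviewer note (not part of the commit): a minimal sketch of the default charset selection described above, assuming a Python version in which `MIMEText` chooses the charset when *_charset* is omitted (the behaviour this documentation text describes).

```python
from email.mime.text import MIMEText

# Only ASCII code points: the charset should default to us-ascii.
ascii_msg = MIMEText("hello")
# Contains U+00E9, which is outside ASCII: the charset should fall back to utf-8.
other_msg = MIMEText("h\u00e9llo")

ascii_charset = ascii_msg.get_content_charset()
other_charset = other_msg.get_content_charset()
print(ascii_charset, other_charset)
```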

Doc/library/functions.rst

Lines changed: 1 addition & 1 deletion
@@ -156,7 +156,7 @@ are always available. They are listed here in alphabetical order.
 
 .. function:: chr(i)
 
-Return the string representing a character whose Unicode codepoint is the integer
+Return the string representing a character whose Unicode code point is the integer
 *i*. For example, ``chr(97)`` returns the string ``'a'``. This is the
 inverse of :func:`ord`. The valid range for the argument is from 0 through
 1,114,111 (0x10FFFF in base 16). :exc:`ValueError` will be raised if *i* is
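
Reviewer note (not part of the commit): the `chr`/`ord` relationship and the valid range stated in this hunk can be sketched as follows; the variable names are illustrative.

```python
letter = chr(97)                     # the character at code point 97
roundtrip = ord(chr(0x10FFFF))       # ord() inverts chr() at the top of the range

# One past the last valid code point is rejected.
try:
    chr(0x110000)
except ValueError:
    out_of_range_rejected = True
else:
    out_of_range_rejected = False

print(letter, hex(roundtrip), out_of_range_rejected)
```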

Doc/library/html.entities.rst

Lines changed: 2 additions & 2 deletions
@@ -33,12 +33,12 @@ This module defines four dictionaries, :data:`html5`,
 
 .. data:: name2codepoint
 
-A dictionary that maps HTML entity names to the Unicode codepoints.
+A dictionary that maps HTML entity names to the Unicode code points.
 
 
 .. data:: codepoint2name
 
-A dictionary that maps Unicode codepoints to HTML entity names.
+A dictionary that maps Unicode code points to HTML entity names.
 
 
 .. rubric:: Footnotes
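
Reviewer note (not part of the commit): the two dictionaries touched by this hunk are inverse lookups between entity names and integer code points. A small sketch, with illustrative variable names:

```python
from html.entities import codepoint2name, name2codepoint

amp_code = name2codepoint["amp"]         # entity name -> code point (an int)
amp_name = codepoint2name[amp_code]      # code point -> entity name
eacute = chr(name2codepoint["eacute"])   # convert the code point to a character

print(amp_code, amp_name, eacute)
```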

Doc/library/json.rst

Lines changed: 1 addition & 1 deletion
@@ -512,7 +512,7 @@ The RFC does not explicitly forbid JSON strings which contain byte sequences
 that don't correspond to valid Unicode characters (e.g. unpaired UTF-16
 surrogates), but it does note that they may cause interoperability problems.
 By default, this module accepts and outputs (when present in the original
-:class:`str`) codepoints for such sequences.
+:class:`str`) code points for such sequences.
 
 
 Infinite and NaN Number Values
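
Reviewer note (not part of the commit): a sketch of the default handling of an unpaired surrogate described in this hunk, assuming the default `ensure_ascii=True` setting of `json.dumps`.

```python
import json

lone = "\ud800"                 # an unpaired UTF-16 high surrogate
encoded = json.dumps(lone)      # emitted as an escape sequence, no error raised
decoded = json.loads(encoded)   # accepted again on input

print(encoded, decoded == lone)
```

Other JSON implementations may reject such input, which is the interoperability concern the documentation text mentions.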

Doc/tutorial/datastructures.rst

Lines changed: 1 addition & 1 deletion
@@ -684,7 +684,7 @@ the same type, the lexicographical comparison is carried out recursively. If
 all items of two sequences compare equal, the sequences are considered equal.
 If one sequence is an initial sub-sequence of the other, the shorter sequence is
 the smaller (lesser) one. Lexicographical ordering for strings uses the Unicode
-codepoint number to order individual characters. Some examples of comparisons
+code point number to order individual characters. Some examples of comparisons
 between sequences of the same type::
 
 (1, 2, 3) < (1, 2, 4)
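
Reviewer note (not part of the commit): because strings are ordered by code point number, uppercase ASCII letters sort before lowercase ones, and accented letters sort after both. A short sketch with illustrative data:

```python
words = ["b", "a", "B", "\u00e9"]
ordered = sorted(words)                      # ordered by code point, not alphabetically

upper_before_lower = "Z" < "a"               # ord("Z") is 90, ord("a") is 97
code_points = [ord(ch) for ch in ordered]

print(ordered, code_points, upper_before_lower)
```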

Doc/whatsnew/3.3.rst

Lines changed: 6 additions & 6 deletions
@@ -228,7 +228,7 @@ Functionality
 
 Changes introduced by :pep:`393` are the following:
 
-* Python now always supports the full range of Unicode codepoints, including
+* Python now always supports the full range of Unicode code points, including
 non-BMP ones (i.e. from ``U+0000`` to ``U+10FFFF``). The distinction between
 narrow and wide builds no longer exists and Python now behaves like a wide
 build, even under Windows.
@@ -246,7 +246,7 @@ Changes introduced by :pep:`393` are the following:
 so ``'\U0010FFFF'[0]`` now returns ``'\U0010FFFF'`` and not ``'\uDBFF'``;
 
 * all other functions in the standard library now correctly handle
-non-BMP codepoints.
+non-BMP code points.
 
 * The value of :data:`sys.maxunicode` is now always ``1114111`` (``0x10FFFF``
 in hexadecimal). The :c:func:`PyUnicode_GetMax` function still returns
@@ -258,13 +258,13 @@ Changes introduced by :pep:`393` are the following:
 Performance and resource usage
 ------------------------------
 
-The storage of Unicode strings now depends on the highest codepoint in the string:
+The storage of Unicode strings now depends on the highest code point in the string:
 
-* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per codepoint;
+* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;
 
-* BMP strings (``U+0000-U+FFFF``) use 2 bytes per codepoint;
+* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
 
-* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per codepoint.
+* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.
 
 The net effect is that for most applications, memory usage of string
 storage should decrease significantly - especially compared to former
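
Reviewer note (not part of the commit): the per-code-point storage widths listed above can be observed on CPython 3.3 or later with a rough measurement. This is a sketch that relies on `sys.getsizeof` reflecting the PEP 393 layout, which is a CPython implementation detail; `bytes_per_code_point` is a hypothetical helper introduced only for this illustration.

```python
import sys

def bytes_per_code_point(ch, n=1000):
    """Estimate storage per code point by growing a string of one repeated character.

    Hypothetical helper: subtracting the size of the one-character string
    cancels the fixed object header, leaving n characters' worth of storage.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n

widths = {
    "ascii": bytes_per_code_point("a"),
    "latin1": bytes_per_code_point("\u00e9"),
    "bmp": bytes_per_code_point("\u20ac"),
    "non_bmp": bytes_per_code_point("\U0001F600"),
}

# A non-BMP character is a single item in the string, as on a wide build.
single_length = len("\U0010FFFF")

print(widths, sys.maxunicode, single_length)
```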

Lib/codecs.py

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ class Codec:
 Python will use the official U+FFFD REPLACEMENT
 CHARACTER for the builtin Unicode codecs on
 decoding and '?' on encoding.
-'surrogateescape' - replace with private codepoints U+DCnn.
+'surrogateescape' - replace with private code points U+DCnn.
 'xmlcharrefreplace' - Replace with the appropriate XML
 character reference (only for encoding).
 'backslashreplace' - Replace with backslashed escape sequences
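
Reviewer note (not part of the commit): a sketch of the `'surrogateescape'` handler mentioned in this docstring. An undecodable byte `0xnn` becomes the code point `U+DCnn` on decoding, and the same handler restores the original byte on encoding. The sample bytes are illustrative.

```python
raw = b"abc\xff"    # 0xFF can never appear in valid UTF-8

text = raw.decode("utf-8", errors="surrogateescape")
restored = text.encode("utf-8", errors="surrogateescape")

print(ascii(text), restored)
```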

Lib/email/message.py

Lines changed: 1 addition & 1 deletion
@@ -273,7 +273,7 @@ def get_payload(self, i=None, decode=False):
 bpayload = payload.encode('ascii')
 except UnicodeError:
 # This won't happen for RFC compliant messages (messages
-# containing only ASCII codepoints in the unicode input).
+# containing only ASCII code points in the unicode input).
 # If it does happen, turn the string into bytes in a way
 # guaranteed not to fail.
 bpayload = payload.encode('raw-unicode-escape')
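
Reviewer note (not part of the commit): the fallback shown in this hunk can be isolated as a standalone sketch. `payload_to_bytes` is a hypothetical helper written for illustration; it is not a function in `email.message`.

```python
def payload_to_bytes(payload):
    """Hypothetical helper mirroring the try/except pattern in the hunk above."""
    try:
        # RFC-compliant payloads contain only ASCII code points.
        return payload.encode("ascii")
    except UnicodeError:
        # Fallback: Latin-1 range characters become single bytes and
        # anything higher becomes a \uXXXX escape.
        return payload.encode("raw-unicode-escape")

plain = payload_to_bytes("hello")
mixed = payload_to_bytes("caf\u00e9 \u20ac")
print(plain, mixed)
```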
