@@ -827,7 +827,7 @@ methods and attributes from the underlying stream.
 Encodings and Unicode
 ---------------------
 
-Strings are stored internally as sequences of codepoints in
+Strings are stored internally as sequences of code points in
 range ``0x0``-``0x10FFFF``. (See :pep:`393` for
 more details about the implementation.)
 Once a string object is used outside of CPU and memory, endianness
@@ -838,23 +838,23 @@ There are a variety of different text serialisation codecs, which are
 collectively referred to as :term:`text encodings <text encoding>`.
 
 The simplest text encoding (called ``'latin-1'`` or ``'iso-8859-1'``) maps
-the codepoints 0-255 to the bytes ``0x0``-``0xff``, which means that a string
-object that contains codepoints above ``U+00FF`` can't be encoded with this
+the code points 0-255 to the bytes ``0x0``-``0xff``, which means that a string
+object that contains code points above ``U+00FF`` can't be encoded with this
 codec. Doing so will raise a :exc:`UnicodeEncodeError` that looks
 like the following (although the details of the error message may differ):
 ``UnicodeEncodeError: 'latin-1' codec can't encode character '\u1234' in
 position 3: ordinal not in range(256)``.
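
As a rough illustration of the behaviour described above, the following sketch
encodes one string that fits into ``'latin-1'`` and one that does not. The
sample strings and the names ``encoded``, ``round_trip`` and ``error`` are
invented for this example; the exact wording of the error message may differ
between Python versions.

```python
# Code points 0-255 map one-to-one onto the bytes 0x00-0xff under latin-1.
encoded = "caf\u00e9".encode("latin-1")

# Every possible byte value decodes to the code point with the same number,
# so decoding and re-encoding all 256 bytes is lossless.
all_bytes = bytes(range(256))
round_trip = all_bytes.decode("latin-1").encode("latin-1")

# A code point above U+00FF has no byte to map to, so encoding fails.
error = None
try:
    "abc\u1234".encode("latin-1")
except UnicodeEncodeError as exc:
    error = exc
    # start is the index of the offending character in the input string.
    print(exc.start, exc.reason)
```

The exception object carries the position and reason as attributes
(``start``, ``end``, ``reason``), which is usually more robust than parsing
the message text.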
 
 There's another group of encodings (the so-called charmap encodings) that choose
-a different subset of all Unicode code points and how these codepoints are
+a different subset of all Unicode code points and how these code points are
 mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
 e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on
 Windows). There's a string constant with 256 characters that shows you which
 character is mapped to which byte value.
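
The table mentioned above can also be inspected from Python. This is only a
sketch: it assumes the CPython layout in which the ``encodings.cp1252`` module
exposes the string constant under the name ``decoding_table``, which is an
implementation detail rather than a documented interface. The euro sign is
used as the sample character, and the other variable names are invented for
this example.

```python
import encodings.cp1252 as cp1252

# Assumed CPython detail: a 256-character string, indexed by byte value.
table = cp1252.decoding_table

# cp1252 assigns byte 0x80 to the euro sign (U+20AC) ...
euro_byte = "\u20ac".encode("cp1252")
euro_char = table[0x80]

# ... whereas latin-1 maps the same byte to the code point U+0080.
latin1_char = b"\x80".decode("latin-1")

print(len(table), euro_byte, hex(ord(euro_char)), hex(ord(latin1_char)))
```

The same byte value therefore stands for different characters depending on
which charmap encoding is used to decode it.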
 
-All of these encodings can only encode 256 of the 1114112 codepoints
+All of these encodings can only encode 256 of the 1114112 code points
 defined in Unicode. A simple and straightforward way that can store each Unicode
-code point is to store each codepoint as four consecutive bytes. There are two
+code point is to store each code point as four consecutive bytes. There are two
 possibilities: store the bytes in big endian or in little endian order. These
 two encodings are called ``UTF-32-BE`` and ``UTF-32-LE`` respectively. Their
 disadvantage is that if e.g. you use ``UTF-32-BE`` on a little endian machine you