Commit 95cd91c

#11840: Improve c-api/unicode documentation. Patch by Sandro Tosi.
1 parent 832c8bb commit 95cd91c

1 file changed

Lines changed: 27 additions & 29 deletions

Doc/c-api/unicode.rst

@@ -329,8 +329,8 @@ APIs:
 incremented refcount.
 
 :class:`bytes`, :class:`bytearray` and other char buffer compatible objects
-are decoded according to the given encoding and using the error handling
-defined by errors. Both can be *NULL* to have the interface use the default
+are decoded according to the given *encoding* and using the error handling
+defined by *errors*. Both can be *NULL* to have the interface use the default
 values (see the next section for details).
 
 All other objects, including Unicode objects, cause a :exc:`TypeError` to be
@@ -390,12 +390,12 @@ used, passing :cfunc:`PyUnicode_FSConverter` as the conversion function:
 wchar_t Support
 """""""""""""""
 
-wchar_t support for platforms which support it:
+:ctype:`wchar_t` support for platforms which support it:
 
 .. cfunction:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
 
-Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given size.
-Passing -1 as the size indicates that the function must itself compute the length,
+Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given *size*.
+Passing -1 as the *size* indicates that the function must itself compute the length,
 using wcslen.
 Return *NULL* on failure.
 
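The two meanings of *size* described in this hunk can be tried out from Python. The following is a minimal sketch, assuming a CPython interpreter where ``ctypes.pythonapi`` exposes ``PyUnicode_FromWideChar``; the variable names are illustrative only.

```python
import ctypes

# Assumption: CPython, where ctypes.pythonapi gives access to the C API.
from_wide = ctypes.pythonapi.PyUnicode_FromWideChar
from_wide.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
# The returned new reference is handed back as an ordinary Python object.
from_wide.restype = ctypes.py_object

# size = -1: the function computes the length itself,
# reading up to the terminating null character.
whole = from_wide("wide text", -1)

# Explicit size: only the first four wchar_t units are used.
prefix = from_wide("wide text", 4)

print(whole, prefix)
```

ctypes converts the Python string into a null-terminated ``wchar_t`` buffer before the call, which is why passing -1 is safe here.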
@@ -419,15 +419,15 @@ Built-in Codecs
 Python provides a set of built-in codecs which are written in C for speed. All of
 these codecs are directly usable via the following functions.
 
-Many of the following APIs take two arguments encoding and errors. These
-parameters encoding and errors have the same semantics as the ones of the
-built-in :func:`str` string object constructor.
+Many of the following APIs take two arguments encoding and errors, and they
+have the same semantics as the ones of the built-in :func:`str` string object
+constructor.
 
 Setting encoding to *NULL* causes the default encoding to be used
 which is ASCII. The file system calls should use
 :cfunc:`PyUnicode_FSConverter` for encoding file names. This uses the
 variable :cdata:`Py_FileSystemDefaultEncoding` internally. This
-variable should be treated as read-only: On some systems, it will be a
+variable should be treated as read-only: on some systems, it will be a
 pointer to a static string, on others, it will change at run-time
 (such as when the application invokes setlocale).
 
@@ -456,7 +456,7 @@ These are the generic codec APIs:
 
 .. cfunction:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size and return a Python
+Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* and return a Python
 bytes object. *encoding* and *errors* have the same meaning as the
 parameters of the same name in the Unicode :meth:`encode` method. The codec
 to be used is looked up using the Python codec registry. Return *NULL* if an
@@ -494,7 +494,7 @@ These are the UTF-8 codec APIs:
 
 .. cfunction:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size using UTF-8 and
+Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
 return a Python bytes object. Return *NULL* if an exception was raised by
 the codec.
 
@@ -514,7 +514,7 @@ These are the UTF-32 codec APIs:
 
 .. cfunction:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
 
-Decode *length* bytes from a UTF-32 encoded buffer string and return the
+Decode *size* bytes from a UTF-32 encoded buffer string and return the
 corresponding Unicode object. *errors* (if non-*NULL*) defines the error
 handling. It defaults to "strict".
 
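The rename from *length* to *size* matches the parameter in the signature, which counts bytes rather than characters. A small sketch of that distinction, assuming CPython and its ``ctypes.pythonapi`` handle; the sample text and variable names are made up for illustration:

```python
import ctypes

# Assumption: CPython, where ctypes.pythonapi exposes PyUnicode_DecodeUTF32.
decode_utf32 = ctypes.pythonapi.PyUnicode_DecodeUTF32
decode_utf32.argtypes = [
    ctypes.c_char_p,               # s
    ctypes.c_ssize_t,              # size, in bytes
    ctypes.c_char_p,               # errors (None is passed as NULL)
    ctypes.POINTER(ctypes.c_int),  # byteorder
]
decode_utf32.restype = ctypes.py_object

data = "h\u00e9llo".encode("utf-32-le")  # 5 code points, 4 bytes each
byteorder = ctypes.c_int(-1)             # -1 selects little-endian decoding

# Decode the whole buffer: size is len(data) == 20 bytes.
text = decode_utf32(data, len(data), None, ctypes.byref(byteorder))

# Decode only the first 8 bytes, that is, the first two code points.
first_two = decode_utf32(data, 8, None, ctypes.byref(byteorder))

print(text, first_two)
```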
@@ -582,7 +582,7 @@ These are the UTF-16 codec APIs:
 
 .. cfunction:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
 
-Decode *length* bytes from a UTF-16 encoded buffer string and return the
+Decode *size* bytes from a UTF-16 encoded buffer string and return the
 corresponding Unicode object. *errors* (if non-*NULL*) defines the error
 handling. It defaults to "strict".
 
@@ -714,7 +714,7 @@ These are the "Raw Unicode Escape" codec APIs:
 
 .. cfunction:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size using Raw-Unicode-Escape
+Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
 and return a Python string object. Return *NULL* if an exception was raised by
 the codec.
 
@@ -741,7 +741,7 @@ ordinals and only these are accepted by the codecs during encoding.
 
 .. cfunction:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size using Latin-1 and
+Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Latin-1 and
 return a Python bytes object. Return *NULL* if an exception was raised by
 the codec.
 
@@ -768,7 +768,7 @@ codes generate errors.
 
 .. cfunction:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size using ASCII and
+Encode the :ctype:`Py_UNICODE` buffer of the given *size* using ASCII and
 return a Python bytes object. Return *NULL* if an exception was raised by
 the codec.
 
@@ -783,8 +783,6 @@ codes generate errors.
 Character Map Codecs
 """"""""""""""""""""
 
-These are the mapping codec APIs:
-
 This codec is special in that it can be used to implement many different codecs
 (and this is in fact what was done to obtain most of the standard codecs
 included in the :mod:`encodings` package). The codec uses mapping to encode and
@@ -806,6 +804,7 @@ meaning that its ordinal value will be interpreted as Unicode or Latin-1 ordinal
 resp. Because of this, mappings only need to contain those mappings which map
 characters to different code points.
 
+These are the mapping codec APIs:
 
 .. cfunction:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
 
@@ -819,7 +818,7 @@ characters to different code points.
 
 .. cfunction:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size using the given
+Encode the :ctype:`Py_UNICODE` buffer of the given *size* using the given
 *mapping* object and return a Python string object. Return *NULL* if an
 exception was raised by the codec.
 
@@ -835,7 +834,7 @@ The following codec API is special in that maps Unicode to Unicode.
 
 .. cfunction:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
 
-Translate a :ctype:`Py_UNICODE` buffer of the given length by applying a
+Translate a :ctype:`Py_UNICODE` buffer of the given *size* by applying a
 character mapping *table* to it and return the resulting Unicode object. Return
 *NULL* when an exception was raised by the codec.
 
@@ -847,16 +846,15 @@ The following codec API is special in that maps Unicode to Unicode.
 :exc:`LookupError`) are left untouched and are copied as-is.
 
 
+MBCS codecs for Windows
+"""""""""""""""""""""""
+
 These are the MBCS codec APIs. They are currently only available on Windows and
 use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
 DBCS) is a class of encodings, not just one. The target encoding is defined by
 the user settings on the machine running the codec.
 
 
-MBCS codecs for Windows
-"""""""""""""""""""""""
-
-
 .. cfunction:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
 
 Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
@@ -873,7 +871,7 @@ MBCS codecs for Windows
 
 .. cfunction:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
 
-Encode the :ctype:`Py_UNICODE` buffer of the given size using MBCS and return
+Encode the :ctype:`Py_UNICODE` buffer of the given *size* using MBCS and return
 a Python bytes object. Return *NULL* if an exception was raised by the
 codec.
 
@@ -908,7 +906,7 @@ They all return *NULL* or ``-1`` if an exception occurs.
 
 .. cfunction:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
 
-Split a string giving a list of Unicode strings. If sep is *NULL*, splitting
+Split a string giving a list of Unicode strings. If *sep* is *NULL*, splitting
 will be done at all whitespace substrings. Otherwise, splits occur at the given
 separator. At most *maxsplit* splits will be done. If negative, no limit is
 set. Separators are not included in the resulting list.
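The *NULL* separator and the *maxsplit* limit described in this hunk can be checked with a short sketch. It assumes CPython and ``ctypes.pythonapi``; declaring *sep* as a raw pointer is a choice made here only so that ``None`` can be passed as *NULL*.

```python
import ctypes

# Assumption: CPython, where ctypes.pythonapi exposes PyUnicode_Split.
split = ctypes.pythonapi.PyUnicode_Split
# sep is declared as c_void_p so that None is converted to a NULL pointer.
split.argtypes = [ctypes.py_object, ctypes.c_void_p, ctypes.c_ssize_t]
split.restype = ctypes.py_object

text = "alpha  beta\tgamma"

# sep = NULL, maxsplit negative: split at every whitespace run, no limit.
all_words = split(text, None, -1)

# sep = NULL, maxsplit = 1: at most one split is done.
one_split = split(text, None, 1)

print(all_words, one_split)
```

The results correspond to ``text.split()`` and ``text.split(None, 1)`` at the Python level.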
@@ -939,20 +937,20 @@ They all return *NULL* or ``-1`` if an exception occurs.
 
 .. cfunction:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
 
-Join a sequence of strings using the given separator and return the resulting
+Join a sequence of strings using the given *separator* and return the resulting
 Unicode string.
 
 
 .. cfunction:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
 
-Return 1 if *substr* matches *str*[*start*:*end*] at the given tail end
+Return 1 if *substr* matches ``str[start:end]`` at the given tail end
 (*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match),
 0 otherwise. Return ``-1`` if an error occurred.
 
 
 .. cfunction:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
 
-Return the first position of *substr* in *str*[*start*:*end*] using the given
+Return the first position of *substr* in ``str[start:end]`` using the given
 *direction* (*direction* == 1 means to do a forward search, *direction* == -1 a
 backward search). The return value is the index of the first match; a value of
 ``-1`` indicates that no match was found, and ``-2`` indicates that an error

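The *direction* argument has different meanings for the two functions in the last hunk, which is easy to mix up. The sketch below exercises both, assuming CPython and ``ctypes.pythonapi``; the sample string and variable names are illustrative.

```python
import ctypes

# Assumption: CPython, where ctypes.pythonapi exposes both functions.
api = ctypes.pythonapi
arg_types = [ctypes.py_object, ctypes.py_object,
             ctypes.c_ssize_t, ctypes.c_ssize_t, ctypes.c_int]

tailmatch = api.PyUnicode_Tailmatch
tailmatch.argtypes = arg_types
# restype follows the signature quoted above; results are only -1, 0 or 1.
tailmatch.restype = ctypes.c_int

find = api.PyUnicode_Find
find.argtypes = arg_types
find.restype = ctypes.c_ssize_t

text = "hello world"
end = len(text)

# Tailmatch: direction -1 is a prefix match, direction 1 a suffix match.
is_prefix = tailmatch(text, "hello", 0, end, -1)
is_suffix = tailmatch(text, "world", 0, end, 1)
not_suffix = tailmatch(text, "hello", 0, end, 1)

# Find: direction 1 searches forward, direction -1 backward.
first_o = find(text, "o", 0, end, 1)
last_o = find(text, "o", 0, end, -1)
missing = find(text, "z", 0, end, 1)

print(is_prefix, is_suffix, not_suffix, first_o, last_o, missing)
```

At the Python level these calls correspond to ``str.startswith``, ``str.endswith``, ``str.find`` and ``str.rfind`` over the same slice.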