@@ -14,6 +14,10 @@ The first three of these functions described, :cfunc:`PyArg_ParseTuple`,
1414strings * which are used to tell the function about the expected arguments. The
1515format strings use the same syntax for each of these functions.
1616
17+ -----------------
18+ Parsing arguments
19+ -----------------
20+
1721A format string consists of zero or more "format units." A format unit
1822describes one Python object; it is usually a single character or a parenthesized
1923sequence of format units. With a few exceptions, a format unit that is not a
@@ -23,75 +27,108 @@ unit; the entry in (round) parentheses is the Python object type that matches
2327the format unit; and the entry in [square] brackets is the type of the C
2428variable(s) whose address should be passed.
2529
26- ``s `` (string or Unicode object) [const char \* ]
27- Convert a Python string or Unicode object to a C pointer to a character string.
28- You must not provide storage for the string itself; a pointer to an existing
29- string is stored into the character pointer variable whose address you pass.
30- The C string is NUL-terminated. The Python string must not contain embedded NUL
31- bytes; if it does, a :exc: `TypeError ` exception is raised. Unicode objects are
32- converted to C strings using the default encoding. If this conversion fails, a
33- :exc: `UnicodeError ` is raised.
30+ Strings and buffers
31+ -------------------
3432
35- Starting with Python 2.5 the type of the length argument can be
36- controlled by defining the macro :cmacro: `PY_SSIZE_T_CLEAN ` before
37- including :file: `Python.h `. If the macro is defined, length is a
38- :ctype: `Py_ssize_t ` rather than an int.
39-
40- ``s* `` (string, Unicode, or any buffer compatible object) [Py_buffer]
41- This is similar to ``s ``, but the code fills a :ctype: `Py_buffer ` structure
42- provided by the caller. In this case the Python string may contain embedded
43- null bytes. Unicode objects pass back a pointer to the default encoded
44- string version of the object if such a conversion is possible. The
45- underlying buffer is locked, so that the caller can subsequently use the
46- buffer even inside a ``Py_BEGIN_ALLOW_THREADS `` block. **The caller is
47- responsible** for calling ``PyBuffer_Release`` with the structure after it
48- has processed the data.
33+ These formats do not expect you to provide raw storage for the returned string
34+ or bytes. Also, you won't have to release any memory yourself, except with
35+ the ``es ``, ``es# ``, ``et `` and ``et# `` formats.
4936
50- ``s# `` (string, Unicode or any read buffer compatible object) [const char \* , int or :ctype: `Py_ssize_t `]
51- This variant on ``s `` stores into two C variables, the first one a pointer to
52- a character string, the second one its length. In this case the Python
53- string may contain embedded null bytes. Unicode objects pass back a pointer
54- to the default encoded string version of the object if such a conversion is
55- possible. All other read-buffer compatible objects pass back a reference to
56- the raw internal data representation. Since this format doesn't allow
57- writable buffer compatible objects like byte arrays, ``s* `` is to be
58- preferred.
59-
60- The type of the length argument (int or :ctype: `Py_ssize_t `) is controlled by
37+ However, when a :ctype: `Py_buffer ` structure gets filled, the underlying
38+ buffer is locked so that the caller can subsequently use the buffer even
39+ inside a ``Py_BEGIN_ALLOW_THREADS `` block without the risk of mutable data
40+ being resized or destroyed. As a result, **you have to call**
41+ :cfunc: `PyBuffer_Release ` after you have finished processing the data (or
42+ in any early abort case).
43+
44+ Unless otherwise stated, buffers are not NUL-terminated.
45+
46+ .. note ::
47+ For all ``# `` variants of formats (``s# ``, ``y# ``, etc.), the type of
48+ the length argument (int or :ctype: `Py_ssize_t `) is controlled by
6149 defining the macro :cmacro: `PY_SSIZE_T_CLEAN ` before including
62- :file: `Python.h `. If the macro was defined, length is a :ctype: `Py_ssize_t `
63- rather than an int. This behavior will change in a future Python version to
64- only support :ctype: `Py_ssize_t ` and drop int support. It is best to always
65- define :cmacro: `PY_SSIZE_T_CLEAN `.
50+ :file: `Python.h `. If the macro was defined, length is a
51+ :ctype: `Py_ssize_t ` rather than an int. This behavior will change
52+ in a future Python version to only support :ctype: `Py_ssize_t ` and
53+ drop int support. It is best to always define :cmacro: `PY_SSIZE_T_CLEAN `.
54+
55+
56+ ``s `` (Unicode object) [const char \* ]
57+ Convert a Unicode object to a C pointer to a character string.
58+ A pointer to an existing string is stored in the character pointer
59+ variable whose address you pass. The C string is NUL-terminated.
60+ The Python string must not contain embedded NUL bytes; if it does,
61+ a :exc: `TypeError ` exception is raised. Unicode objects are converted
62+ to C strings using the default encoding. If this conversion fails, a
63+ :exc: `UnicodeError ` is raised.
6664
67- ``y `` (bytes object) [const char \* ]
68- This variant on ``s `` converts a Python bytes or bytearray object to a C
69- pointer to a character string. The bytes object must not contain embedded
70- NUL bytes; if it does, a :exc: `TypeError ` exception is raised.
65+ .. note ::
66+ This format does not accept bytes-like objects. If you want to accept
67+ filesystem paths and convert them to C character strings, it is
68+ preferable to use the ``O&`` format with :cfunc:`PyUnicode_FSConverter`
69+ as *converter *.
7170
72- ``y* `` (bytes object) [Py_buffer \* ]
73- This is to ``s* `` as ``y `` is to ``s ``.
71+ ``s* `` (Unicode object or any buffer compatible object) [Py_buffer]
72+ This format accepts Unicode objects as well as objects supporting the
73+ buffer protocol (such as :class: `bytes ` or :class: `bytearray ` objects).
74+ It fills a :ctype: `Py_buffer ` structure provided by the caller.
75+ Unicode objects are converted to C strings using the default encoding.
76+ In this case the resulting C string may contain embedded NUL bytes.
7477
75- ``y# `` (bytes object) [const char \* , int]
76- This variant on ``s# `` stores into two C variables, the first one a pointer
77- to a character string, the second one its length. This only accepts bytes
78- objects, no byte arrays.
78+ ``s# `` (string, Unicode or any read buffer compatible object) [const char \* , int or :ctype: `Py_ssize_t `]
79+ Like ``s* ``, except that it doesn't accept mutable buffer-like objects
80+ such as :class: `bytearray `. The result is stored into two C variables,
81+ the first one a pointer to a C string, the second one its length.
82+ The string may contain embedded null bytes.
7983
80- ``z `` (string or ``None ``) [const char \* ]
84+ ``z `` (Unicode object or ``None ``) [const char \* ]
8185 Like ``s ``, but the Python object may also be ``None ``, in which case the C
8286 pointer is set to *NULL *.
8387
84- ``z* `` (string or ``None `` or any buffer compatible object) [Py_buffer]
85- This is to ``s* `` as ``z `` is to ``s ``.
88+ ``z* `` (Unicode object or ``None `` or any buffer compatible object) [Py_buffer]
89+ Like ``s* ``, but the Python object may also be ``None ``, in which case the
90+ ``buf `` member of the :ctype: `Py_buffer ` structure is set to *NULL *.
8691
87- ``z# `` (string or ``None `` or any read buffer compatible object) [const char \* , int]
88- This is to ``s# `` as ``z `` is to ``s ``.
92+ ``z# `` (Unicode object or ``None `` or any read buffer compatible object) [const char \* , int]
93+ Like ``s# ``, but the Python object may also be ``None ``, in which case the C
94+ pointer is set to *NULL *.
95+
96+ ``y `` (bytes object) [const char \* ]
97+ This format converts a bytes-like object to a C pointer to a character
98+ string; it does not accept Unicode objects. The bytes buffer must not
99+ contain embedded NUL bytes; if it does, a :exc: `TypeError `
100+ exception is raised.
101+
102+ ``y* `` (any buffer compatible object) [Py_buffer \* ]
103+ This variant on ``s* `` doesn't accept Unicode objects, only objects
104+ supporting the buffer protocol. **This is the recommended way to accept
105+ binary data.**
106+
107+ ``y# `` (bytes object) [const char \* , int]
108+ This variant on ``s# `` doesn't accept Unicode objects, only bytes-like
109+ objects.
110+
111+ ``S `` (bytes object) [PyBytesObject \* ]
112+ Requires that the Python object is a :class: `bytes ` object, without
113+ attempting any conversion. Raises :exc: `TypeError ` if the object is not
114+ a bytes object. The C variable may also be declared as :ctype: `PyObject\* `.
115+
116+ ``Y `` (bytearray object) [PyByteArrayObject \* ]
117+ Requires that the Python object is a :class: `bytearray ` object, without
118+ attempting any conversion. Raises :exc: `TypeError ` if the object is not
119+ a bytearray object. The C variable may also be declared as :ctype: `PyObject\* `.
89120
90121``u `` (Unicode object) [Py_UNICODE \* ]
91122 Convert a Python Unicode object to a C pointer to a NUL-terminated buffer of
92- 16-bit Unicode (UTF-16) data. As with ``s ``, there is no need to provide
93- storage for the Unicode data buffer; a pointer to the existing Unicode data is
94- stored into the :ctype: `Py_UNICODE ` pointer variable whose address you pass.
123+ Unicode characters. You must pass the address of a :ctype: `Py_UNICODE `
124+ pointer variable, which will be filled with the pointer to an existing
125+ Unicode buffer. Please note that the width of a :ctype: `Py_UNICODE `
126+ character depends on compilation options (it is either 16 or 32 bits).
127+
128+ .. note::
129+ Since ``u`` doesn't give you back the length of the string, and it
130+ may contain embedded NUL characters, it is recommended to use ``u#``
131+ or ``U`` instead.
95132
96133``u# `` (Unicode object) [Py_UNICODE \* , int]
97134 This variant on ``u `` stores into two C variables, the first one a pointer to a
@@ -100,11 +137,40 @@ variable(s) whose address should be passed.
100137 array.
101138
102139``Z `` (Unicode or ``None ``) [Py_UNICODE \* ]
103- Like ``s ``, but the Python object may also be ``None ``, in which case the C
104- pointer is set to *NULL *.
140+ Like ``u ``, but the Python object may also be ``None ``, in which case the
141+ :ctype:`Py_UNICODE` pointer is set to *NULL*.
105142
106143``Z# `` (Unicode or ``None ``) [Py_UNICODE \* , int]
107- This is to ``u# `` as ``Z `` is to ``u ``.
144+ Like ``u# ``, but the Python object may also be ``None ``, in which case the
145+ :ctype: `Py_UNICODE ` pointer is set to *NULL *.
146+
147+ ``U `` (Unicode object) [PyUnicodeObject \* ]
148+ Requires that the Python object is a Unicode object, without attempting
149+ any conversion. Raises :exc: `TypeError ` if the object is not a Unicode
150+ object. The C variable may also be declared as :ctype: `PyObject\* `.
151+
152+ ``t# `` (read-only character buffer) [char \* , int]
153+ Like ``s# ``, but accepts any object which implements the read-only buffer
154+ interface. The :ctype: `char\* ` variable is set to point to the first byte of
155+ the buffer, and the :ctype: `int ` is set to the length of the buffer. Only
156+ single-segment buffer objects are accepted; :exc: `TypeError ` is raised for all
157+ others.
158+
159+ ``w `` (read-write character buffer) [char \* ]
160+ Similar to ``s ``, but accepts any object which implements the read-write buffer
161+ interface. The caller must determine the length of the buffer by other means,
162+ or use ``w# `` instead. Only single-segment buffer objects are accepted;
163+ :exc: `TypeError ` is raised for all others.
164+
165+ ``w* `` (read-write byte-oriented buffer) [Py_buffer]
166+ This is to ``w `` what ``s* `` is to ``s ``.
167+
168+ ``w# `` (read-write character buffer) [char \* , int]
169+ Like ``s# ``, but accepts any object which implements the read-write buffer
170+ interface. The :ctype: `char \* ` variable is set to point to the first byte
171+ of the buffer, and the :ctype: `int ` is set to the length of the buffer.
172+ Only single-segment buffer objects are accepted; :exc: `TypeError ` is raised
173+ for all others.
108174
109175``es `` (string, Unicode object or character buffer compatible object) [const char \* encoding, char \*\* buffer]
110176 This variant on ``s `` is used for encoding Unicode and objects convertible to
@@ -165,6 +231,9 @@ variable(s) whose address should be passed.
165231 them. Instead, the implementation assumes that the string object uses the
166232 encoding passed in as parameter.
167233
234+ Numbers
235+ -------
236+
168237``b `` (integer) [unsigned char]
169238 Convert a nonnegative Python integer to an unsigned tiny int, stored in a C
170239 :ctype: `unsigned char `.
@@ -207,13 +276,13 @@ variable(s) whose address should be passed.
207276``n `` (integer) [Py_ssize_t]
208277 Convert a Python integer to a C :ctype: `Py_ssize_t `.
209278
210- ``c `` (string of length 1) [char]
211- Convert a Python character , represented as a byte string of length 1, to a C
212- :ctype: `char `.
279+ ``c `` (bytes object of length 1) [char]
280+ Convert a Python byte, represented as a :class:`bytes` object of length 1,
281+ to a C :ctype: `char `.
213282
214- ``C `` (string of length 1) [int]
215- Convert a Python character, represented as a unicode string of length 1, to a
216- C :ctype: `int `.
283+ ``C `` (Unicode object of length 1) [int]
284+ Convert a Python character, represented as a :class:`str` object of
285+ length 1, to a C :ctype: `int `.
217286
218287``f `` (float) [float]
219288 Convert a Python floating point number to a C :ctype: `float `.
@@ -224,6 +293,9 @@ variable(s) whose address should be passed.
224293``D `` (complex) [Py_complex]
225294 Convert a Python complex number to a C :ctype: `Py_complex ` structure.
226295
296+ Other objects
297+ -------------
298+
227299``O `` (object) [PyObject \* ]
228300 Store a Python object (without any conversion) in a C object pointer. The C
229301 program thus receives the actual object that was passed. The object's reference
@@ -258,39 +330,6 @@ variable(s) whose address should be passed.
258330 .. versionchanged :: 3.1
259331 Py_CLEANUP_SUPPORTED was added.
260332
261- ``S `` (string) [PyStringObject \* ]
262- Like ``O `` but requires that the Python object is a string object. Raises
263- :exc: `TypeError ` if the object is not a string object. The C variable may also
264- be declared as :ctype: `PyObject\* `.
265-
266- ``U `` (Unicode string) [PyUnicodeObject \* ]
267- Like ``O `` but requires that the Python object is a Unicode object. Raises
268- :exc: `TypeError ` if the object is not a Unicode object. The C variable may also
269- be declared as :ctype: `PyObject\* `.
270-
271- ``t# `` (read-only character buffer) [char \* , int]
272- Like ``s# ``, but accepts any object which implements the read-only buffer
273- interface. The :ctype: `char\* ` variable is set to point to the first byte of
274- the buffer, and the :ctype: `int ` is set to the length of the buffer. Only
275- single-segment buffer objects are accepted; :exc: `TypeError ` is raised for all
276- others.
277-
278- ``w `` (read-write character buffer) [char \* ]
279- Similar to ``s ``, but accepts any object which implements the read-write buffer
280- interface. The caller must determine the length of the buffer by other means,
281- or use ``w# `` instead. Only single-segment buffer objects are accepted;
282- :exc: `TypeError ` is raised for all others.
283-
284- ``w* `` (read-write byte-oriented buffer) [Py_buffer]
285- This is to ``w `` what ``s* `` is to ``s ``.
286-
287- ``w# `` (read-write character buffer) [char \* , int]
288- Like ``s# ``, but accepts any object which implements the read-write buffer
289- interface. The :ctype: `char \* ` variable is set to point to the first byte
290- of the buffer, and the :ctype: `int ` is set to the length of the buffer.
291- Only single-segment buffer objects are accepted; :exc: `TypeError ` is raised
292- for all others.
293-
294333``(items) `` (tuple) [*matching-items *]
295334 The object must be a Python sequence whose length is the number of format units
296335 in *items *. The C arguments must correspond to the individual format units in
@@ -339,6 +378,8 @@ false and raise an appropriate exception. When the
339378of the format units, the variables at the addresses corresponding to that
340379and the following format units are left untouched.
341380
381+ API Functions
382+ -------------
342383
343384.. cfunction :: int PyArg_ParseTuple(PyObject *args, const char *format, ...)
344385
@@ -415,6 +456,10 @@ and the following format units are left untouched.
415456 PyArg_ParseTuple(args, "O|O:ref", &object, &callback)
416457
417458
459+ ---------------
460+ Building values
461+ ---------------
462+
418463.. cfunction :: PyObject* Py_BuildValue(const char *format, ...)
419464
420465 Create a new value based on a format string similar to those accepted by the