Commit 363b79e

Merged revisions 80714 via svnmerge from
svn+ssh://[email protected]/python/branches/py3k

........
r80714 | antoine.pitrou | 2010-05-03 17:57:23 +0200 (Mon, 03 May 2010) | 3 lines

Issue #8593: Fix, reorder and improve the documentation for argument parsing
........
1 parent e7bb781 commit 363b79e

1 file changed

Lines changed: 142 additions & 97 deletions

Doc/c-api/arg.rst

@@ -14,6 +14,10 @@ The first three of these functions described, :cfunc:`PyArg_ParseTuple`,
1414
strings* which are used to tell the function about the expected arguments. The
1515
format strings use the same syntax for each of these functions.
1616

17+
-----------------
18+
Parsing arguments
19+
-----------------
20+
1721
A format string consists of zero or more "format units." A format unit
1822
describes one Python object; it is usually a single character or a parenthesized
1923
sequence of format units. With a few exceptions, a format unit that is not a
@@ -23,75 +27,108 @@ unit; the entry in (round) parentheses is the Python object type that matches
2327
the format unit; and the entry in [square] brackets is the type of the C
2428
variable(s) whose address should be passed.
2529

26-
``s`` (string or Unicode object) [const char \*]
27-
Convert a Python string or Unicode object to a C pointer to a character string.
28-
You must not provide storage for the string itself; a pointer to an existing
29-
string is stored into the character pointer variable whose address you pass.
30-
The C string is NUL-terminated. The Python string must not contain embedded NUL
31-
bytes; if it does, a :exc:`TypeError` exception is raised. Unicode objects are
32-
converted to C strings using the default encoding. If this conversion fails, a
33-
:exc:`UnicodeError` is raised.
30+
Strings and buffers
31+
-------------------
3432

35-
Starting with Python 2.5 the type of the length argument can be
36-
controlled by defining the macro :cmacro:`PY_SSIZE_T_CLEAN` before
37-
including :file:`Python.h`. If the macro is defined, length is a
38-
:ctype:`Py_ssize_t` rather than an int.
39-
40-
``s*`` (string, Unicode, or any buffer compatible object) [Py_buffer]
41-
This is similar to ``s``, but the code fills a :ctype:`Py_buffer` structure
42-
provided by the caller. In this case the Python string may contain embedded
43-
null bytes. Unicode objects pass back a pointer to the default encoded
44-
string version of the object if such a conversion is possible. The
45-
underlying buffer is locked, so that the caller can subsequently use the
46-
buffer even inside a ``Py_BEGIN_ALLOW_THREADS`` block. **The caller is
47-
responsible** for calling ``PyBuffer_Release`` with the structure after it
48-
has processed the data.
33+
These formats do not expect you to provide raw storage for the returned string
34+
or bytes. Also, you won't have to release any memory yourself, except with
35+
the ``es``, ``es#``, ``et`` and ``et#`` formats.
4936

50-
``s#`` (string, Unicode or any read buffer compatible object) [const char \*, int or :ctype:`Py_ssize_t`]
51-
This variant on ``s`` stores into two C variables, the first one a pointer to
52-
a character string, the second one its length. In this case the Python
53-
string may contain embedded null bytes. Unicode objects pass back a pointer
54-
to the default encoded string version of the object if such a conversion is
55-
possible. All other read-buffer compatible objects pass back a reference to
56-
the raw internal data representation. Since this format doesn't allow
57-
writable buffer compatible objects like byte arrays, ``s*`` is to be
58-
preferred.
59-
60-
The type of the length argument (int or :ctype:`Py_ssize_t`) is controlled by
37+
However, when a :ctype:`Py_buffer` structure gets filled, the underlying
38+
buffer is locked so that the caller can subsequently use the buffer even
39+
inside a ``Py_BEGIN_ALLOW_THREADS`` block without the risk of mutable data
40+
being resized or destroyed. As a result, **you have to call**
41+
:cfunc:`PyBuffer_Release` after you have finished processing the data (or
42+
in any early abort case).
43+
44+
Unless otherwise stated, buffers are not NUL-terminated.
45+
46+
.. note::
47+
For all ``#`` variants of formats (``s#``, ``y#``, etc.), the type of
48+
the length argument (int or :ctype:`Py_ssize_t`) is controlled by
6149
defining the macro :cmacro:`PY_SSIZE_T_CLEAN` before including
62-
:file:`Python.h`. If the macro was defined, length is a :ctype:`Py_ssize_t`
63-
rather than an int. This behavior will change in a future Python version to
64-
only support :ctype:`Py_ssize_t` and drop int support. It is best to always
65-
define :cmacro:`PY_SSIZE_T_CLEAN`.
50+
:file:`Python.h`. If the macro was defined, length is a
51+
:ctype:`Py_ssize_t` rather than an int. This behavior will change
52+
in a future Python version to only support :ctype:`Py_ssize_t` and
53+
drop int support. It is best to always define :cmacro:`PY_SSIZE_T_CLEAN`.
54+
55+
56+
``s`` (Unicode object) [const char \*]
57+
Convert a Unicode object to a C pointer to a character string.
58+
A pointer to an existing string is stored in the character pointer
59+
variable whose address you pass. The C string is NUL-terminated.
60+
The Python string must not contain embedded NUL bytes; if it does,
61+
a :exc:`TypeError` exception is raised. Unicode objects are converted
62+
to C strings using the default encoding. If this conversion fails, a
63+
:exc:`UnicodeError` is raised.
6664

67-
``y`` (bytes object) [const char \*]
68-
This variant on ``s`` converts a Python bytes or bytearray object to a C
69-
pointer to a character string. The bytes object must not contain embedded
70-
NUL bytes; if it does, a :exc:`TypeError` exception is raised.
65+
.. note::
66+
This format does not accept bytes-like objects. If you want to accept
67+
filesystem paths and convert them to C character strings, it is
68+
preferable to use the ``O&`` format with :cfunc:`PyUnicode_FSConverter`
69+
as *converter*.
7170

72-
``y*`` (bytes object) [Py_buffer \*]
73-
This is to ``s*`` as ``y`` is to ``s``.
71+
``s*`` (Unicode object or any buffer compatible object) [Py_buffer]
72+
This format accepts Unicode objects as well as objects supporting the
73+
buffer protocol (such as :class:`bytes` or :class:`bytearray` objects).
74+
It fills a :ctype:`Py_buffer` structure provided by the caller.
75+
Unicode objects are converted to C strings using the default encoding.
76+
In this case the resulting C string may contain embedded NUL bytes.
7477

75-
``y#`` (bytes object) [const char \*, int]
76-
This variant on ``s#`` stores into two C variables, the first one a pointer
77-
to a character string, the second one its length. This only accepts bytes
78-
objects, no byte arrays.
78+
``s#`` (string, Unicode or any read buffer compatible object) [const char \*, int or :ctype:`Py_ssize_t`]
79+
Like ``s*``, except that it doesn't accept mutable buffer-like objects
80+
such as :class:`bytearray`. The result is stored into two C variables,
81+
the first one a pointer to a C string, the second one its length.
82+
The string may contain embedded null bytes.
7983

80-
``z`` (string or ``None``) [const char \*]
84+
``z`` (Unicode object or ``None``) [const char \*]
8185
Like ``s``, but the Python object may also be ``None``, in which case the C
8286
pointer is set to *NULL*.
8387

84-
``z*`` (string or ``None`` or any buffer compatible object) [Py_buffer]
85-
This is to ``s*`` as ``z`` is to ``s``.
88+
``z*`` (Unicode object or ``None`` or any buffer compatible object) [Py_buffer]
89+
Like ``s*``, but the Python object may also be ``None``, in which case the
90+
``buf`` member of the :ctype:`Py_buffer` structure is set to *NULL*.
8691

87-
``z#`` (string or ``None`` or any read buffer compatible object) [const char \*, int]
88-
This is to ``s#`` as ``z`` is to ``s``.
92+
``z#`` (Unicode object or ``None`` or any read buffer compatible object) [const char \*, int]
93+
Like ``s#``, but the Python object may also be ``None``, in which case the C
94+
pointer is set to *NULL*.
95+
96+
``y`` (bytes object) [const char \*]
97+
This format converts a bytes-like object to a C pointer to a character
98+
string; it does not accept Unicode objects. The bytes buffer must not
99+
contain embedded NUL bytes; if it does, a :exc:`TypeError`
100+
exception is raised.
101+
102+
``y*`` (any buffer compatible object) [Py_buffer \*]
103+
This variant on ``s*`` doesn't accept Unicode objects, only objects
104+
supporting the buffer protocol. **This is the recommended way to accept
105+
binary data.**
106+
107+
``y#`` (bytes object) [const char \*, int]
108+
This variant on ``s#`` doesn't accept Unicode objects, only bytes-like
109+
objects.
110+
111+
``S`` (bytes object) [PyBytesObject \*]
112+
Requires that the Python object is a :class:`bytes` object, without
113+
attempting any conversion. Raises :exc:`TypeError` if the object is not
114+
a bytes object. The C variable may also be declared as :ctype:`PyObject\*`.
115+
116+
``Y`` (bytearray object) [PyByteArrayObject \*]
117+
Requires that the Python object is a :class:`bytearray` object, without
118+
attempting any conversion. Raises :exc:`TypeError` if the object is not
119+
a bytearray object. The C variable may also be declared as :ctype:`PyObject\*`.
89120

90121
``u`` (Unicode object) [Py_UNICODE \*]
91122
Convert a Python Unicode object to a C pointer to a NUL-terminated buffer of
92-
16-bit Unicode (UTF-16) data. As with ``s``, there is no need to provide
93-
storage for the Unicode data buffer; a pointer to the existing Unicode data is
94-
stored into the :ctype:`Py_UNICODE` pointer variable whose address you pass.
123+
Unicode characters. You must pass the address of a :ctype:`Py_UNICODE`
124+
pointer variable, which will be filled with the pointer to an existing
125+
Unicode buffer. Please note that the width of a :ctype:`Py_UNICODE`
126+
character depends on compilation options (it is either 16 or 32 bits).
127+
128+
.. note::
129+
Since ``u`` doesn't give you back the length of the string, and it
130+
may contain embedded NUL characters, it is recommended to use ``u#``
131+
or ``U`` instead.
95132

96133
``u#`` (Unicode object) [Py_UNICODE \*, int]
97134
This variant on ``u`` stores into two C variables, the first one a pointer to a
@@ -100,11 +137,40 @@ variable(s) whose address should be passed.
100137
array.
101138

102139
``Z`` (Unicode or ``None``) [Py_UNICODE \*]
103-
Like ``s``, but the Python object may also be ``None``, in which case the C
104-
pointer is set to *NULL*.
140+
Like ``u``, but the Python object may also be ``None``, in which case the
141+
:ctype:`Py_UNICODE` pointer is set to *NULL*.
105142

106143
``Z#`` (Unicode or ``None``) [Py_UNICODE \*, int]
107-
This is to ``u#`` as ``Z`` is to ``u``.
144+
Like ``u#``, but the Python object may also be ``None``, in which case the
145+
:ctype:`Py_UNICODE` pointer is set to *NULL*.
146+
147+
``U`` (Unicode object) [PyUnicodeObject \*]
148+
Requires that the Python object is a Unicode object, without attempting
149+
any conversion. Raises :exc:`TypeError` if the object is not a Unicode
150+
object. The C variable may also be declared as :ctype:`PyObject\*`.
151+
152+
``t#`` (read-only character buffer) [char \*, int]
153+
Like ``s#``, but accepts any object which implements the read-only buffer
154+
interface. The :ctype:`char\*` variable is set to point to the first byte of
155+
the buffer, and the :ctype:`int` is set to the length of the buffer. Only
156+
single-segment buffer objects are accepted; :exc:`TypeError` is raised for all
157+
others.
158+
159+
``w`` (read-write character buffer) [char \*]
160+
Similar to ``s``, but accepts any object which implements the read-write buffer
161+
interface. The caller must determine the length of the buffer by other means,
162+
or use ``w#`` instead. Only single-segment buffer objects are accepted;
163+
:exc:`TypeError` is raised for all others.
164+
165+
``w*`` (read-write byte-oriented buffer) [Py_buffer]
166+
This is to ``w`` what ``s*`` is to ``s``.
167+
168+
``w#`` (read-write character buffer) [char \*, int]
169+
Like ``s#``, but accepts any object which implements the read-write buffer
170+
interface. The :ctype:`char \*` variable is set to point to the first byte
171+
of the buffer, and the :ctype:`int` is set to the length of the buffer.
172+
Only single-segment buffer objects are accepted; :exc:`TypeError` is raised
173+
for all others.
108174

109175
``es`` (string, Unicode object or character buffer compatible object) [const char \*encoding, char \*\*buffer]
110176
This variant on ``s`` is used for encoding Unicode and objects convertible to
@@ -165,6 +231,9 @@ variable(s) whose address should be passed.
165231
them. Instead, the implementation assumes that the string object uses the
166232
encoding passed in as parameter.
167233

234+
Numbers
235+
-------
236+
168237
``b`` (integer) [unsigned char]
169238
Convert a nonnegative Python integer to an unsigned tiny int, stored in a C
170239
:ctype:`unsigned char`.
@@ -207,13 +276,13 @@ variable(s) whose address should be passed.
207276
``n`` (integer) [Py_ssize_t]
208277
Convert a Python integer to a C :ctype:`Py_ssize_t`.
209278

210-
``c`` (string of length 1) [char]
211-
Convert a Python character, represented as a byte string of length 1, to a C
212-
:ctype:`char`.
279+
``c`` (bytes object of length 1) [char]
280+
Convert a Python byte, represented as a :class:`bytes` object of length 1,
281+
to a C :ctype:`char`.
213282

214-
``C`` (string of length 1) [int]
215-
Convert a Python character, represented as a unicode string of length 1, to a
216-
C :ctype:`int`.
283+
``C`` (Unicode object of length 1) [int]
284+
Convert a Python character, represented as a :class:`str` object of
285+
length 1, to a C :ctype:`int`.
217286

218287
``f`` (float) [float]
219288
Convert a Python floating point number to a C :ctype:`float`.
@@ -224,6 +293,9 @@ variable(s) whose address should be passed.
224293
``D`` (complex) [Py_complex]
225294
Convert a Python complex number to a C :ctype:`Py_complex` structure.
226295

296+
Other objects
297+
-------------
298+
227299
``O`` (object) [PyObject \*]
228300
Store a Python object (without any conversion) in a C object pointer. The C
229301
program thus receives the actual object that was passed. The object's reference
@@ -258,39 +330,6 @@ variable(s) whose address should be passed.
258330
.. versionchanged:: 3.1
259331
Py_CLEANUP_SUPPORTED was added.
260332

261-
``S`` (string) [PyStringObject \*]
262-
Like ``O`` but requires that the Python object is a string object. Raises
263-
:exc:`TypeError` if the object is not a string object. The C variable may also
264-
be declared as :ctype:`PyObject\*`.
265-
266-
``U`` (Unicode string) [PyUnicodeObject \*]
267-
Like ``O`` but requires that the Python object is a Unicode object. Raises
268-
:exc:`TypeError` if the object is not a Unicode object. The C variable may also
269-
be declared as :ctype:`PyObject\*`.
270-
271-
``t#`` (read-only character buffer) [char \*, int]
272-
Like ``s#``, but accepts any object which implements the read-only buffer
273-
interface. The :ctype:`char\*` variable is set to point to the first byte of
274-
the buffer, and the :ctype:`int` is set to the length of the buffer. Only
275-
single-segment buffer objects are accepted; :exc:`TypeError` is raised for all
276-
others.
277-
278-
``w`` (read-write character buffer) [char \*]
279-
Similar to ``s``, but accepts any object which implements the read-write buffer
280-
interface. The caller must determine the length of the buffer by other means,
281-
or use ``w#`` instead. Only single-segment buffer objects are accepted;
282-
:exc:`TypeError` is raised for all others.
283-
284-
``w*`` (read-write byte-oriented buffer) [Py_buffer]
285-
This is to ``w`` what ``s*`` is to ``s``.
286-
287-
``w#`` (read-write character buffer) [char \*, int]
288-
Like ``s#``, but accepts any object which implements the read-write buffer
289-
interface. The :ctype:`char \*` variable is set to point to the first byte
290-
of the buffer, and the :ctype:`int` is set to the length of the buffer.
291-
Only single-segment buffer objects are accepted; :exc:`TypeError` is raised
292-
for all others.
293-
294333
``(items)`` (tuple) [*matching-items*]
295334
The object must be a Python sequence whose length is the number of format units
296335
in *items*. The C arguments must correspond to the individual format units in
@@ -339,6 +378,8 @@ false and raise an appropriate exception. When the
339378
of the format units, the variables at the addresses corresponding to that
340379
and the following format units are left untouched.
341380

381+
API Functions
382+
-------------
342383

343384
.. cfunction:: int PyArg_ParseTuple(PyObject *args, const char *format, ...)
344385

@@ -415,6 +456,10 @@ and the following format units are left untouched.
415456
PyArg_ParseTuple(args, "O|O:ref", &object, &callback)
416457

417458

459+
---------------
460+
Building values
461+
---------------
462+
418463
.. cfunction:: PyObject* Py_BuildValue(const char *format, ...)
419464

420465
Create a new value based on a format string similar to those accepted by the
