Commit 6390138

Stackless issue python#290: bpo-36974, PEP 590 Vectorcall-protocol (documentation)
Update the API Documentation. There are changes to the Stackless-protocol and a few new macros (STACKLESS_VECTORCALLxxxx).
1 parent 0475a65 commit 6390138

3 files changed

+255
-60
lines changed

Doc/c-api/stackless.rst

Lines changed: 124 additions & 33 deletions
@@ -3,6 +3,28 @@
33
|SLP| C-API
44
===========
55

6+
|SLP| uses two fundamentally different methods to switch control
7+
flow from one tasklet to another. One method, called *hard-switching*,
8+
manipulates the C-stack with hardware-dependent assembly code. This is always
9+
possible, but somewhat costly. The other method, called *soft-switching*, is
10+
only possible under special conditions, but is cheap. Moreover, soft-switching
11+
allows the storage (pickling) and recovery (unpickling) of active tasklets.
12+
13+
Soft-switching avoids recursive calls to the |PY| interpreter, such as those
14+
that occur when calling a |PY| function, by maintaining a chained list of
15+
tasks that are processed sequentially. This list consists of
16+
:c:type:`PyFrameObject` and :c:type:`PyCFrameObject`
17+
objects chained by their :c:member:`PyFrameObject.f_back` pointer. In the C-function
18+
:c:func:`slp_dispatch` (and :c:func:`slp_dispatch_top`) the list is processed
19+
in a loop. In order to proceed
20+
to the processing of the next (C)frame, all C-functions involved in the
21+
processing of the current (C)frame must return. A special return value
22+
Unwind-Token is used here. If a C-function returns the value :c:data:`Py_UnwindToken`,
23+
its caller must add any unfinished tasks to the (C)frame list and return
24+
:c:data:`Py_UnwindToken` itself. It follows that *soft-switching* is only possible if
25+
it is supported by all functions just called. If this is not the case,
26+
*hard-switching* remains as a fallback.
27+
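The dispatch idea described above can be sketched in plain C. This is only a toy model under stated assumptions: the ``frame`` struct, ``UNWIND_TOKEN``, ``unwind_value`` and all function names are hypothetical stand-ins, not the real :c:type:`PyFrameObject`, :c:data:`Py_UnwindToken` or :c:func:`slp_dispatch`. It shows a task that, instead of calling its sub-task recursively, links the sub-task into the chain and returns the token so that the loop runs it next.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Stackless types. */
#define UNWIND_TOKEN INT_MIN            /* plays the role of Py_UnwindToken */

typedef struct frame {
    struct frame *f_back;               /* next frame in the chain */
    int (*execute)(struct frame *f, int retval);
    int payload;                        /* value used by add_frame */
    struct frame *child;                /* work scheduled by spawn_frame */
} frame;

static frame *current = NULL;           /* frame to execute next */
static int unwind_value = 0;            /* value carried along with the token */

/* Finishes immediately, so it returns a plain value. */
static int add_frame(frame *f, int retval)
{
    return retval + f->payload;
}

/* Needs more work done. Instead of calling the child recursively, it
 * links the child into the chain and unwinds to the dispatcher. */
static int spawn_frame(frame *f, int retval)
{
    f->child->f_back = f->f_back;
    current = f->child;
    unwind_value = retval;
    return UNWIND_TOKEN;
}

/* Processes the chain in a loop; no frame ever calls another frame. */
static int dispatch(frame *top, int retval)
{
    current = top;
    while (current != NULL) {
        frame *f = current;
        assert(f->execute != NULL);
        int result = f->execute(f, retval);
        if (result == UNWIND_TOKEN) {
            retval = unwind_value;      /* `current` was already replaced */
        } else {
            retval = result;
            current = f->f_back;        /* frame finished, continue with f_back */
        }
    }
    return retval;
}

/* Builds spawn -> (add 5) -> (add 10) and runs the chain starting from 1. */
int run_demo(void)
{
    frame add10 = { NULL, add_frame, 10, NULL };
    frame add5 = { NULL, add_frame, 5, NULL };
    frame spawn = { &add10, spawn_frame, 0, &add5 };
    return dispatch(&spawn, 1);
}
```

In this model the C stack never grows beyond ``dispatch`` plus one ``execute`` call, which is the property that makes the switch cheap.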
628
.. note::
729

830
Some switching functions have a variant with the
@@ -383,21 +405,21 @@ Soft-switchable extension functions
383405
The API for soft-switchable extension functions has been added on a
384406
provisional basis (see :pep:`411` for details.)
385407
386-
A soft switchable extension function or method is a function or method defined
408+
A soft-switchable extension function or method is a function or method defined
387409
by an extension module written in C. In contrast to a normal C-function, you
388410
can soft-switch tasklets while this function executes. Soft-switchable functions
389-
obey the stackless-protocol. At the C-language level
411+
obey the Stackless-protocol. At the C-language level
390412
such a function or method is made from 3 C-definitions:
391413
392414
1. A declaration object of type :c:type:`PyStacklessFunctionDeclaration_Type`.
393415
It declares the soft-switchable function and must be declared as a global
394416
variable.
395417
2. A conventional extension function, that uses
396-
:c:func:`PyStackless_CallFunction` to call the soft switchable function.
418+
:c:func:`PyStackless_CallFunction` to call the soft-switchable function.
397419
3. A C-function of type ``slp_softswitchablefunc``. This function provides the
398-
implemantation of the soft switchable function.
420+
implementation of the soft-switchable function.
399421
400-
To create a soft switchable function declaration simply define it as a static
422+
To create a soft-switchable function declaration simply define it as a static
401423
variable and call :c:func:`PyStackless_InitFunctionDeclaration` from your
402424
module init code to initialise it. See the example code in the source
403425
of the extension module `_teststackless <https://github.com/stackless-dev/stackless/blob/master-slp/Stackless/module/_teststackless.c>`_.
@@ -410,7 +432,7 @@ Typedef ``slp_softswitchablefunc``::
410432
411433
.. c:type:: PyStacklessFunctionDeclarationObject
412434
413-
This subtype of :c:type:`PyObject` represents a Stackless soft switchable
435+
This subtype of :c:type:`PyObject` represents a Stackless soft-switchable
414436
extension function declaration object.
415437
416438
Here is the structure definition::
@@ -437,7 +459,7 @@ Typedef ``slp_softswitchablefunc``::
437459
.. c:var:: PyTypeObject PyStacklessFunctionDeclaration_Type
438460
439461
This instance of :c:type:`PyTypeObject` represents the Stackless
440-
soft switchable extension function declaration type.
462+
soft-switchable extension function declaration type.
441463
442464
.. c:function:: int PyStacklessFunctionDeclarationType_CheckExact(PyObject *p)
443465
@@ -446,7 +468,7 @@ Typedef ``slp_softswitchablefunc``::
446468
447469
.. c:function:: PyObject* PyStackless_CallFunction(PyStacklessFunctionDeclarationObject *sfd, PyObject *arg, PyObject *ob1, PyObject *ob2, PyObject *ob3, long n, void *any)
448470
449-
Invoke the soft switchable extension, which is represented by *sfd*.
471+
Invoke the soft-switchable extension, which is represented by *sfd*.
450472
Pass *arg* as initial value for argument *retval* and *ob1*, *ob2*, *ob3*,
451473
*n* and *any* as general purpose in-out-arguments.
452474
@@ -457,46 +479,72 @@ Typedef ``slp_softswitchablefunc``::
457479
Initialize the fields :c:member:`PyStacklessFunctionDeclarationObject.name` and
458480
:c:member:`PyStacklessFunctionDeclarationObject.module_name` of *sfd*.
459481
460-
Within the body of a soft switchable extension function (or any other C-function, that obyes the stackless-protocol)
482+
Within the body of a soft-switchable extension function (or any other C-function that obeys the Stackless-protocol)
461483
you need the following macros.
462484
463-
Macros for the "stackless-protocol"
485+
Macros for the "Stackless-protocol"
464486
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
465487
466-
How to does Stackless Python decide, if a function may return an unwind-token?
467-
There is one global variable "_PyStackless_TRY_STACKLESS"[#]_ which is used
468-
like an implicit parameter. Since we don't have a real parameter,
469-
the flag is copied into the local variable "stackless" and cleared.
470-
This is done by the STACKLESS_GETARG() macro, which should be added to
471-
the top of the function's declarations.
472488
473-
The idea is to keep the chances to introduce error to the minimum.
474-
A function can safely do some tests and return before calling
475-
anything, since the flag is in a local variable.
476-
Depending on context, this flag is propagated to other called
477-
functions. They *must* obey the protocol. To make this sure,
478-
the STACKLESS_ASSERT() macro has to be called after every such call.
489+
How does a C-function in |SLP| decide whether it may return
490+
:c:data:`Py_UnwindToken`? (After all, this is only allowed if the caller can handle
491+
:c:data:`Py_UnwindToken`.) The obvious approach would be to add a dedicated function
492+
argument, but that would change the function prototypes and thus
493+
Python's C-API. This is not practical. Instead, the global variable
494+
"_PyStackless_TRY_STACKLESS" [#f1]_ is used as an implicit parameter.
495+
496+
The content of this variable is moved to the local variable "stackless"
497+
at the beginning of a C function. In the process, "_PyStackless_TRY_STACKLESS"
498+
is set to 0, indicating that no unwind-token may be returned.
499+
This is done with the macro :c:func:`STACKLESS_GETARG` or, for vectorcall [#f2]_ functions,
500+
with the macro :c:func:`STACKLESS_VECTORCALL_GETARG`, which should be added at the
501+
beginning of the function's local declarations.
502+
503+
This design minimizes the possibility of introducing errors due to improper
504+
return of :c:data:`Py_UnwindToken`. The function can contain arbitrary code because the
505+
flag is hidden in a local variable. If the function is to support
506+
*soft-switching*, it must be further adapted. The flag may only be passed to
507+
other called functions if they adhere to the Stackless-protocol. The macros
508+
STACKLESS_PROMOTExxx() serve this purpose. To ensure compliance with the
509+
protocol, the macro :c:func:`STACKLESS_ASSERT` must be called after each such call.
510+
An exception is the call of vectorcall functions. The call of a vectorcall
511+
function must be framed with the macros :c:func:`STACKLESS_VECTORCALL_BEFORE` and
512+
:c:func:`STACKLESS_VECTORCALL_AFTER` or - more simply - performed with the macro
513+
:c:func:`STACKLESS_VECTORCALL`.
479514
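The flag handling described above can be modelled in a few lines of C. The sketch below is an assumption-laden simplification, not the real macros: ``try_stackless``, ``GETARG``, ``PROMOTE_ALL``, ``ASSERT_CLEARED``, ``UNWIND_TOKEN`` and ``PLAIN_RESULT`` are hypothetical names that only mimic the roles of "_PyStackless_TRY_STACKLESS", :c:func:`STACKLESS_GETARG`, :c:func:`STACKLESS_PROMOTE_ALL`, :c:func:`STACKLESS_ASSERT` and :c:data:`Py_UnwindToken`.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins, assumed for illustration only. */
static intptr_t try_stackless = 0;      /* models _PyStackless_TRY_STACKLESS */
static int unwind_token_storage;
static int plain_result_storage;
#define UNWIND_TOKEN ((void *)&unwind_token_storage)  /* models Py_UnwindToken */
#define PLAIN_RESULT ((void *)&plain_result_storage)  /* an ordinary result */

/* Move the global flag into a local variable and clear the global. */
#define GETARG() int stackless = (try_stackless != 0); try_stackless = 0
/* Hand the permission on to a callee that obeys the protocol. */
#define PROMOTE_ALL() (try_stackless = stackless)
/* The callee must have cleared the flag before it returned. */
#define ASSERT_CLEARED() assert(try_stackless == 0)

/* A protocol-aware leaf: it unwinds only if the caller allowed it. */
static void *aware_leaf(void)
{
    GETARG();
    return stackless ? UNWIND_TOKEN : PLAIN_RESULT;
}

/* A protocol-aware wrapper: it forwards its own permission to the leaf. */
static void *aware_wrapper(void)
{
    GETARG();
    void *result;

    PROMOTE_ALL();
    result = aware_leaf();
    ASSERT_CLEARED();
    return result;
}

/* A caller that can handle the token sets the flag before the call. */
void *call_allowing_unwind(void)
{
    try_stackless = 1;
    return aware_wrapper();
}

/* A caller that cannot handle the token leaves the flag at 0. */
void *call_forbidding_unwind(void)
{
    return aware_wrapper();
}
```

Because the permission lives in a local variable after ``GETARG()``, any code between ``GETARG()`` and ``PROMOTE_ALL()`` may call arbitrary functions without accidentally allowing them to unwind.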
480515
Many internal functions have been patched to support this protocol.
481516
Their first action is a direct or indirect call of the macro
482-
:c:func:`STACKLESS_GETARG`.
517+
:c:func:`STACKLESS_GETARG` or :c:func:`STACKLESS_VECTORCALL_GETARG`.
483518
484519
.. c:function:: STACKLESS_GETARG()
485520
486-
Define the local variable ``int stackless`` and move the global
487-
"_PyStackless_TRY_STACKLESS" flag into the local variable "stackless".
488-
After a call to :c:func:`STACKLESS_GETARG` the value of
521+
Define and initialize the local variable ``int stackless``.
522+
The value of *stackless* is non-zero, if the function may return
523+
:c:data:`Py_UnwindToken`.
524+
After a call to :c:func:`STACKLESS_GETARG` the value of the global variable
489525
"_PyStackless_TRY_STACKLESS" is 0.
490526
527+
.. c:function:: STACKLESS_VECTORCALL_GETARG(func)
528+
529+
.. versionadded:: 3.8.0
530+
531+
Vectorcall variant of the macro :c:func:`STACKLESS_GETARG`. Functions of type
532+
:c:type:`vectorcallfunc` must use :c:func:`STACKLESS_VECTORCALL_GETARG` instead
533+
of :c:func:`STACKLESS_GETARG`. The argument *func* must be set to the vectorcall
534+
function itself. See function :c:func:`_PyCFunction_Vectorcall` for an example.
535+
491536
.. c:function:: STACKLESS_PROMOTE_ALL()
492537
493-
All STACKLESS_PROMOTE_xxx macros are used to propagate the stackless-flag
538+
All STACKLESS_PROMOTExxx() macros are used to propagate the stackless-flag
494539
from the local variable "stackless" to the global variable
495-
"_PyStackless_TRY_STACKLESS". The macro :c:func:`STACKLESS_PROMOTE_ALL` does
496-
this unconditionally. It is used for cases where we know that the called
497-
function will take care of our object, and we need no test. For example,
498-
:c:func:`PyObject_Call` and all other Py{Object,Function,CFunction}_*Call*
499-
functions use STACKLESS_PROMOTE_xxx itself, so we don't need to check further.
540+
"_PyStackless_TRY_STACKLESS". These macros can't be used to call a vectorcall [#f2]_ function.
541+
542+
The macro :c:func:`STACKLESS_PROMOTE_ALL` does this unconditionally.
543+
It is used for cases where we know that the called function obeys
544+
the stackless-protocol by calling STACKLESS_GETARG() and possibly
545+
returning the unwind token. For example, PyObject_Call() and all other
546+
Py{Object,Function,CFunction}_*Call* functions use STACKLESS_GETARG() and
547+
STACKLESS_PROMOTE_xxx themselves, so we don’t need to check further.
500548
501549
.. c:function:: STACKLESS_PROMOTE_FLAG(flag)
502550
@@ -539,6 +587,40 @@ Their first action is a direct or indirect call of the macro
539587
Set the global variable "_PyStackless_TRY_STACKLESS" unconditionally to 0.
540588
Rarely used.
541589
590+
.. c:function:: STACKLESS_VECTORCALL_BEFORE(func)
591+
592+
.. c:function:: STACKLESS_VECTORCALL_AFTER(func)
593+
594+
.. versionadded:: 3.8.0
595+
596+
If a C-function needs to propagate the stackless-flag
597+
from the local variable "stackless" to the global variable
598+
"_PyStackless_TRY_STACKLESS" in order to call a vectorcall [#f2]_ function, it
599+
must frame the call with these macros. Set the argument *func* to the called
600+
function. The called function *func* is not required to support the
601+
Stackless-protocol. [#f3]_ Example:
602+
603+
.. code-block:: C
604+
605+
STACKLESS_GETARG();
606+
vectorcallfunc func = a_vectorcall_function;
607+
608+
/* other code */
609+
610+
STACKLESS_VECTORCALL_BEFORE(func);
611+
PyObject * result = func(callable, args, nargsf, kwnames);
612+
STACKLESS_VECTORCALL_AFTER(func);
613+
return result;
614+
615+
.. c:function:: STACKLESS_VECTORCALL(func, callable, args, nargsf, kwnames)
616+
617+
.. versionadded:: 3.8.0
618+
619+
Call the vectorcall function *func* with the given arguments and return
620+
the result. It is a convenient alternative to the macros
621+
:c:func:`STACKLESS_VECTORCALL_BEFORE` and :c:func:`STACKLESS_VECTORCALL_AFTER`.
622+
The called function *func* is not required to support the Stackless-protocol.
623+
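Why the vectorcall macros can safely call a function that does not support the Stackless-protocol may be easier to see in a small model. The sketch below assumes, based on the description of "try_stackless" in this commit, that the flag is set to the address of the called function and that a callee which ignores the protocol leaves it untouched. All names (``try_stackless``, ``vfunc``, ``UNWOUND``, ``VECTORCALL_GETARG``, ``VECTORCALL_BEFORE``, ``VECTORCALL_AFTER``) are hypothetical stand-ins, and the real macro definitions may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins, assumed for illustration; not the real definitions. */
static intptr_t try_stackless = 0;      /* models _PyStackless_TRY_STACKLESS */
typedef int (*vfunc)(int arg);          /* models vectorcallfunc */
#define UNWOUND (-1)                    /* models Py_UnwindToken */

/* An aware callee accepts the permission only if the flag holds its own
 * address, then clears the flag. */
#define VECTORCALL_GETARG(func) \
    int stackless = (try_stackless == (intptr_t)(func)); \
    try_stackless = 0

/* The caller stores the address of the callee instead of the value 1. */
#define VECTORCALL_BEFORE(func) \
    (try_stackless = stackless ? (intptr_t)(func) : 0)

/* If the callee ignored the flag, it still holds the address: clear it. */
#define VECTORCALL_AFTER(func) \
    do { \
        if (try_stackless == (intptr_t)(func)) try_stackless = 0; \
        assert(try_stackless == 0); \
    } while (0)

/* A callee that knows nothing about the protocol. */
int unaware_func(int arg)
{
    return arg * 2;
}

/* A callee that obeys the protocol and unwinds when permitted. */
int aware_func(int arg)
{
    VECTORCALL_GETARG(aware_func);
    return stackless ? UNWOUND : arg * 2;
}

/* Frames the call the same way for both kinds of callee. */
int call_vector(vfunc func, int arg, int may_unwind)
{
    int stackless = may_unwind;         /* normally produced by a GETARG macro */
    int result;

    VECTORCALL_BEFORE(func);
    result = func(arg);
    VECTORCALL_AFTER(func);
    return result;
}
```

In this model the caller does not need to know whether ``func`` is protocol-aware: an unaware callee simply returns its normal result and the flag is reset afterwards.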
542624
Examples
543625
~~~~~~~~
544626
@@ -557,9 +639,18 @@ Another, more realistic example is :py:const:`_asyncio._task_step_impl_stackless
557639
"Modules/_asynciomodule.c".
558640
559641
560-
.. [#] Actually "_PyStackless_TRY_STACKLESS" is a macro that expands to a C L-value. As long as
642+
.. [#f1] Actually "_PyStackless_TRY_STACKLESS" is a macro that expands to a C L-value. As long as
561643
|CPY| uses the GIL, this L-value is a global variable.
562644
645+
.. [#f2] See :pep:`590` Vectorcall: a fast calling protocol for CPython
646+
647+
.. [#f3] If a |PY| type supports the :pep:`590` Vectorcall-protocol the actual :c:type:`vectorcallfunc`
648+
C-function is a per-object property. This speeds up calling vectorcall functions on classes,
649+
but the consequence is that it is no longer possible to use a flag in the type to indicate
650+
whether the vectorcall slot supports the Stackless-protocol. Therefore |SLP|
651+
has special macros to deal with vectorcall functions.
652+
653+
563654
Debugging and monitoring Functions
564655
----------------------------------
565656

Include/internal/pycore_slp_pystate.h

Lines changed: 63 additions & 7 deletions
@@ -70,14 +70,70 @@ typedef struct {
7070
*/
7171
struct _stackless_runtime_state {
7272
/*
73-
* flag whether the next call should try to be stackless.
74-
* The protocol is: This flag may be only set if the called
75-
* thing supports it. It doesn't matter whether it uses the
76-
* chance, but it *must* set it to zero before returning.
77-
* This flags in a way serves as a parameter that we don't have.
73+
* Note: The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
74+
* "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
75+
* in this comment are to be interpreted as described in RFC 2119.
7876
*
79-
* As long as the GIL is shared between sub-interpreters,
80-
* try_stackless can be a field in the runtime state.
77+
* 'try_stackless': flag whether the next call into the interpreter should
78+
* try to be stackless
79+
*
80+
* This flag serves, in a way, as a parameter that we don't have. It can
81+
* be accessed as an L-value using the macro '_PyStackless_TRY_STACKLESS'.
82+
*
83+
* Possible values of 'try_stackless' / '_PyStackless_TRY_STACKLESS':
84+
*
85+
* 0: Stackless calls are not possible, the called function MUST NOT
86+
* return Py_UnwindToken.
87+
*
88+
* 1: Stackless calls are possible. The called function MUST ensure that
89+
* _PyStackless_TRY_STACKLESS is 0 on its return.
90+
*
91+
* other: if the value of _PyStackless_TRY_STACKLESS is the address of the
92+
* called function, Stackless calls are possible. The called function
93+
*
94+
* - either MUST NOT modify the value of _PyStackless_TRY_STACKLESS
95+
* (case for unmodified C-Python functions).
96+
*
97+
* - or MUST ensure that _PyStackless_TRY_STACKLESS is 0 on its
98+
* return (case for stackless aware functions).
99+
*
100+
* (If a stackless call is possible, the called function SHOULD return
101+
* Py_UnwindToken and insert an appropriate (C)-frame into the frame chain if
102+
* otherwise a recursive call into the Python interpreter would have to be made.
103+
* To do so, the called function MAY call a sub-function with
104+
* _PyStackless_TRY_STACKLESS set to a non-zero value (see macros STACKLESS_PROMOTE_xxx,
105+
* STACKLESS_ASSERT and STACKLESS_VECTORCALL) and return the result of the
106+
* sub-function.)
107+
*
108+
* The protocol for the caller is:
109+
*
110+
* This flag MAY only be set to 1 if the called thing is stackless aware
111+
* (== obeys the stackless-protocol == calls STACKLESS_GETARG() directly
112+
* or indirectly). It doesn't matter whether it uses the chance, but it
113+
* MUST set _PyStackless_TRY_STACKLESS to zero before returning.
114+
*
115+
* This flag may be set to the address of a directly called C-function.
116+
* It is not required, that the called function supports stackless
117+
* calls. This variant is used for "vectorcall"-functions (see PEP-590). If
118+
* a type supports the vectorcall-protocol, the called C-function is not stored
119+
* in a slot of the type object. Instead each instance of the type has its own
120+
* function pointer. This makes it impossible to decide if the function to be called
121+
* obeys the stackless-protocol and therefore a vectorcall-function MUST NOT
122+
* be called with _PyStackless_TRY_STACKLESS set to 1.
123+
*
124+
* In theory we could always set _PyStackless_TRY_STACKLESS to the address of the
125+
* called function, but this would not be efficient. A function that only wraps
126+
* another stackless-aware function, does not need to use STACKLESS_GETARG(),
127+
* STACKLESS_PROMOTE_xxx and STACKLESS_ASSERT(). Also some stackless-aware functions
128+
* are "static inline" and have no address.
129+
*
130+
* To prevent leakage of a non zero value of _PyStackless_TRY_STACKLESS to other
131+
* threads, a thread must reset _PyStackless_TRY_STACKLESS before it drops the GIL.
132+
* This is done in the C-function drop_gil.
133+
*
134+
* As long as the GIL is shared between sub-interpreters and the runtime-state is a
135+
* global variable, "try_stackless" should be a field in the runtime state.
136+
* Once the GIL is no longer shared, we should move the flag into the thread state.
81137
*/
82138
intptr_t try_stackless;
83139
