
ENH: Please support subinterpreters #24755


Open
mkostousov opened this issue Sep 20, 2023 · 12 comments

@mkostousov

Proposed new feature or change:

Version 1.25.1, Python 3.12rc2
After enabling multiple interpreters via the Python C API:

PyInterpreterConfig config = {
    .check_multi_interp_extensions = 1,
    .gil = PyInterpreterConfig_OWN_GIL,
};
PyThreadState *tstate = NULL;
PyStatus status = Py_NewInterpreterFromConfig(&tstate, &config);
if (PyStatus_Exception(status)) {
    return -1;
}

Importing numpy then throws an exception:
module numpy.core._multiarray_umath does not support loading in subinterpreters

mattip changed the title from "Support for subinterpreters" to "ENH: Please support subinterpreters" on Sep 21, 2023
@mattip
Member

mattip commented Sep 21, 2023

PEP 554 states:

To mitigate that impact and accelerate compatibility, we will do the following:

  • be clear that extension modules are not required to support use in multiple interpreters
  • raise ImportError when an incompatible module is imported in a subinterpreter
  • provide resources (e.g. docs) to help maintainers reach compatibility
  • reach out to the maintainers of Cython and of the most used extension modules (on PyPI) to get feedback and possibly provide assistance

The PEP also links to Isolating Extensions, which has a lot of theory but does not clearly state how to migrate a large existing C extension library like NumPy to support subinterpreters. I think we would need to:

  • move to HeapTypes
  • move all static state into module state
  • carefully analyze code for possible shared state.

I am a bit unclear on whether subinterpreters share a single GIL; if not, we would also have to carefully examine the code for possible race conditions.

This is a lot of work, and may have performance implications. What is your use case for subinterpreters? Do you think you could help with the effort or find funding for this effort?

@seberg
Member

seberg commented Sep 26, 2023

This is a lot of work, and may have performance implications. What is your use case for subinterpreters? Do you think you could help with the effort or find funding for this effort?

I suspect the vast majority of changes would be relatively easy, but there is still the same problem that we need someone to explicitly dedicate time to this, and I doubt it will be one of the current core devs.
We even added a warning long ago saying exactly that, but it seems the CPython changes made to improve subinterpreter support in the long run now enforce an error rather than a warning.

@a-reich

a-reich commented Oct 10, 2023

PEP 554 states: …

FWIW the recent CPython changes should be from PEP 684 “Per-Interpreter GIL”; PEP 554, for the Python API and subinterpreter management features, is still in draft status.

@mdekstrand

Since PEP 684, there's a very strong use case for subinterpreters in parallel processing that I expect would be useful to a lot of numpy client code: running subinterpreters in separate threads enables shared memory (at least in the read-only case) with significantly less hassle than multiprocessing.
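For contrast, the multiprocessing.shared_memory route requires allocating a fixed-size block up front and building arrays over it by hand. A minimal sketch of that pattern (single process here, purely for illustration; a second process would attach to the block by name):

```python
from multiprocessing import shared_memory
import numpy as np

# Allocate a fixed-size shared block up front (4 float64 values = 32 bytes).
shm = shared_memory.SharedMemory(create=True, size=4 * 8)

# Build an ndarray view over the shared buffer and fill it in.
a = np.ndarray((4,), dtype=np.float64, buffer=shm.buf)
a[:] = [1.0, 2.0, 3.0, 4.0]

# Another process would attach with SharedMemory(name=shm.name) and build
# its own ndarray view over the same buffer; here we simulate that in-process.
b = np.ndarray((4,), dtype=np.float64, buffer=shm.buf)
same = bool((a == b).all())

# Drop the views before closing (required on Windows), then clean up.
del a, b
shm.close()
shm.unlink()
print("views agree:", same)
```

Note how every array has to be created against that pre-sized buffer; an arbitrary existing array cannot simply be handed over.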

@a-reich

a-reich commented Nov 10, 2023

I’m also very excited about the potential of using subinterpreters with numpy, and agree with what @mdekstrand said. In particular, the latest draft of PEP 734 discusses sharing data via the buffer protocol (already implemented in the private interpreters module since 3.13a1). Since ndarrays can export their buffer or be created from one without copies, this could be a very nice pattern:

  • pickle your array with protocol 5 to get some serialized metadata plus the memoryview,
  • pass that view to a bunch of interpreters (which is basically instant) as well as the small metadata,
  • and unpickle: now all of them are sharing the data in each of their arrays
  • And if you don’t want to worry about data races, seems like np can handle that by setting the readonly flag.

You get concurrency with performant, opt-in data sharing, without the hassles of managing subprocesses or of multiprocessing.shared_memory, where you have to create a shared buffer of a fixed size ahead of time and can only create arrays backed by it. With interpreters, you can take any array you happen to have and easily share it.
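The pickle side of the steps above can be sketched in a single process; the out-of-band buffer produced here is what would be passed between interpreters under PEP 734 (no subinterpreters are needed to see the mechanics):

```python
import pickle
import numpy as np

arr = np.arange(8, dtype=np.float64)

# Protocol 5: the array's data buffer is handed to buffer_callback
# out-of-band, leaving only small metadata in the pickle stream.
buffers = []
meta = pickle.dumps(arr, protocol=5, buffer_callback=buffers.append)

# Reconstruct from the metadata plus the out-of-band buffer; the payload
# is not copied through the pickle stream itself, and the new array may
# share memory with the buffer it was built from.
arr2 = pickle.loads(meta, buffers=buffers)

assert (arr2 == arr).all()
print("metadata bytes:", len(meta), "out-of-band buffers:", len(buffers))
```

Within one process this is a round trip through the same memory; across interpreters, the interesting part is that only `meta` and the buffer handle need to travel.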

@paultiq

paultiq commented Oct 23, 2024

InterpreterPoolExecutor is slated to be introduced in Python 3.14 (124548).

At present, numpy imports fail with "ImportError: module numpy._core._multiarray_umath does not support loading in subinterpreters".

See the following example: TPE and PPE work, IPE does not.

from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import ProcessPoolExecutor
from interpreters_backport.concurrent.futures.interpreter import InterpreterPoolExecutor

def try_tpe():
    with ThreadPoolExecutor() as executor:
        f = executor.submit(exec, "import numpy as np; print('TPE:', np.random.rand(2))")
        f.result()

def try_ppe():
    with ProcessPoolExecutor() as executor:
        f = executor.submit(exec, "import numpy as np; print('PPE:', np.random.rand(2))")
        f.result()

def try_ipe():
    with InterpreterPoolExecutor() as executor:
        f = executor.submit(exec, "import numpy as np; print('IPE:', np.random.rand(2))")
        f.result()  # raises the ImportError

if __name__ == "__main__":
    try_tpe()
    try_ppe()
    try_ipe()

Footnote: I noted this recent comment, which perhaps closes the door on hope here: #27192 (comment)

@temeddix

It's sad that even the possibilities are closed. Python is now heading toward true multithreaded parallelism with InterpreterPoolExecutor, and NumPy will not be able to meet the new demands.

@rgommers
Member

It's sad that even the possibilities are closed. Python is now heading toward true multithreaded parallelism with InterpreterPoolExecutor, and NumPy will not be able to meet the new demands.

They aren't closed. A lot of work has happened (and is still happening) over the past six months to support free-threaded CPython. Pretty much all of that work is also directly relevant to subinterpreter support. The extra work needed specifically for subinterpreters (primarily moving to heap types, I believe) will need someone to dig in and do it, though. None of the active maintainers are working on this, but contributions are very much welcome.
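As an aside for anyone experimenting with this: one way to check whether you are running a free-threaded build is via sysconfig (the runtime check with sys._is_gil_enabled() only exists on CPython 3.13+, hence the guard):

```python
import sys
import sysconfig

# Build-time flag: truthy on free-threaded (PEP 703) CPython builds.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
print("free-threaded build:", free_threaded_build)

# Runtime check (3.13+ only): even on a free-threaded build the GIL can be
# re-enabled, e.g. when an extension does not declare free-threading support.
if hasattr(sys, "_is_gil_enabled"):
    print("GIL enabled at runtime:", sys._is_gil_enabled())
```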

@mattip
Member

mattip commented Oct 30, 2024

The list of tasks is still this, although the documentation has gotten better:

  • move all static types to heap types (i.e. search for static PyTypeObject and replace it with PyType_FromModuleAndSpec() or some other heap allocation of the type). This can be done one type at a time, and must be benchmarked for performance implications.
  • move all static state into module state. Much of the groundwork for this has been done in the free-threading code fixes. Again, it is better to go in small increments and to benchmark for performance implications.
  • carefully analyze code for other possible shared state.

@paultiq

paultiq commented Oct 30, 2024

@mattip I imagine Cython support for subinterpreters would also be required? cython/cython#6445

@mattip
Member

mattip commented Oct 30, 2024

Yes, that is part of "carefully analyze code for other possible shared state". Cython does support the constructs needed for generating code compatible with subinterpreters; the question is more "are we using unsafe coding practices in our Cython code?"

Another area that will need adaptation is the f2py-generated wrappers.
