Support the free-threaded build of CPython #572

ngoldbaum · 2025-02-10T19:06:08Z

Feature request

Right now safetensors supports the free-threaded build in principle because it uses PyO3 0.23, but doesn't explicitly declare support. This means if you install it on the free-threaded build, Python prints a warning that it is re-enabling the GIL at runtime:

goldbaum at Mac in ~/Documents/safetensors on main
± pip install safetensors
Collecting safetensors
  Downloading safetensors-0.5.2.tar.gz (66 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: safetensors
  Building wheel for safetensors (pyproject.toml) ... done
  Created wheel for safetensors: filename=safetensors-0.5.2-cp313-cp313t-macosx_11_0_arm64.whl size=413610 sha256=ba806bc9fd2250873da3cafc58fc502321abc30df2d77c66b096aba4f709cc4d
  Stored in directory: /Users/goldbaum/Library/Caches/pip/wheels/4a/50/b2/61d052951768fd300fda20bfa3bc357184d4be288878086c21
Successfully built safetensors
Installing collected packages: safetensors
Successfully installed safetensors-0.5.2

goldbaum at Mac in ~/Documents
○  python
Python 3.13.1 experimental free-threading build (main, Dec 10 2024, 14:07:41) [Clang 16.0.0 (clang-1600.0.26.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import safetensors
<frozen importlib._bootstrap>:488: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'safetensors._safetensors_rust', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.
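
To check from Python whether the GIL really was re-enabled, something like the following sketch could be used. It relies on `sysconfig.get_config_var("Py_GIL_DISABLED")` and `sys._is_gil_enabled()` (the latter only exists on Python 3.13+); the helper name `free_threading_status` is made up for illustration.

```python
import sys
import sysconfig


def free_threading_status():
    """Return (is_free_threaded_build, gil_enabled_now) for this interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in Python 3.13; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True
    return free_threaded_build, gil_enabled


build, gil = free_threading_status()
print(f"free-threaded build: {build}, GIL currently enabled: {gil}")
```

Calling the helper before and after `import safetensors` on a free-threaded build should show the second value flipping to `True` when the warning above is emitted.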

Motivation

It should be possible and safe to use safetensors on the free-threaded build in effectively single-threaded contexts as people already do on the GIL-enabled build. I haven't dived into the safetensors internals to see what happens when state is shared between threads, but it should also hopefully be possible to detect situations like that and at a minimum generate a runtime error.

Additionally, users on the free-threaded build should be able to get pre-compiled wheels without needing a compiler toolchain to install it.

One significant wrinkle is that the free-threaded build doesn't yet support building extensions using the limited API, so you'll need to build version-specific free-threaded wheels if you want to upload wheels to PyPI.

Your contribution

I am a PyO3 maintainer and have helped other Rust libraries that depend on PyO3 to ship free-threaded wheels. I'm happy to help out here but am a newcomer to the codebase.

Narsil · 2025-03-17T11:44:04Z

The library will import other libraries most of the time (most likely torch), which do not provide support just yet.

Before tackling free-threading support, correctly supporting zero-copy memoryview is probably going to be easier, even though it's highly unsafe (given that Python can modify everything while it's being processed).

ngoldbaum · 2025-03-17T12:21:35Z

For torch, there are free-threaded PyTorch 2.6 wheels for Linux, so at least on Linux that’s unblocked.

pytorch/pytorch#130249

Not sure about other dependencies, I haven’t started looking closely at this repo yet.

Narsil · 2025-03-17T12:43:59Z

Well, that's not really enough to claim support overall in release binaries, is it? (The actual PyPI releases don't include free-threaded support.)

Is there any place in the PyO3 docs or elsewhere that describes what kind of behavior we should be defending against? In safetensors we keep a lot of references to Python objects, and I'm wondering how to make sure the behavior is correct.

In total honesty, threading/parallelism is rather useless in this lib, so requiring the GIL doesn't seem that crazy to me. (Any parallelism is always handled at the multiprocessing level, because there's a mutex deep in CUDA which somehow forbids multithreading by making it abysmally sequential, even outside Python.)

ngoldbaum · 2025-03-17T13:15:46Z

> Well that's not really enough to claim support overall in release binaries is it ?

No definitely not, I’m more talking about experimenting with the free-threaded build being unblocked.

> Is there any place in pyo3 docs or elsewhere to know what kind of behavior we should be defending against ?

You can read more here:

https://py-free-threading.github.io/
https://pyo3.rs/v0.24.0/free-threading.html

I’m one of the authors of both of those links - please feel free to open issues if you have questions that aren’t answered, we want to make these docs really good.

> In total honesty, threading/parallelism is rather useless in this lib, so requiring the GIL doesn't seem that crazy to me.

One approach you can take is to make it a hard runtime error to use a tensor simultaneously from multiple threads. Depending on what you’re doing, you can rely on PyO3’s runtime borrow checking of data stored in Python objects or implement it yourself using an atomic integer flag.
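
As a rough illustration of the "hard runtime error" idea, here is a pure-Python analogue. It is only a sketch: the suggestion above is about Rust-side borrow checking or an atomic flag, whereas this swaps in a non-blocking `threading.Lock`, and the class name `ExclusiveUse` is hypothetical.

```python
import threading


class ExclusiveUse:
    """Context manager that raises instead of blocking when already in use.

    Hypothetical helper, not part of safetensors or PyO3.
    """

    def __init__(self, name="resource"):
        self._name = name
        self._busy = threading.Lock()

    def __enter__(self):
        # Non-blocking acquire: a second concurrent user gets an error
        # instead of silently waiting. The lock is not reentrant, so nested
        # use from the same thread is rejected as well.
        if not self._busy.acquire(blocking=False):
            raise RuntimeError(f"{self._name} is already in use")
        return self

    def __exit__(self, exc_type, exc, tb):
        self._busy.release()
        return False  # never swallow exceptions from the guarded block
```

Wrapping every operation on a shared handle in `with guard:` turns overlapping use from two threads into an immediate `RuntimeError` rather than a data race.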

Keep in mind that the GIL itself is something of a house of cards, and relying on it for thread safety can lead to issues. See e.g. this cryptography issue I ran into recently, where we fixed a thread safety issue caused by implicitly relying on the GIL in pure Python code: an unlucky thread switch at the right moment could lead to a race to append to a bytestring.

Of course implementing things in Rust helps a lot :)

Narsil · 2025-03-17T14:56:45Z

Thanks for the first link, it contained what I was looking for, namely more strategy for testing this stuff.

I'm not sure it's totally complete, though, because here we're highly dependent on moving objects across threads (like tensors, which are not owned by this lib). In general the Rust-side code is almost always trivial, while understanding what's valid to do in Python and at the boundaries is quite hard.

For instance, something like this:

import threading

f = safe_open(...)
t = threading.Thread(target=whatever_fn, args=(f,))
t.start()

Or anything like that, where users are aggressively moving things around in random order.

I'm having this issue where I need to write the content of a memoryview to disk. However, I would like to make sure the content is not modified (readonly is not True because these are tensors), and I'm also afraid of other threads/processes modifying the data before it hits the disk.

Any ideas on how to do this without copying (which is what I'm currently doing)?
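
For context, the copy-based workaround mentioned here might look roughly like the sketch below. It is an assumption about the general shape, not the actual safetensors code; `write_snapshot` is a made-up name and a `bytearray` stands in for writable tensor memory.

```python
import os
import tempfile


def write_snapshot(view, path):
    """Copy the buffer's bytes, then write the private copy to `path`."""
    # bytes(view) copies the current contents into an immutable object, so
    # later writes to the underlying buffer cannot change what gets written.
    # The copy itself is still not atomic with respect to concurrent writers.
    snapshot = bytes(view)
    with open(path, "wb") as fh:
        fh.write(snapshot)
    return snapshot


backing = bytearray(b"tensor-bytes")  # stand-in for writable tensor memory
view = memoryview(backing)            # view.readonly is False here

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.bin")
    snapshot = write_snapshot(view, path)
    backing[0:6] = b"TENSOR"          # mutation after the copy was taken
    with open(path, "rb") as fh:
        on_disk = fh.read()
```

The file keeps the bytes as they were when the copy was taken, while `view` reflects the later mutation. The cost is one full copy of the data, which is exactly what the question is trying to avoid.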

ngoldbaum · 2025-03-17T15:16:31Z

I think the issue you’re describing is more or less what Alex Gaynor is talking about here?

https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/

I’m not sure there is a safe way to expose objects implementing the Python buffer protocol to Rust and expect any safety guarantees. IMO the PyBuffer API in PyO3 should be unsafe, and long term we need to think about alternatives to the buffer protocol that are compatible with borrow checking.

Also all of this is equally true with the GIL or without, the free-threaded build just makes these issues easier to trigger.

I’m on vacation this week and probably won’t be able to participate more until next week. Also if you want to set up a call to talk about this stuff I’m happy to do that. My email is on my github profile.

Narsil · 2025-03-17T15:33:11Z

In essence yes, and yes, the issue is unrelated to free threading (it's just that free threading is likely to surface more real-world bugs).

But the issue ultimately always comes down to converting &[ReadOnlyCell<u8>] to &[u8] safely (so we can write the bytes to disk).
