@nascheme

@nascheme nascheme commented Aug 20, 2025

This branch adds support for the free-threaded (nogil) build of CPython. It includes the work done by Lysandros (use strong references, critical sections for Message objects). I excluded the changes to the "cpp" backend since I think we should focus on support for upb only at this time. These changes are designed such that they shouldn't affect the behavior or performance of the default (non-free-threaded) build.

Lysandros's PRs superseded by this one:

Additional changes I made:

  • Add recursive mutex implementation, essentially the same as threading.RLock()
  • Add locking for the ObjCache structure.
  • Add a critical section for the DescriptorPool object.
  • Allocate the c_descriptor_symtab data on module init, avoiding thread-safety issue.
  • Change PyUpb_WeakMap to use strong references.
  • Modify the py_wheel function in python/dist/BUILD.bazel so that free-threaded wheel files and .so files have the correct names (the free-threaded build does not yet have a stable ABI).
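For readers unfamiliar with the first item, here is a minimal Python sketch of what a recursive mutex does: a plain lock plus an owner id and a depth counter. It illustrates the semantics of threading.RLock() only; the class name and fields are hypothetical and are not taken from the C implementation in this branch.

```python
import threading


class RecursiveMutex:
    """Illustrative re-entrant lock: plain lock + owner thread id + depth."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # thread id of the current holder, or None
        self._depth = 0     # how many times the owner has acquired it

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            # Only the holding thread can see its own id here, so the
            # nested acquire just bumps the depth without blocking.
            self._depth += 1
            return
        self._lock.acquire()
        self._owner = me
        self._depth = 1

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("cannot release a mutex owned by another thread")
        self._depth -= 1
        if self._depth == 0:
            # Outermost release: hand the underlying lock back.
            self._owner = None
            self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc_info):
        self.release()
```

The point of the recursion is that a function holding the lock can call another function that takes the same lock without deadlocking itself.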

I have some limited multi-threaded testing that I've run with the TSAN build. These changes fix all TSAN warnings and the tests complete without crashing (previously they caused a SEGV).

I've also tested the multi-threaded scaling of the free-threading build of the modified library. Unfortunately, it doesn't scale well when used from multiple parallel threads. The ObjCache structure is a bottleneck since it is updated on every Message creation and deletion. I think that could be fixed by having some lock-free hash map operations for ObjCache (i.e. make the fast path, where the object already exists, not need to acquire the lock). It's not clear to me if upb_inttable_* is thread-safe without locking. I think that scaling work could come later, after we have it working without crashing.
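The "fast path without the lock" idea can be sketched in Python as a double-checked lookup. This is an illustration under assumptions, not the proposed C change: the class and method names are hypothetical, and the unlocked read is only safe here because Python's built-in dict does its own internal synchronization. Whether upb_inttable_* allows reads concurrent with writes is exactly the open question above.

```python
import threading


class ObjCacheSketch:
    """Hypothetical cache: unlocked read on the fast path, lock on the slow path."""

    def __init__(self):
        self._map = {}
        self._lock = threading.RLock()

    def get_or_create(self, key, factory):
        # Fast path: the object already exists, so no lock is taken.
        obj = self._map.get(key)
        if obj is not None:
            return obj
        # Slow path: take the lock and re-check, because another thread
        # may have inserted the object between the two lookups.
        with self._lock:
            obj = self._map.get(key)
            if obj is None:
                obj = factory()
                self._map[key] = obj
            return obj

    def delete(self, key):
        with self._lock:
            self._map.pop(key, None)
```

With this shape, only the first creation of each key and every deletion pay for the lock; repeated lookups of a live object do not.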

Instructions on testing this:

  • Install bazelisk
  • Have a Python 3.14t build somewhere, preferably with TSAN and --with-pydebug options on.
  • Add the "bin" folder for that Python to your PATH
  • I use the following shell script to build with bazel, using Clang as the compiler:
#!/bin/sh
export PATH=$HOME/tmp/py-3.14t-tsan/bin:/usr/local/bin:/usr/bin:/bin
CC=clang CXX=clang++ bazelisk build  --config=tsan --compilation_mode=dbg --noenable_bzlmod //python/dist:binary_wheel
  • Install the package: python3 -m pip install --force-reinstall bazel-bin/python/dist/protobuf-6.33.0-cp314-cp314t-linux_x86_64.whl

  • Run tests, e.g. python3 pb_threaded.py
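Before running the tests it can help to confirm that the python3 on PATH really is a free-threaded build and that the GIL is still disabled at runtime. The helper names in the sketch below are hypothetical; it relies on sysconfig.get_config_var("Py_GIL_DISABLED") and sys._is_gil_enabled(), the latter being available on Python 3.13 and later.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_enabled_at_runtime() -> bool:
    # A free-threaded interpreter can re-enable the GIL, for example when an
    # extension module that does not declare free-threading support is loaded.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else check()


def expected_wheel_tags() -> str:
    # Python tag and ABI tag as they appear in a wheel file name; the ABI tag
    # of a free-threaded build carries a trailing "t".
    version = f"cp{sys.version_info.major}{sys.version_info.minor}"
    abi = version + ("t" if is_free_threaded_build() else "")
    return f"{version}-{abi}"


if __name__ == "__main__":
    print("free-threaded build:", is_free_threaded_build())
    print("GIL enabled at runtime:", gil_enabled_at_runtime())
    print("expected wheel tags:", expected_wheel_tags())
```

On a 3.14t interpreter the tags should match the cp314-cp314t part of the wheel name used in the install step. Checking gil_enabled_at_runtime() again after importing the installed package shows whether the import caused the GIL to be switched back on.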

Attached is the pb_threaded.py script I was using to test. It is still too simple, and we need better testing. Perhaps we can use pytest-run-parallel to run the existing unit tests.

pb_threaded.py.txt
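The attached script is not reproduced here. As a rough idea of the shape such a stress test can take, the sketch below starts several threads at the same moment and collects any exceptions they raise. Everything in it is an assumption for illustration: run_in_threads and stand_in_worker are hypothetical names, and the worker uses json as a placeholder where a real test would create, serialize, and parse protobuf messages.

```python
import json
import threading


def run_in_threads(worker, n_threads=8, iterations=1000):
    """Run worker(thread_index, i) concurrently; return the exceptions raised."""
    barrier = threading.Barrier(n_threads)
    errors = []
    errors_lock = threading.Lock()

    def target(index):
        # Release all threads together to maximize overlap.
        barrier.wait()
        try:
            for i in range(iterations):
                worker(index, i)
        except Exception as exc:
            with errors_lock:
                errors.append(exc)

    threads = [threading.Thread(target=target, args=(n,)) for n in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def stand_in_worker(thread_index, i):
    # Placeholder workload: a real test would build a protobuf message here.
    payload = {"thread": thread_index, "seq": i}
    if json.loads(json.dumps(payload)) != payload:
        raise AssertionError("round trip changed the payload")


if __name__ == "__main__":
    failures = run_in_threads(stand_in_worker)
    print("failures:", len(failures))
```

A harness like this only catches Python-level exceptions; data races and crashes in the extension still need a TSAN build to surface.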

lysnikolaou and others added 5 commits August 20, 2025 13:11
* Add recursive mutex locking for ObjCache
* Allocate c_descriptor_symtab on init.
* Use critical section for DescriptorPool
Drop this patch before merging PR, fix this some other way.
@nascheme nascheme requested a review from a team as a code owner August 20, 2025 21:14
@nascheme nascheme requested review from ericsalo and removed request for a team August 20, 2025 21:14
@google-cla

google-cla bot commented Aug 20, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@zhangskz zhangskz requested review from anandolee and removed request for ericsalo August 20, 2025 21:27
@zhangskz zhangskz added the python and 🅰️ safe for tests labels Aug 21, 2025
This is a more useful condition to check.
Dealloc methods call _Delete, not _DeleteLockHeld so we need to handle
the missing module state here to avoid crashing on shutdown.
@googleberg
Member

@nascheme can you please check that you've signed the CLA?

@nascheme
Author

@nascheme can you please check that you've signed the CLA?

As of today, I believe it should be. My employer (Quansight) has already signed the CLA and it was a matter of adding my github email address to the list of employees. Hopefully that takes care of it.

@nascheme
Author

Just a heads up, I've been testing protobuf with a free-threaded version of grpc and found some issues. Specifically, we need to be more careful about the weak-map in order to avoid reference counting issues in the free-threaded build. I don't have a working fix yet but hopefully soon. We will likely need to use PyUnstable_TryIncRef() since PyUpb_ObjCache_Get() is able to resurrect objects with refcnt == 0.

@ngoldbaum

Probably worth noting that this includes the commits from #22736. @nascheme maybe you should clarify what the maintainers should do with that PR, given that this has diverged a little bit.

@github-actions github-actions bot added the untriaged label and removed the wait for user action label Sep 4, 2025
In free-threaded build, using weak references adds some extra
complication (e.g. races with Py_DECREF()).  It's simpler to make the
mapping keep strong references to the map values.
@nascheme
Author

nascheme commented Sep 25, 2025

Latest push fixes some crashes found when running a multi-threaded version of the grpc route_guide example. I avoided the need for PyUnstable_TryIncRef() by using strong references for the object map.
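The lifetime consequence of switching the object map to strong references can be shown with a small Python analogy. The classes below are hypothetical stand-ins, not the PyUpb_ObjCache code: the map keeps each value alive, so a lookup can never return an object whose reference count has already dropped to zero, but an entry has to be removed explicitly or it leaks.

```python
import threading


class Wrapper:
    """Hypothetical stand-in for a Python object wrapping a upb structure."""

    def __init__(self, key):
        self.key = key


class StrongObjMap:
    """Hypothetical map that holds strong references to its values."""

    def __init__(self):
        self._lock = threading.Lock()
        self._objects = {}

    def add(self, key, obj):
        with self._lock:
            self._objects[key] = obj

    def get(self, key):
        # The stored reference keeps the object alive, so the caller always
        # receives either a live object or None.
        with self._lock:
            return self._objects.get(key)

    def delete(self, key):
        # Without this explicit removal the object would never be freed.
        with self._lock:
            self._objects.pop(key, None)
```

The trade-off is that something other than the reference count must decide when delete() is called, which is presumably why one of the later commit messages in this PR describes its cleanup logic as complex but required to avoid leaking memory.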

If the module has been finalized, PyState_FindModule() will return NULL.
Since we try to get the module state from various _Dealloc methods, we
need to handle this case.
@nascheme
Author

Any feedback on what might need to be done to move this PR along? Based on my (admittedly somewhat limited) testing, it seems to work. It shouldn't impact the non-free-threaded build since the alternative logic is either conditional on free-threading being enabled or is a no-op for the default build (like the locking functions).

@honglooker honglooker removed the untriaged label Oct 20, 2025
Free-threaded builds don't have a stable ABI.  Use an ABI tag based
on the CPython version number.
This is a bit more efficient.  We handle obj_cache being NULL in the
case of shutdown.
The logic to do this is complex but it is required to avoid leaking
memory.
@anandolee
Contributor

anandolee commented Nov 17, 2025

Free-threading support is under discussion. We will also document what will and will not be thread safe. Sharing messages between threads is not a usage we will support as thread safe.

We will submit some experimental free-threading PRs for python cpp soon. UPB may take longer to design, because messages live under a global object map. Adding locks to each message is not what we want.
