Commit ceae4ed

gh-119786: Add InternalDocs/qsbr.md. (gh-135411)
Add internal doc for the Quiescent-State Based Reclamation (QSBR) implementation.
1 parent bda1218 commit ceae4ed

3 files changed: 132 additions and 2 deletions

InternalDocs/README.md

Lines changed: 2 additions & 1 deletion
@@ -42,8 +42,9 @@ Program Execution
 
 - [Exception Handling](exception_handling.md)
 
+- [Quiescent-State Based Reclamation (QSBR)](qsbr.md)
 
 Modules
 ---
 
-- [asyncio](asyncio.md)
+- [asyncio](asyncio.md)

InternalDocs/qsbr.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# Quiescent-State Based Reclamation (QSBR)

## Introduction

When implementing lock-free data structures, a key challenge is determining
when it is safe to free memory that has been logically removed from a
structure. Freeing memory too early can lead to use-after-free bugs if another
thread is still accessing it. Freeing it too late results in excessive memory
consumption.

Safe memory reclamation (SMR) schemes address this by delaying the free
operation until all concurrent read accesses are guaranteed to have completed.
Quiescent-State Based Reclamation (QSBR) is an SMR scheme used in Python's
free-threaded build to manage the lifecycle of shared memory.

QSBR requires threads to periodically report that they are in a quiescent
state. A thread is in a quiescent state if it holds no references to shared
objects that might be reclaimed. Think of it as a checkpoint where a thread
signals, "I am not in the middle of any operation that relies on a shared
resource." In Python, the eval_breaker provides a natural and convenient place
for threads to report this state.


## Use in Free-Threaded Python

While CPython's memory management is dominated by reference counting and a
tracing garbage collector, these mechanisms are not suitable for all data
structures. For example, the backing array of a list object is not individually
reference-counted but may have a shorter lifetime than the `PyListObject` that
contains it. We could delay reclamation until the next GC run, but we want
reclamation to be prompt and to run the GC less frequently in the free-threaded
build, as it requires pausing all threads.

Many operations in the free-threaded build are protected by locks. However, for
performance-critical code, we want to allow reads to happen concurrently with
updates. For instance, we want to avoid locking during most list read accesses.
If a list is resized while another thread is reading it, QSBR provides the
mechanism to determine when it is safe to free the list's old backing array.

Specific use cases for QSBR include:

* Dictionary keys (`PyDictKeysObject`) and list arrays (`_PyListArray`): When a
  dictionary or list that may be shared between threads is resized, we use QSBR
  to delay freeing the old keys or array until it's safe. For dicts and lists
  that are not shared, their storage can be freed immediately upon resize.

* Mimalloc `mi_page_t`: Non-locking dictionary and list accesses require
  cooperation from the memory allocator. If an object is freed and its memory is
  reused, we must ensure the new object's reference count field is at the same
  memory location. In practice, this means when a mimalloc page (`mi_page_t`)
  becomes empty, we don't immediately allow it to be reused for allocations of a
  different size class. QSBR is used to determine when it's safe to repurpose the
  page or return its memory to the OS.


## Implementation Details


### Core Implementation

The proposal to add QSBR to Python is contained in
[GitHub issue 115103](https://github.com/python/cpython/issues/115103).
Many details of that proposal have been copied here, so they can be kept
up-to-date with the actual implementation.

Python's QSBR implementation is based on FreeBSD's "Global Unbounded
Sequences" [^1][^2][^3]. It relies on a few key counters:

* Global Write Sequence (`wr_seq`): A per-interpreter counter, `wr_seq`, starts
  at 1 and is incremented by 2 each time it is advanced. This ensures its value is
  always odd, which can be used to distinguish it from other state values. When
  an object needs to be reclaimed, `wr_seq` is advanced, and the object is tagged
  with this new sequence number.

* Per-Thread Read Sequence: Each thread has a local read sequence counter. When
  a thread reaches a quiescent state (e.g., at the eval_breaker), it copies the
  current global `wr_seq` to its local counter.

* Global Read Sequence (`rd_seq`): This per-interpreter value stores the minimum
  of all per-thread read sequence counters (excluding detached threads). It is
  updated by a "polling" operation.

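The relationship between the three counters can be sketched with a small single-threaded Python model. This is only an illustration under stated assumptions, not CPython's C code: the names `SharedState`, `ThreadState`, `advance`, `quiescent_state`, and `poll` are hypothetical, the initial value of `rd_seq` is assumed to be 1, and falling back to `wr_seq` when no thread is attached is a choice made for this model.

```python
from dataclasses import dataclass, field

QSBR_INITIAL = 1  # wr_seq starts at 1, as described above
QSBR_INCR = 2     # each advance adds 2, so wr_seq stays odd


@dataclass
class ThreadState:
    """Hypothetical per-thread state for this model."""
    seq: int = QSBR_INITIAL   # local read sequence
    attached: bool = True     # detached threads are skipped by poll()


@dataclass
class SharedState:
    """Hypothetical per-interpreter state for this model."""
    wr_seq: int = QSBR_INITIAL
    rd_seq: int = QSBR_INITIAL  # assumed initial value
    threads: list = field(default_factory=list)


def advance(shared):
    """Advance the global write sequence and return the new value."""
    shared.wr_seq += QSBR_INCR
    return shared.wr_seq


def quiescent_state(shared, thread):
    """Report a quiescent state by copying wr_seq into the local counter."""
    thread.seq = shared.wr_seq


def poll(shared):
    """Recompute rd_seq as the minimum read sequence of attached threads."""
    seqs = [t.seq for t in shared.threads if t.attached]
    # Model assumption: with no attached threads, fall back to wr_seq.
    shared.rd_seq = min(seqs, default=shared.wr_seq)
    return shared.rd_seq
```

In this model `rd_seq` can only catch up with `wr_seq` after every attached thread has called `quiescent_state` since the last advance, which is the property the reclamation steps rely on.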
To free an object, the following steps are taken:

1. Advance the global `wr_seq`.

2. Add the object's pointer to a deferred-free list, tagging it with the new
   `wr_seq` value as its `qsbr_goal`.

Periodically, a polling mechanism processes this deferred-free list:

1. The minimum read sequence value across all active threads is calculated and
   stored as the global `rd_seq`.

2. For each item on the deferred-free list, if its `qsbr_goal` is less than or
   equal to the new `rd_seq`, its memory is freed, and it is removed from the
   list. Otherwise, it remains on the list for a future attempt.

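The free and poll steps above can be traced with a toy single-threaded model. It is a sketch rather than the real implementation: the `Qsbr` class and its methods are hypothetical names, threads are represented only by their read sequence values, and "freeing" is simulated by moving the object to a `freed` list.

```python
class Qsbr:
    """Toy model of the deferred-free list and polling (not CPython's code)."""

    def __init__(self, n_threads):
        self.wr_seq = 1                    # global write sequence
        self.rd_seq = 1                    # global read sequence (assumed start)
        self.thread_seq = [1] * n_threads  # per-thread read sequences
        self.deferred = []                 # (qsbr_goal, obj) pairs
        self.freed = []                    # stands in for releasing memory

    def defer_free(self, obj):
        self.wr_seq += 2                           # free step 1: advance wr_seq
        self.deferred.append((self.wr_seq, obj))   # free step 2: tag with goal

    def quiescent_state(self, tid):
        # Thread `tid` reports a quiescent state.
        self.thread_seq[tid] = self.wr_seq

    def poll(self):
        # Poll step 1: rd_seq becomes the minimum over all threads.
        self.rd_seq = min(self.thread_seq, default=self.wr_seq)
        # Poll step 2: free every item whose goal has been reached.
        remaining = []
        for goal, obj in self.deferred:
            if goal <= self.rd_seq:
                self.freed.append(obj)
            else:
                remaining.append((goal, obj))
        self.deferred = remaining


q = Qsbr(n_threads=2)
q.defer_free("old_array")  # tagged with goal 3
q.quiescent_state(0)
q.poll()                   # thread 1 has not reported yet, so nothing is freed
q.quiescent_state(1)
q.poll()                   # both threads have reached the goal; the item is freed
```

The item stays on the list through the first poll because one thread's read sequence is still below the goal, which is exactly the case where that thread might still hold a pointer to the old memory.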

### Deferred Advance Optimization

To reduce memory contention from frequent updates to the global `wr_seq`, its
advancement is sometimes deferred. Instead of incrementing `wr_seq` on every
reclamation request, each thread tracks its number of deferrals locally. Once
the deferral count reaches a limit (`QSBR_DEFERRED_LIMIT`, currently 10), the
thread advances the global `wr_seq` and resets its local count.

When an object is added to the deferred-free list, its `qsbr_goal` is set to
`wr_seq` + 2. By setting the goal to the next sequence value, we ensure it's safe
to defer the global counter advancement. This optimization improves runtime
speed but may increase peak memory usage by slightly delaying when memory can
be reclaimed.

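A minimal sketch of the deferral logic, assuming the limit is checked after incrementing the local count so that the tenth request performs the real advance. The `Shared` and `Thread` classes and the `deferred_advance` function are hypothetical names for this model, and the code is single-threaded Python rather than the atomic C operations a real implementation needs.

```python
QSBR_DEFERRED_LIMIT = 10  # limit quoted in the text above
QSBR_INCR = 2             # step between sequence values


class Shared:
    """Hypothetical per-interpreter state: just the write sequence."""

    def __init__(self):
        self.wr_seq = 1


class Thread:
    """Hypothetical per-thread state: a local deferral count."""

    def __init__(self, shared):
        self.shared = shared
        self.deferrals = 0


def deferred_advance(thread):
    """Return a qsbr_goal, writing to wr_seq only once per limit requests."""
    thread.deferrals += 1
    if thread.deferrals < QSBR_DEFERRED_LIMIT:
        # Deferred: the goal is the next sequence value; wr_seq is untouched.
        return thread.shared.wr_seq + QSBR_INCR
    # Limit reached: perform the real advance and reset the local count.
    thread.deferrals = 0
    thread.shared.wr_seq += QSBR_INCR
    return thread.shared.wr_seq
```

Because a deferred goal is one step ahead of the current `wr_seq`, no thread's read sequence can reach it until `wr_seq` has actually been advanced, so items tagged this way cannot be freed early in this model. The cost is that they wait for that later advance, which matches the memory trade-off described above.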

## Limitations

Determining the `rd_seq` requires scanning over all thread states. This operation
could become a bottleneck in applications with a very large number of threads
(e.g., >1,000). Future work may address this with more advanced mechanisms,
such as a tree-based structure or incremental scanning. For now, the
implementation prioritizes simplicity, with plans for refinement if
multi-threaded benchmarks reveal performance issues.


## References

[^1]: https://youtu.be/ZXUIFj4nRjk?t=694
[^2]: https://people.kernel.org/joelfernandes/gus-vs-rcu
[^3]: http://bxr.su/FreeBSD/sys/kern/subr_smr.c#44

Python/qsbr.c

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 /*
  * Implementation of safe memory reclamation scheme using
- * quiescent states.
+ * quiescent states. See InternalDocs/qsbr.md.
  *
  * This is derived from the "GUS" safe memory reclamation technique
  * in FreeBSD written by Jeffrey Roberson. It is heavily modified. Any bugs
