BUG: fix data race in PyArray_DescrHash
#30234
|
Can you add a multithreaded test that exercises this code? It's not clear to me how you'd hit a race in this code path or why this is coming up now. I want to make sure there aren't other issues like this that need to be fixed. |
You can reproduce the data race by running
==================
WARNING: ThreadSanitizer: data race (pid=79191)
Write of size 8 at 0x0001118e8390 by thread T8:
#0 PyArray_DescrHash hashdescr.c:314 (_multiarray_umath.cpython-314t-darwin.so:arm64+0x1f636c)
#1 PyObject_Hash <null> (libpython3.14t.dylib:arm64+0x1365a8)
#2 tuple_hash <null> (libpython3.14t.dylib:arm64+0x17ddfc)
#3 PyObject_Hash <null> (libpython3.14t.dylib:arm64+0x1365a8)
#4 PyDict_GetItemRef <null> (libpython3.14t.dylib:arm64+0x104b3c)
#5 _PyEval_EvalFrameDefault <null> (libpython3.14t.dylib:arm64+0x27a864)
#6 _PyEval_Vector <null> (libpython3.14t.dylib:arm64+0x278c90)
#7 _PyFunction_Vectorcall <null> (libpython3.14t.dylib:arm64+0x8b1cc)
#8 method_vectorcall <null> (libpython3.14t.dylib:arm64+0x8f598)
#9 _PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8ad8c)
#10 PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8aeb4)
#11 _PyEval_EvalFrameDefault <null> (libpython3.14t.dylib:arm64+0x27fbc0)
#12 _PyEval_Vector <null> (libpython3.14t.dylib:arm64+0x278c90)
#13 _PyFunction_Vectorcall <null> (libpython3.14t.dylib:arm64+0x8b1cc)
#14 method_vectorcall <null> (libpython3.14t.dylib:arm64+0x8f638)
#15 context_run <null> (libpython3.14t.dylib:arm64+0x2c3eec)
#16 _PyEval_EvalFrameDefault <null> (libpython3.14t.dylib:arm64+0x283168)
#17 _PyEval_Vector <null> (libpython3.14t.dylib:arm64+0x278c90)
#18 _PyFunction_Vectorcall <null> (libpython3.14t.dylib:arm64+0x8b1cc)
#19 method_vectorcall <null> (libpython3.14t.dylib:arm64+0x8f638)
#20 _PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8ae40)
#21 PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8aeb4)
#22 thread_run <null> (libpython3.14t.dylib:arm64+0x416b08)
#23 pythread_wrapper <null> (libpython3.14t.dylib:arm64+0x368cac)
Previous read of size 8 at 0x0001118e8390 by thread T5:
#0 PyArray_DescrHash hashdescr.c:313 (_multiarray_umath.cpython-314t-darwin.so:arm64+0x1f620c)
#1 PyObject_Hash <null> (libpython3.14t.dylib:arm64+0x1365a8)
#2 tuple_hash <null> (libpython3.14t.dylib:arm64+0x17ddfc)
#3 PyObject_Hash <null> (libpython3.14t.dylib:arm64+0x1365a8)
#4 PyDict_GetItemRef <null> (libpython3.14t.dylib:arm64+0x104b3c)
#5 _PyEval_EvalFrameDefault <null> (libpython3.14t.dylib:arm64+0x27a864)
#6 _PyEval_Vector <null> (libpython3.14t.dylib:arm64+0x278c90)
#7 _PyFunction_Vectorcall <null> (libpython3.14t.dylib:arm64+0x8b1cc)
#8 method_vectorcall <null> (libpython3.14t.dylib:arm64+0x8f598)
#9 _PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8ad8c)
#10 PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8aeb4)
#11 _PyEval_EvalFrameDefault <null> (libpython3.14t.dylib:arm64+0x27fbc0)
#12 _PyEval_Vector <null> (libpython3.14t.dylib:arm64+0x278c90)
#13 _PyFunction_Vectorcall <null> (libpython3.14t.dylib:arm64+0x8b1cc)
#14 method_vectorcall <null> (libpython3.14t.dylib:arm64+0x8f638)
#15 context_run <null> (libpython3.14t.dylib:arm64+0x2c3eec)
#16 _PyEval_EvalFrameDefault <null> (libpython3.14t.dylib:arm64+0x283168)
#17 _PyEval_Vector <null> (libpython3.14t.dylib:arm64+0x278c90)
#18 _PyFunction_Vectorcall <null> (libpython3.14t.dylib:arm64+0x8b1cc)
#19 method_vectorcall <null> (libpython3.14t.dylib:arm64+0x8f638)
#20 _PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8ae40)
#21 PyObject_Call <null> (libpython3.14t.dylib:arm64+0x8aeb4)
#22 thread_run <null> (libpython3.14t.dylib:arm64+0x416b08)
#23 pythread_wrapper <null> (libpython3.14t.dylib:arm64+0x368cac)
SUMMARY: ThreadSanitizer: data race hashdescr.c:314 in PyArray_DescrHash
==================
I didn't find any other place where NumPy caches the hash of an object, so I think this is the only place that needs to be fixed. |
|
As a data point, CPython caches hashes for many objects, and I fixed a data race in CPython in a way similar to how it is done here. See python/cpython#139775, which fixed similar data races in datetime. |
|
Thanks for the extra context. I'll give this a once-over on Monday. |
Relates #30085
This PR fixes a data race in PyArray_DescrHash on descr->hash. The non-atomic load and store of hash can race, so they are now changed to use atomics. Relaxed ordering would be sufficient here, but I don't think it is worth it, so I have reused npy_atomic_{load/store}_ptr, as the size of hash is the same as a pointer and it uses seq-cst ordering.
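For readers unfamiliar with the pattern, here is a minimal sketch of a lazily cached hash that is read and written atomically. It is not the code from this PR: it uses C11 `<stdatomic.h>` rather than NumPy's npy_atomic_* helpers, and `demo_descr`, `demo_compute_hash`, and `demo_descr_hash` are hypothetical names invented for illustration. It assumes, as the description does, that the hash is pointer-sized and that -1 marks "not computed yet".

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor that caches its hash lazily. */
typedef struct {
    int type_num;            /* illustrative payload used to derive the hash */
    _Atomic intptr_t hash;   /* -1 means "not computed yet" */
} demo_descr;

/* The cached value is assumed to be pointer-sized, as in the PR description. */
static_assert(sizeof(intptr_t) == sizeof(void *), "hash must be pointer-sized");

/* Deterministic toy hash; it never returns the -1 sentinel. */
static intptr_t demo_compute_hash(const demo_descr *d)
{
    uintptr_t h = (uintptr_t)(unsigned int)d->type_num * 1000003u + 7u;
    intptr_t result = (intptr_t)h;
    return result == -1 ? -2 : result;
}

/* Return the cached hash, computing and publishing it on first use. */
intptr_t demo_descr_hash(demo_descr *d)
{
    /* Atomic load with the default seq-cst ordering. */
    intptr_t h = atomic_load(&d->hash);
    if (h == -1) {
        /* Two threads may both reach this point; they compute the same
         * value, so the duplicate store is harmless. */
        h = demo_compute_hash(d);
        atomic_store(&d->hash, h);
    }
    return h;
}
```

Because the computation is deterministic, no lock or compare-and-swap is needed: the only requirement is that the load and the store themselves are atomic, which is also why relaxed ordering would be enough.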