Currently, memory_order_acquire and memory_order_release are considered unsafe.
The problem is that critical sections can overlap in the following situation with mutexes or other synchronization objects:
T1: a.acquire(); a.release(); b.acquire(); b.release();
T2: b.acquire(); b.release(); a.acquire(); a.release();
The release can be reordered past the subsequent unrelated acquire, so the critical sections overlap and a deadlock occurs.
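For illustration, here is a minimal sketch of that scenario with two hypothetical spinlocks built on std::atomic (the names and code are illustrative only, not STL code). The concern is that the release store in a.release() could, in theory, be reordered past the acquire loop of the following b.acquire(), so each thread would effectively still hold its first lock while spinning on the second:

```cpp
#include <atomic>

// Hypothetical acquire/release spinlock, used only to illustrate the concern.
struct spinlock {
    std::atomic<int> flag{0};

    void acquire() noexcept {
        while (flag.exchange(1, std::memory_order_acquire) != 0) {
            // spin until the previous owner's release becomes visible
        }
    }

    void release() noexcept {
        flag.store(0, std::memory_order_release); // the store in question
    }
};

spinlock a, b;

void t1() { // T1 from the scenario above
    a.acquire();
    a.release(); // if this release store were delayed past the next acquire loop...
    b.acquire(); // ...T1 would appear to hold `a` while spinning on `b`
    b.release();
}

void t2() { // T2 from the scenario above
    b.acquire();
    b.release();
    a.acquire();
    a.release();
}
```

If both threads' releases were delayed this way, each would spin on the other's lock and neither release would ever become visible.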
The current resolution is believed to be the following:
1. Acquire should observe the release result in finite time, so release operations cannot be reordered past an infinite number of acquire attempts.
2. In hardware, memory changes take time to propagate, but only a relatively small amount of time, certainly not an infinite amount.
3. In software, the compiler either does not reorder operations at all, or does not reorder them past a potentially infinite number of other operations.

Unfortunately, point 1 is not what the Standard currently says, and points 2 and 3 have to be confirmed with compiler vendors.
Until the status of acquire / release is clarified, seq_cst is currently used in some places, specifically:
- atomic_shared_ptr internal spinlock:

  Line 3130 in 12c684b:

  ```cpp
  if (!_Repptr.compare_exchange_weak(_Rep, (_Rep & _Ptr_value_mask) | _Locked_notify_needed)) {
  ```

  Line 3150 in 12c684b:

  ```cpp
  uintptr_t _Rep = _Repptr.exchange(reinterpret_cast<uintptr_t>(_Value));
  ```

- Non-lock-free atomic (a hypothetical acquire/release variant is sketched after this list):

  Lines 394 to 407 in 12c684b:

  ```cpp
  inline void _Atomic_lock_spinlock(long& _Spinlock) noexcept {
      while (_InterlockedExchange(&_Spinlock, 1)) {
          _YIELD_PROCESSOR();
      }
  }

  inline void _Atomic_unlock_spinlock(long& _Spinlock) noexcept {
  #if defined(_M_ARM) || defined(_M_ARM64)
      _Memory_barrier();
      __iso_volatile_store32(reinterpret_cast<int*>(&_Spinlock), 0);
      _Memory_barrier();
  #else // ^^^ ARM32/ARM64 hardware / x86/x64 hardware vvv
      _InterlockedExchange(&_Spinlock, 0);
  #endif // hardware
  }
  ```

- Parallel algorithms in <execution> (more than just this occurrence):

  Line 3624 in 12c684b:

  ```cpp
  _State.store(_New_state);
  ```

- memory_resource.cpp:

  STL/stl/src/memory_resource.cpp, line 24 in 12c684b:

  ```cpp
  memory_resource* const _Temp = __crt_interlocked_read_pointer(&_Default_resource);
  ```

  STL/stl/src/memory_resource.cpp, line 33 in 12c684b:

  ```cpp
  memory_resource* const _Temp = __crt_interlocked_read_pointer(&_Default_resource);
  ```

  STL/stl/src/memory_resource.cpp, line 43 in 12c684b:

  ```cpp
  memory_resource* const _Temp = __crt_interlocked_exchange_pointer(&_Default_resource, _Resource);
  ```

  STL/stl/src/memory_resource.cpp, line 53 in 12c684b:

  ```cpp
  memory_resource* const _Temp = __crt_interlocked_exchange_pointer(&_Default_resource, _Resource);
  ```

- filesystem.cpp:

  Lines 36 to 50 in 12c684b:

  ```cpp
  auto _Result = __crt_interlocked_read_pointer(_Cache);
  if (_Result) {
      return _Result;
  }

  const HMODULE _HMod = GetModuleHandleW(_Module);
  if (_HMod) {
      _Result = reinterpret_cast<_Fn_ptr>(GetProcAddress(_HMod, _Fn_name));
  }

  if (!_Result) {
      _Result = _Fallback;
  }

  __crt_interlocked_exchange_pointer(_Cache, _Result);
  ```
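If the acquire/release reasoning above is eventually confirmed, the non-lock-free atomic spinlock shown above could presumably drop seq_cst in favor of explicit acquire/release. Here is a minimal portable sketch of what such a variant might look like, using std::atomic rather than the compiler intrinsics the STL actually uses (this is an assumption about a possible change, not the shipped code):

```cpp
#include <atomic>

// Hypothetical acquire/release counterpart of _Atomic_lock_spinlock /
// _Atomic_unlock_spinlock; a sketch only, not the actual STL implementation.
inline void lock_spinlock(std::atomic<long>& spinlock) noexcept {
    while (spinlock.exchange(1, std::memory_order_acquire) != 0) {
        // spin; the real code yields the processor here (_YIELD_PROCESSOR)
    }
}

inline void unlock_spinlock(std::atomic<long>& spinlock) noexcept {
    spinlock.store(0, std::memory_order_release);
}
```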
Some places that are believed not to be affected by the issue still use acquire / release, specifically:
- shared_ptr external spinlock:

  Lines 16 to 34 in 12c684b:

  ```cpp
  _CRTIMP2_PURE void __cdecl _Lock_shared_ptr_spin_lock() { // spin until _Shared_ptr_flag successfully set
  #ifdef _M_ARM
      while (_InterlockedExchange_acq(&_Shared_ptr_flag, 1)) {
          __yield();
      }
  #else // _M_ARM
      while (_interlockedbittestandset(&_Shared_ptr_flag, 0)) { // set bit 0
      }
  #endif // _M_ARM
  }

  _CRTIMP2_PURE void __cdecl _Unlock_shared_ptr_spin_lock() { // release previously obtained lock
  #ifdef _M_ARM
      __dmb(_ARM_BARRIER_ISH);
      __iso_volatile_store32(reinterpret_cast<volatile int*>(&_Shared_ptr_flag), 0);
  #else // _M_ARM
      _interlockedbittestandreset(&_Shared_ptr_flag, 0); // reset bit 0
  #endif // _M_ARM
  }
  ```

- <system_error> (a standalone sketch of this publish/read pattern follows the list):

  Lines 590 to 597 in 12c684b:

  ```cpp
  if (_Storage[0].load(memory_order_acquire) != 0) {
      return reinterpret_cast<_Ty&>(_Storage);
  }

  const _Ty _Target;
  const auto _Target_iter = reinterpret_cast<const uintptr_t*>(_STD addressof(_Target));
  _CSTD memcpy(_Storage + 1, _Target_iter + 1, sizeof(_Ty) - sizeof(uintptr_t));
  _Storage[0].store(_Target_iter[0], memory_order_release);
  ```

- atomic_wait.cpp:

  Lines 147 to 152 in 12c684b:

  ```cpp
      _Wait_functions._Api_level.store(_Level, _STD memory_order_release);
      return _Level;
  }

  [[nodiscard]] __std_atomic_api_level _Acquire_wait_functions() noexcept {
      auto _Level = _Wait_functions._Api_level.load(_STD memory_order_acquire);
  ```
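For reference, the <system_error> and atomic_wait.cpp snippets above both follow a release-publish / acquire-read pattern. A minimal standalone sketch of that pattern, with hypothetical names (not the actual STL code):

```cpp
#include <atomic>

// Hypothetical sketch of the release-publish / acquire-read pattern:
// the data is written first, then a flag is stored with release; a reader
// that observes the flag with acquire also sees the data written before it.
struct payload {
    int value;
};

static payload data;
static std::atomic<int> ready{0};

void publish(int v) noexcept {
    data.value = v;                            // write the payload
    ready.store(1, std::memory_order_release); // then publish it
}

const payload* try_read() noexcept {
    if (ready.load(std::memory_order_acquire) == 0) {
        return nullptr; // not published yet
    }
    return &data; // the acquire load synchronized with the publishing release store
}
```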
The task is to confirm the situation with the compiler team and to decide on using memory_order_acquire / memory_order_release in the code mentioned above (and possibly in unmentioned preexisting code), as well as in new code.
Note also that the memory model implementation on ARM may change in the future; see #83, and also #488, #775, #1082.