Commit a7de60e

Fix reduce_blocks_into_lanes race condition (#1798)
* move __syncthreads() outside if branch
* add clarifying comment
1 parent f3f0492 commit a7de60e

1 file changed

Lines changed: 2 additions & 1 deletion

csrc/type_shim.h

@@ -362,8 +362,9 @@ __device__ __forceinline__ T reduce_block_into_lanes
     if(tid < lanes)
       x[tid] = final; // EpilogueOp
     // Make sure the smem result is visible to all warps.
-    __syncthreads();
   }
+  __syncthreads();
+  // Avoid potential write before read race when reduce_block_into_lanes is called back to back

   return final;
 }
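
The hazard named in the new comment can be illustrated with a minimal sketch. It is not the code from csrc/type_shim.h: `block_sum` and `two_reductions` are hypothetical names, the reduction is a plain shared-memory tree sum, and the block size is assumed to be a power of two. The point it shows is only the barrier placement: the final `__syncthreads()` is reached by every thread unconditionally, so a second call that reuses the same scratch buffer cannot overwrite it while a slower thread is still reading the first result.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical, simplified stand-in for a block-wide reduction helper.
// Sums one value per thread using the shared-memory scratch buffer x.
// Assumption: blockDim.x is a power of two and x holds blockDim.x floats.
__device__ float block_sum(float* x, float val)
{
  int tid = threadIdx.x;

  x[tid] = val;
  __syncthreads();

  // Tree reduction in shared memory.
  for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
    if (tid < stride)
      x[tid] += x[tid + stride];
    __syncthreads();
  }

  float result = x[0];

  // Unconditional barrier: every thread has finished reading x[0] before any
  // thread returns. Without it, a fast thread could start the next call and
  // write x[tid] while a slow thread has not yet read x[0].
  __syncthreads();
  return result;
}

// Calls the helper back to back on the same scratch buffer, which is the
// usage pattern the commit comment refers to.
__global__ void two_reductions(const float* a, const float* b, float* out)
{
  extern __shared__ float scratch[];
  int tid = threadIdx.x;

  float sa = block_sum(scratch, a[tid]);
  float sb = block_sum(scratch, b[tid]);  // reuses scratch immediately

  if (tid == 0) {
    out[0] = sa;
    out[1] = sb;
  }
}

int main()
{
  const int n = 128;  // one block, power-of-two size (assumption of the sketch)
  float *a, *b, *out;
  cudaMallocManaged(&a, n * sizeof(float));
  cudaMallocManaged(&b, n * sizeof(float));
  cudaMallocManaged(&out, 2 * sizeof(float));
  for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

  // Dynamic shared memory: one float per thread.
  two_reductions<<<1, n, n * sizeof(float)>>>(a, b, out);
  cudaDeviceSynchronize();

  printf("sum(a) = %f, sum(b) = %f\n", out[0], out[1]);

  cudaFree(a);
  cudaFree(b);
  cudaFree(out);
  return 0;
}
```

Placing the barrier inside a branch that only some calls take would leave the other calls unprotected, which appears to be the situation this commit addresses by moving `__syncthreads()` out of the `if` block.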
