Is it safe to use __syncthreads() in a block where I have purposefully dropped threads using return?
The documentation states that __syncthreads() must be called by every thread in the block, or else it will lead to a deadlock, but in practice I have never experienced such behavior.
Sample code:
    __global__ void kernel(float* data, size_t size) {
        // Drop excess threads if user put too many in kernel call.
        // After the return, there are `size` active threads.
        if (threadIdx.x >= size) {
            return;
        }

        // ... do some work ...

        __syncthreads(); // Is this safe?

        // For the rest of the kernel, we need to drop one excess thread
        // After the return, there are `size - 1` active threads
        if (threadIdx.x + 1 == size) {
            return;
        }

        // ... do more work ...

        __syncthreads(); // Is this safe?
    }
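
For comparison, here is a minimal sketch of a variant in which no thread returns before a barrier, so every thread in the block reaches both __syncthreads() calls. The names `kernel_guarded`, `run_guarded`, and `staged`, along with the doubling and neighbor-sum "work", are placeholders that are not part of the kernel above. The sketch assumes a single-block launch with at least `size` threads and omits CUDA error checking.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical variant of the kernel above: excess threads skip the work
// but still reach every barrier instead of returning early.
__global__ void kernel_guarded(float* data, size_t size) {
    extern __shared__ float staged[];  // assumed: one float per launched thread

    // This predicate replaces the first early return.
    const bool in_range = threadIdx.x < size;
    if (in_range) {
        // Placeholder for "do some work": stage a doubled copy of the input.
        staged[threadIdx.x] = 2.0f * data[threadIdx.x];
    }
    __syncthreads();  // reached by all blockDim.x threads

    // This predicate replaces the second early return; it excludes the
    // last in-range thread, leaving `size - 1` threads doing work.
    const bool has_neighbor = in_range && (threadIdx.x + 1 < size);
    if (has_neighbor) {
        // Placeholder for "do more work": read the neighbor's staged value.
        data[threadIdx.x] = staged[threadIdx.x] + staged[threadIdx.x + 1];
    }
    __syncthreads();  // again reached by all blockDim.x threads
}

// Hypothetical host helper: copies `host` to the device, launches one block
// of `threads` threads, and returns the modified data.
std::vector<float> run_guarded(std::vector<float> host, unsigned threads) {
    assert(threads >= host.size());  // assumption of this sketch
    const size_t bytes = host.size() * sizeof(float);
    float* dev = nullptr;
    cudaMalloc(&dev, bytes);
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
    // Third launch parameter: dynamic shared memory for `staged`.
    kernel_guarded<<<1, threads, threads * sizeof(float)>>>(dev, host.size());
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(dev);
    return host;
}
```

Calling `run_guarded({1.0f, 2.0f, 3.0f, 4.0f}, 8)` launches eight threads for four elements; the four excess threads do no work, yet they still arrive at both barriers. The last element is left untouched because the thread that owns it is excluded by `has_neighbor`.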