I have standardized the locking texture to be img0.
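The snippets in this post use img0 and coord without declaring them. As a minimal sketch of the surrounding shader, assuming a fragment shader, a single-channel unsigned-integer image (the 1u literals suggest this), and a coord taken from gl_FragCoord (both assumptions of mine, not stated above), the context might look like this:

```glsl
#version 420 core

// Assumed declaration: image atomics need a 32-bit integer format,
// and "coherent" makes writes visible to other shader invocations.
layout(r32ui) coherent uniform uimage2D img0;

out vec4 frag_color;

void main() {
    // Assumed: one lock word per pixel, indexed by the fragment position.
    ivec2 coord = ivec2(gl_FragCoord.xy);

    // One of the lock variants goes here, wrapped around the critical section.

    frag_color = vec4(0.0);
}
```

The lock image is assumed to be cleared to 0 (unlocked) before the draw call.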
Lock Type 1:
Threads in a warp share a program counter. If a single thread grabs the lock, the other threads in the warp are still stuck in the loop, so the thread holding the lock never advances far enough to release it. In practice, this compiles but results in a deadlock.
Examples: StackOverflow, OpenGL.org
while (imageAtomicExchange(img0,coord,1u)==1u);
//<critical section>
memoryBarrier();
imageAtomicExchange(img0,coord,0);
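To make the deadlock concrete, here is the same lock annotated with one possible interleaving. This is a simplified model of lockstep execution, not a description of any particular GPU, and the declaration of img0 and the source of coord are my assumptions:

```glsl
#version 420 core

layout(r32ui) coherent uniform uimage2D img0;  // assumed declaration

void main() {
    ivec2 coord = ivec2(gl_FragCoord.xy);      // assumed source of coord

    // Suppose lanes A and B of one warp reach this loop with the same coord.
    // Iteration 1: A's exchange returns 0u, so A now holds the lock;
    //              B's exchange returns 1u, so B must retry.
    // The warp has diverged: A wants to leave the loop, B wants to repeat it.
    // If the hardware keeps running the looping side first, A stays masked
    // off, never reaches the release below, and B spins forever.
    while (imageAtomicExchange(img0, coord, 1u) == 1u);

    // <critical section> (never reached by A in the scenario above)

    memoryBarrier();
    imageAtomicExchange(img0, coord, 0u);
}
```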
Lock Type 2:
To work around the issue with type 1, one instead runs the critical section conditionally, inside the loop, only in the thread that acquired the lock. In the examples below, I have sometimes written the loop as a do-while loop, but a plain while loop doesn't work correctly either.
Lock Type 2.1:
The first thing one tries is a simple loop. Apparently due to buggy optimizations, this can result in a crash (I haven't tried recently).
Example: NVIDIA
bool have_written = false;
while (true) {
    bool can_write = (imageAtomicExchange(img0,coord,1u)!=1u);
    if (can_write) {
        //<critical section>
        memoryBarrier();
        imageAtomicExchange(img0,coord,0);
        break;
    }
}
Lock Type 2.2:
The above example uses imageAtomicExchange(...), which might not be the first thing one tries. The most intuitive choice is imageAtomicCompSwap(...). Unfortunately, this doesn't work due to buggy optimizations. It should otherwise be sound.
Example: StackOverflow
bool have_written = false;
do {
    bool can_write = (imageAtomicCompSwap(img0,coord,0u,1u)==0u);
    if (can_write) {
        //<critical section>
        memoryBarrier();
        imageAtomicExchange(img0,coord,0);
        have_written = true;
    }
} while (!have_written);
Lock Type 2.3:
Switching back from imageAtomicCompSwap(...) to imageAtomicExchange(...) is the other common variant. The difference from 2.1 is the way the loop is terminated. This doesn't work correctly for me.
Examples: StackOverflow, StackOverflow
bool have_written = false;
do {
    bool can_write = (imageAtomicExchange(img0,coord,1u)!=1u);
    if (can_write) {
        //<critical section>
        memoryBarrier();
        imageAtomicExchange(img0,coord,0);
        have_written = true;
    }
} while (!have_written);