Question

The following program uses the atomic lock implementation from 'CUDA by Example', but running it freezes my machine. Can someone tell me what's wrong with my program? Thanks a lot.

Yifei

#include <stdio.h>


__global__ void test()
{
    __shared__ int i, mutex;

    if (threadIdx.x == 0) {
        i = 0;
        mutex = 0;
    }
    __syncthreads();

    while( atomicCAS(&mutex, 0, 1) != 0);
    i++;
    printf("thread %d: %d\n", threadIdx.x, i);
    atomicExch(&mutex,0);
}

Solution

Here is a theory. I assume you are familiar with the concept of a warp. The threads of a warp execute in lockstep, so all of them enter the while loop together. Only one thread wins the atomicCAS and exits the loop; the rest keep spinning inside it. This creates a divergent branch, and the thread that exited the loop stalls until the branch converges again. Because that thread is the only one that can release the mutex, and it is waiting for the other threads to converge, the release never happens and the kernel hangs.
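
If that theory holds, one commonly used workaround is to move the critical section inside the loop body, so that the thread holding the lock releases it before the warp has to reconverge. The sketch below illustrates the idea. The `done` flag, the `volatile` qualifier on `i`, the `__threadfence_block()` call, and the `main` function with its launch configuration are illustrative additions that are not in the original program. Whether this avoids the hang can still depend on the GPU architecture and on how the compiler arranges the branches, so treat it as a sketch rather than a guaranteed fix.

```cuda
#include <stdio.h>

__global__ void test()
{
    // volatile (an assumption of this sketch) keeps the compiler from
    // caching the shared counter in a register across the critical section.
    __shared__ volatile int i;
    __shared__ int mutex;

    if (threadIdx.x == 0) {
        i = 0;
        mutex = 0;
    }
    __syncthreads();

    bool done = false;  // hypothetical per-thread flag, not in the original
    while (!done) {
        // Try to take the lock once per iteration instead of spinning
        // in a loop with an empty body.
        if (atomicCAS(&mutex, 0, 1) == 0) {
            // Critical section: executed by the lock holder only.
            i++;
            printf("thread %d: %d\n", threadIdx.x, i);

            // Make the update to i visible to the block before unlocking.
            __threadfence_block();
            atomicExch(&mutex, 0);
            done = true;
        }
        // Threads that failed to get the lock fall through and retry.
    }
}

int main()
{
    // Launch configuration is an assumption: one block of 32 threads.
    test<<<1, 32>>>();
    cudaDeviceSynchronize();  // wait for the kernel and flush device printf
    return 0;
}
```

The difference from the original loop is that acquiring the lock, doing the work, and releasing the lock all happen on the same side of the branch, so the lock holder does not have to wait for the spinning threads before it can call atomicExch. For a plain counter like this one, a single atomicAdd on `i` would avoid the lock altogether.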

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow