Problem

I was surprised to see from pstack that this code deadlocks! I can't see a reason for it.

pthread_mutex_t lock;

_Cilk_for (int i = 0; i < N; ++i) {
  int ai = A[i];
  if (ai < pivot) {
    pthread_mutex_lock(&lock);
    A[ia++] = ai;
    pthread_mutex_unlock(&lock);
  }
  else if (ai > pivot) {
    pthread_mutex_lock(&lock);
    A[ib++] = ai;
    pthread_mutex_unlock(&lock);
  }
  else {
    pthread_mutex_lock(&lock);
    A[ic++] = ai;
    pthread_mutex_unlock(&lock);
  }
}

I am just using mutexes to make sure that access to A is atomic and serialized.

  • What is wrong with this code to lead to deadlock?
  • Is there a better way to implement this?

Solution

If that's code inside a function, then you're not initialising the mutex correctly. You need to set it to PTHREAD_MUTEX_INITIALIZER (for a simple, default mutex) or do a pthread_mutex_init() on it (for more complex requirements). Without proper initialisation, you don't know what state the mutex starts in - it may well be in a locked state simply because whatever happened to be on the stack at that position looked like a locked mutex.

That's why it always needs to be initialised somehow, so that there is no doubt of the initial state.

Another potential problem you may have is this:

int ai = A[i];

You probably should protect that access with the same mutex since otherwise you may read it in a "half-state" (when another thread is only part way through updating the variable).


And, I have to say, I'm not sure that threads are being used wisely here. The use of mutexes is likely to swamp a statement like A[ia++] = ai to the point where the vast majority of time will be spent locking and unlocking the mutex. They're generally more useful where the code being processed during the lock is a little more substantial.

You may find a non-threaded variant will blow this one out of the water (but, of course, don't take my word for it - my primary optimisation mantra is "measure, don't guess").

Other tips

Your pthread_mutex_t lock is not properly initialized; since it is a local variable, it may contain garbage and start out in a strangely locked state. You should call pthread_mutex_init or initialize your lock with PTHREAD_MUTEX_INITIALIZER.

As others have noted, you are not using mutexes wisely: the critical sections of your code are far too small.

AFTER you fix or otherwise verify that you are in fact initializing your lock:

pstack may be privy to control mechanisms introduced by _Cilk_for that are interfering with what would otherwise be reasonable pthread code.

A quick search shows there are mutex solutions for use with Cilk - intermixing Cilk and pthreads isn't mentioned. It does look like Cilk is a layer on top of pthreads - so if Cilk chose to put a wrapper around mutex, they likely did so for a good reason. I'd suggest staying with the Cilk API.

That aside, there's a more fundamental issue with your algorithm. In your case, the overhead for creating parallel threads and synchronizing them likely dwarfs the cost of executing the code in the body of the for-loop. It's very possible this will run faster without parallelizing it.

License: CC-BY-SA with attribution. Not affiliated with StackOverflow.