Question

There are many debates on the Internet about the use of the volatile keyword in parallel programming, sometimes with contradictory arguments.

One of the more trustworthy discussions of this topic seems to be this article by Arch Robison. The example he uses is the task of passing a value from one thread to another:

Thread 1 computes a matrix product and gives it to Thread 2, which does something else with it. The matrix is the variable M and the flag is a volatile pointer R.

  1. Thread 1 computes a matrix product M and atomically sets R to point to M.
  2. Thread 2 waits until R!=NULL and then uses M as a factor to compute another matrix product.

In other words, M is a message and R is a ready flag.

The author claims that while declaring R as volatile solves the problem of propagating the change from Thread 1 to Thread 2, it makes no guarantee about what the value of M will be when that happens, because the assignments to R and M can be reordered. So we need to either make both M and R volatile or use a synchronization mechanism from a library such as pthreads.

My question is how to do the following in C:

1) How to share a single flag between two threads: how to assign to it atomically, make sure the other thread sees the change, and test for the change in the other thread. Is the use of volatile legitimate in this case? Or can some library provide a conceptually better or faster way, probably involving memory barriers?

2) How to do Robison's example correctly, that is, how to send the matrix M from one thread to the other safely (and preferably portably, with pthreads).


Solution

"volatile" is a hint to the compiler not to optimize away memory accesses, i.e., not to assume that a value in memory is unchanged since the last (local) write. Without this hint, the compiler may assume that a copy of the variable held in a register is still valid. Thus, while it is rather unlikely that a matrix is kept in a register, in general both variables should be volatile, or more precisely, volatile for the receiver.

In real-life multithreading, one would rather use a semaphore or something similar for the signaling, which avoids busy waiting on the receiver.
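The semaphore approach could look roughly like the sketch below. It assumes POSIX unnamed semaphores (sem_init, sem_post, sem_wait) are available alongside pthreads; the matrix size N, the fill values, and the helper name run_semaphore_handoff are made up for illustration and are not from Robison's article.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define N 2                 /* hypothetical matrix dimension */

static double M[N][N];      /* the message */
static sem_t ready;         /* plays the role of the ready flag R */

/* Thread 1: fill M, then signal. POSIX lists sem_post and sem_wait among
 * the functions that synchronize memory, so the writes to M are visible
 * to the thread that returns from sem_wait. */
static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            M[i][j] = (double)(i * N + j);   /* stand-in for a real product */
    sem_post(&ready);
    return NULL;
}

/* Thread 2: sleep (no busy waiting) until signalled, then read M. */
static void *consumer(void *arg) {
    double *sum = arg;
    while (sem_wait(&ready) != 0)
        ;                                    /* retry if interrupted by a signal */
    *sum = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            *sum += M[i][j];
    return NULL;
}

/* Runs both threads once and returns the sum that Thread 2 computed. */
double run_semaphore_handoff(void) {
    pthread_t t1, t2;
    double sum = -1.0;
    sem_init(&ready, 0, 0);                  /* not shared across processes, count 0 */
    pthread_create(&t2, NULL, consumer, &sum);
    pthread_create(&t1, NULL, producer, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&ready);
    return sum;
}
```

Calling run_semaphore_handoff() from main (and linking with -pthread) exercises one hand-off. Neither M nor the semaphore needs to be volatile here, because the semaphore calls provide the ordering.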

OTHER TIPS

Under architectures like x86, a properly aligned (and sized) variable such as a pointer is by default read and written atomically. What still needs to happen is serialization of memory reads and writes to prevent reordering in the CPU pipeline (via an explicit memory fence or a bus-locking operation), as well as the use of volatile to prevent the compiler from reordering the code it generates.

The easiest way to do this is to use CAS. Most CAS intrinsics provide a full memory barrier at both the compiler and the CPU memory bus level. Under MSVC, you can use the Interlocked* functions: the BTS, BTR, Inc, Dec, Exchange and Add variants would all work for a flag. For GCC you'd use the __sync_* based variants.

For more portable options you could use a pthread_mutex or pthread_cond. If you can use C11, you can also look into the _Atomic keyword.
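As a rough illustration of the GCC route, the sketch below publishes a pointer with __sync_bool_compare_and_swap and polls it with __sync_val_compare_and_swap, both of which GCC documents as full barriers. An int stands in for the matrix, and the names M_value and run_sync_handoff are hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

static int M_value;          /* stands in for the matrix M */
static int *R = NULL;        /* ready flag: NULL until M is published */

/* Thread 1: write the message, then publish its address with a CAS. */
static void *writer(void *arg) {
    (void)arg;
    M_value = 42;
    /* Full barrier: the store to M_value cannot be moved past this CAS. */
    __sync_bool_compare_and_swap(&R, (int *)NULL, &M_value);
    return NULL;
}

/* Thread 2: poll R until it is non-NULL, then read the message. */
static void *reader(void *arg) {
    int *out = arg;
    int *p;
    /* A CAS of NULL with NULL never changes R, but it returns the current
     * value with a full barrier, so it acts as an atomic, fenced load. */
    while ((p = __sync_val_compare_and_swap(&R, (int *)NULL, (int *)NULL)) == NULL)
        sched_yield();       /* busy-wait politely */
    *out = *p;
    return NULL;
}

/* Runs one hand-off and returns the value that Thread 2 observed. */
int run_sync_handoff(void) {
    pthread_t t1, t2;
    int seen = -1;
    R = NULL;                /* reset before any thread exists */
    pthread_create(&t2, NULL, reader, &seen);
    pthread_create(&t1, NULL, writer, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return seen;
}
```

The MSVC equivalent would use the Interlocked family instead; that variant is not shown here.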
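With C11, Robison's example might be written as the following sketch, assuming <stdatomic.h> is available and pthreads is used only to create the threads. The matrix type, the dimension N, and run_atomic_handoff are illustrative names. The release store to R paired with the acquire load is what orders the writes to M before the reads of M; no volatile is involved.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define N 2                                  /* hypothetical dimension */

typedef struct { double a[N][N]; } matrix;   /* hypothetical matrix type */

static matrix M;                             /* the message */
static _Atomic(matrix *) R = NULL;           /* the ready flag */

/* Thread 1: compute M, then publish it with a release store. */
static void *thread1(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            M.a[i][j] = (i == j) ? 2.0 : 0.0;    /* stand-in for a product */
    atomic_store_explicit(&R, &M, memory_order_release);
    return NULL;
}

/* Thread 2: wait until R != NULL with acquire loads, then use M. */
static void *thread2(void *arg) {
    double *trace = arg;
    matrix *p;
    while ((p = atomic_load_explicit(&R, memory_order_acquire)) == NULL)
        ;                                        /* spin */
    *trace = 0.0;
    for (int i = 0; i < N; i++)
        *trace += p->a[i][i];
    return NULL;
}

/* Runs one hand-off and returns the trace of M as seen by Thread 2. */
double run_atomic_handoff(void) {
    pthread_t t1, t2;
    double trace = -1.0;
    atomic_store(&R, NULL);
    pthread_create(&t2, NULL, thread2, &trace);
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return trace;
}
```

The spin loop is kept only to mirror the original example; a mutex with a condition variable would avoid burning CPU while waiting.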

The 'classic' way is for Thread 1 to push the pointer to the dynamically-allocated matrix onto a producer-consumer queue upon which Thread 2 is waiting. Once pushed, Thread 1 can allocate another M and start working on it, if it so wishes.
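A minimal sketch of such a queue, built on a pthread mutex and two condition variables, is shown below. The capacity QCAP, the matrix type, the NULL sentinel that ends the stream, and run_queue_handoff are all assumptions made for the example rather than part of any library.

```c
#include <pthread.h>
#include <stdlib.h>

#define QCAP 4                               /* hypothetical queue capacity */
#define N 2                                  /* hypothetical matrix dimension */

typedef struct { double a[N][N]; } matrix;

typedef struct {
    matrix *slot[QCAP];                      /* ring buffer of pointers */
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} queue;

static queue Q = {
    .lock = PTHREAD_MUTEX_INITIALIZER,
    .not_empty = PTHREAD_COND_INITIALIZER,
    .not_full = PTHREAD_COND_INITIALIZER
};

/* Blocks while the queue is full, then appends m. */
static void queue_push(queue *q, matrix *m) {
    pthread_mutex_lock(&q->lock);
    while (q->count == QCAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->slot[(q->head + q->count) % QCAP] = m;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks while the queue is empty, then removes the oldest entry. */
static matrix *queue_pop(queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    matrix *m = q->slot[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return m;
}

/* Thread 1: allocate and fill n matrices, push each, then push NULL. */
static void *producer(void *arg) {
    int n = *(int *)arg;
    for (int k = 1; k <= n; k++) {
        matrix *m = malloc(sizeof *m);
        if (m == NULL)
            break;                           /* give up; sentinel still sent */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                m->a[i][j] = (double)k;      /* stand-in for a real product */
        queue_push(&Q, m);
    }
    queue_push(&Q, NULL);                    /* sentinel: no more matrices */
    return NULL;
}

/* Thread 2: pop matrices until the sentinel, use and free each one. */
static void *consumer(void *arg) {
    double *total = arg;
    matrix *m;
    while ((m = queue_pop(&Q)) != NULL) {
        *total += m->a[0][0];
        free(m);
    }
    return NULL;
}

/* Sends n matrices through the queue; returns the consumer's total. */
double run_queue_handoff(int n) {
    pthread_t t1, t2;
    double total = 0.0;
    pthread_create(&t2, NULL, consumer, &total);
    pthread_create(&t1, NULL, producer, &n);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return total;
}
```

Ownership of each matrix passes to the consumer once it is pushed, so Thread 1 never touches a matrix again after handing it over, and the mutex supplies the memory ordering that volatile cannot.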

Fiddling around with volatile flags etc. as an optimization may be premature if the overall performance is dominated by operations on large matrices.

Licensed under: CC-BY-SA with attribution