The other answers, which say to use atomic
and not volatile
, are correct when portability matters. If you’re asking this question, and it’s a good question, that’s the practical answer for you, not, “But, if the standard library doesn’t provide one, you can implement a lock-free, wait-free data structure yourself!” Nevertheless, if the standard library doesn’t provide one, you can implement a lock-free data structure yourself that works on a particular compiler and a particular architecture, provided that there’s only one writer. (Also, somebody has to implement those atomic primitives in the standard library.) If I’m wrong about this, I’m sure someone will kindly inform me.
If you absolutely need an algorithm guaranteed to be lock-free on all platforms, you might be able to build one with atomic_flag
. If even that doesn’t suffice, and you need to roll your own data structure, you can do that.
Since there’s only one writer thread, your CPU might guarantee that certain operations on your data will still work atomically even if you just use normal accesses instead of locks or even compare-and-swaps. This is not safe according to the language standard, because C++ has to work on architectures where it isn’t, but it can be safe, for example, on an x86 CPU if you guarantee that the variable you’re updating fits into a single cache line that it doesn’t share with anything else, and you might be able to ensure this with nonstandard extensions such as __attribute__ (( aligned (x) ))
.
Similarly, your compiler might provide some guarantees: g++
in particular makes guarantees about how the compiler will not assume that the memory referenced by a volatile*
hasn’t changed unless the current thread could have changed it. It will actually re-read the variable from memory each time you dereference it. That is in no way sufficient to ensure thread-safety, but it can be handy if another thread is updating the variable.
A real-world example might be: the writer thread maintains some kind of pointer (on its own cache line) which points to a consistent view of the data structure that will remain valid through all future updates. It updates its data with the RCU pattern, making sure to use a release operation (implemented in an architecture-specific way) after updating its copy of the data and before making the pointer to that data globally visible, so that any other thread that sees the updated pointer is guaranteed to see the updated data as well. The reader then makes a local copy (not volatile
) of the current value of the pointer, getting a view of the data which will stay valid even after the writer thread updates again, and works with that. You want to use volatile
on the single variable that notifies the readers of the updates, so they can see those updates even if the compiler “knows” your thread couldn’t have changed it. In this framework, the shared data just needs to be constant, and readers will use the RCU pattern. That’s one of the two ways I’ve seen volatile
be useful in the real world (the other being when you don’t want to optimize out your timing loop).
There also needs to be some way, in this scheme, for the program to know when nobody’s using an old view of the data structure any longer. If that’s a count of readers, that count needs to be atomically modified in a single operation at the same time as the pointer is read (so getting the current view of the data structure involves an atomic CAS). Or this might be a periodic tick when all the threads are guaranteed to be done with the data they’re working with now. It might be a generational data structure where the writer rotates through pre-allocated buffers.
Also observe that a lot of things your program might do could implicitly serialize the threads: those atomic hardware instructions lock the processor bus and force other CPUs to wait, those memory fences could stall your threads, or your threads might be waiting in line to allocate memory from the heap.