Need clarification about Thread.MemoryBarrier() [duplicate]
28-02-2021
Question
Possible Duplicate:
Why we need Thread.MemoryBarrier()?
From O'Reilly's C# in a Nutshell:
class Foo
{
    int _answer;
    bool _complete;

    void A()
    {
        _answer = 123;
        Thread.MemoryBarrier();    // Barrier 1
        _complete = true;
        Thread.MemoryBarrier();    // Barrier 2
    }

    void B()
    {
        Thread.MemoryBarrier();    // Barrier 3
        if (_complete)
        {
            Thread.MemoryBarrier();    // Barrier 4
            Console.WriteLine (_answer);
        }
    }
}
Suppose methods A and B ran concurrently on different threads:
The author says: "Barriers 1 and 4 prevent this example from writing “0”. Barriers 2 and 3 provide a freshness guarantee: they ensure that if B ran after A, reading _complete would evaluate to true."
My questions are:
- Why is Barrier 4 needed? Isn't Barrier 1 enough?
- Why are Barriers 2 and 3 needed?
- From what I understand, a barrier prevents the instructions located before it from being executed after the instructions that follow it. Am I correct?
Solution
A memory barrier enforces an ordering constraint on reads and writes from/to memory: memory access operations before the barrier happen-before the memory access operations after the barrier.
Barriers 1 and 4 have complementary roles: barrier 1 ensures that the write to _answer happens-before the write to _complete, while barrier 4 ensures that the read from _complete happens-before the read from _answer. Imagine barrier 4 isn't there, but barrier 1 is. While it is guaranteed that 123 is written to _answer before true is written to _complete, some other thread running B() may still have its read operations reordered, and hence it may read _answer before it reads _complete. The same applies if barrier 1 is removed and barrier 4 is kept: while the read from _complete in B() will always happen-before the read from _answer, _complete could still be written to before _answer by some other thread running A().
Barriers 2 and 3 provide a freshness guarantee: if barrier 3 is executed after barrier 2, then the state visible to the thread running A() at the point when it executes barrier 2 becomes visible to the thread running B() at the point when it executes barrier 3. In the absence of either of these two barriers, B() executing after A() completed might not see the changes made by A(). In particular, barrier 2 prevents the value written to _complete from being cached by the processor running A() and forces the processor to write it out to the main memory. Similarly, barrier 3 prevents the processor running B() from relying on its cache for the value of _complete, forcing a read from the main memory. Note, however, that a stale cache isn't the only thing that can break the freshness guarantee in the absence of memory barriers 2 and 3. Reordering of operations on the memory bus is another example of such a mechanism.
A memory barrier just ensures that the effects of memory access operations are ordered across the barrier. Other instructions (e.g. incrementing a value in a register) may still be reordered.
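To make the pairing of barriers 1 and 4 concrete, here is a minimal sketch that runs A() and B() concurrently many times and counts how often B() observes _complete as true while _answer is still not 123. The public modifiers, the int? return value of B(), and the BarrierDemo.CountViolations helper are my own additions for illustration; they are not part of the book's example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Foo
{
    int _answer;
    bool _complete;

    public void A()
    {
        _answer = 123;
        Thread.MemoryBarrier();     // Barrier 1: write _answer before _complete
        _complete = true;
        Thread.MemoryBarrier();     // Barrier 2
    }

    // Assumption: returns the observed _answer instead of printing it,
    // or null if _complete was not yet seen as true.
    public int? B()
    {
        Thread.MemoryBarrier();     // Barrier 3
        if (_complete)
        {
            Thread.MemoryBarrier(); // Barrier 4: read _complete before _answer
            return _answer;
        }
        return null;
    }
}

static class BarrierDemo
{
    // Hypothetical helper: races A() against B() on a fresh Foo each time and
    // counts the runs where B() saw _complete == true but _answer != 123.
    public static int CountViolations(int iterations)
    {
        int violations = 0;
        for (int i = 0; i < iterations; i++)
        {
            var foo = new Foo();
            int? seen = null;
            Task writer = Task.Run(() => foo.A());
            Task reader = Task.Run(() => { seen = foo.B(); });
            Task.WaitAll(writer, reader); // both tasks are finished past this point
            if (seen.HasValue && seen.Value != 123)
            {
                violations++;
            }
        }
        return violations;
    }

    static void Main()
    {
        Console.WriteLine(BarrierDemo.CountViolations(10000));
    }
}
```

With both barriers in place the count should stay at 0. Removing barrier 1 or barrier 4 makes a non-zero count possible in principle, although whether you actually observe one depends on the JIT and the hardware, so a clean run without the barriers proves nothing.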
OTHER TIPS
Ok, here we go: A memory barrier prevents an optimizing compiler from reordering memory accesses across it. This means that no read or write before the barrier can be moved after a read or write that follows the barrier. There are several types of barriers but I will not go into details. A CPU with weak memory ordering can also reorder memory accesses at run time, which can lead to subtle bugs, including deadlocks. Thread.MemoryBarrier() is a full fence, so it constrains both the compiler and the CPU. So:
- Barrier 4 is needed to make the thread running method B read the up-to-date value of _answer (i.e. reading 123 instead of 0). If you compile in Release mode, the compiler (including the JIT) may optimize the code and reorder the reads such that it is possible for the thread running B to read 0, even though the code as written would logically make this impossible (since _answer is assigned before _complete).
- Barriers 2 & 3 also prevent reordering (as well as caching of the value of _complete) such that there is no way that the thread running B will ever read _complete as false, provided it ran after A.
- The answer is above.
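As an illustration of the caching point, here is a rough sketch of a variant of B that waits for A instead of checking _complete once. The barrier inside the loop is what keeps the read of _complete from being hoisted out of the loop and reused. The SpinFoo class, the WaitForAnswer method and the SpinDemo.Run helper are hypothetical names I made up for this sketch.

```csharp
using System;
using System.Threading;

class SpinFoo
{
    int _answer;
    bool _complete;

    public void A()
    {
        _answer = 123;
        Thread.MemoryBarrier();         // Barrier 1
        _complete = true;
        Thread.MemoryBarrier();         // Barrier 2
    }

    // Hypothetical variant of B: spins until _complete is seen as true.
    public int WaitForAnswer()
    {
        while (true)
        {
            Thread.MemoryBarrier();     // Barrier 3: _complete is re-read on every pass
            if (_complete)
            {
                Thread.MemoryBarrier(); // Barrier 4
                return _answer;
            }
        }
    }
}

static class SpinDemo
{
    // Starts the reader first, then runs A() on the calling thread.
    public static int Run()
    {
        var foo = new SpinFoo();
        int result = 0;
        var reader = new Thread(() => result = foo.WaitForAnswer());
        reader.Start();
        foo.A();
        reader.Join();                  // result is safe to read after Join
        return result;
    }

    static void Main()
    {
        Console.WriteLine(SpinDemo.Run()); // prints 123
    }
}
```

A busy loop like this burns CPU, so treat it as a demonstration of the barriers only; in real code a ManualResetEventSlim or a lock would usually be the better tool.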