Question

Causality in JMM seems to be the most confusing part of it. I have a few questions regarding JMM causality, and allowed behaviors in concurrent programs.

As I understand, the current JMM always prohibits causality loops. (Am I right?)

Now, as per the JSR-133 document, page 24, Fig.16, we have an example where:

Initially x = y = 0

Thread 1:

r3 = x;
if (r3 == 0)
    x = 42;
r1 = x;
y = r1;

Thread 2:

r2 = y;
x = r2;

Intuitively, r1 = r2 = r3 = 42 seems impossible. However, it is not only mentioned as possible, but also 'allowed' in JMM.

For the possibility, the explanation from the document which I fail to understand is:

A compiler could determine that the only values ever assigned to x are 0 and 42. From that, the compiler could deduce that, at the point where we execute r1 = x, either we had just performed a write of 42 to x, or we had just read x and seen the value 42. In either case, it would be legal for a read of x to see the value 42. It could then change r1 = x to r1 = 42; this would allow y = r1 to be transformed to y = 42 and performed earlier, resulting in the behavior in question. In this case, the write to y is committed first.

My question is, what kind of compiler optimization is it really? (I am compiler-ignorant.) Since 42 is written only conditionally, when the if statement is satisfied, how can the compiler decide to go with the writing of x?

Secondly, even if the compiler does this speculative optimization, commits y = 42 first, and finally makes r3 = 42, isn't that a causality loop, since there is no longer any distinction between cause and effect?

In fact there is one example in the same document (page 15, Figure 7) where a similar causal loop is mentioned as unacceptable.

So how come this execution order is legal in JMM?


Solution

As explained, the only values ever written to x are 0 and 42. Thread 1:

r3 = x; // here we read either 0 or 42
if (r3 == 0)
  x = 42;  
// here we have either just written 42 or just read 42, so reading 42 from x is always legal
r1 = x;

Therefore the JIT compiler can rewrite r1 = x as r1 = 42, and further y = 42. The point is, Thread 1 will always, unconditionally write 42 to y. The r3 variable is in fact redundant and could be completely eliminated from the machine code. So the code in the example only gives the appearance of a causal arrow from x to y, but detailed analysis shows that there is in fact no causality. The surprising consequence is that the write to y can be committed early.
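
To make the rewrite concrete, here is a minimal single-threaded sketch in Java. The class and method names (Thread1Rewrite, original, transformed, run) are made up for illustration, and the shared variables are modelled as plain static fields. It only checks that, when Thread 1 is looked at in isolation and x starts as 0 or 42, the rewritten code leaves x and y in the same state as the original. It is not what any particular JIT actually emits.

```java
class Thread1Rewrite {
    // Shared variables from the example, modelled as plain (non-volatile) fields.
    static int x, y;

    // Thread 1 exactly as written in the question.
    static void original() {
        int r3 = x;
        if (r3 == 0) {
            x = 42;
        }
        int r1 = x;
        y = r1;
    }

    // One rewrite a compiler might legally produce: r1 = x is folded to 42,
    // so the write to y no longer depends on any read and can move to the top.
    static void transformed() {
        y = 42;
        int r3 = x;
        if (r3 == 0) {
            x = 42;
        }
    }

    // Hypothetical helper: runs one variant from a given initial x and returns {x, y}.
    static int[] run(boolean useTransformed, int initialX) {
        x = initialX;
        y = 0;
        if (useTransformed) {
            transformed();
        } else {
            original();
        }
        return new int[] { x, y };
    }

    public static void main(String[] args) {
        for (int init : new int[] { 0, 42 }) {
            int[] a = run(false, init);
            int[] b = run(true, init);
            System.out.println("initial x=" + init
                    + " original: x=" + a[0] + " y=" + a[1]
                    + " transformed: x=" + b[0] + " y=" + b[1]);
        }
    }
}
```

For both starting values the two variants end with x = 42 and y = 42, which is why a single-threaded analysis cannot tell them apart.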

A general note on optimization: I take it you are familiar with the performance penalties involved in reading from main memory. That is why the JIT compiler avoids such reads whenever it can, and in this example it turns out that it does not need to read x at all in order to know what to write to y.

A general note on notation: r1, r2, r3 are local variables (they could be on the stack or in CPU registers); x, y are shared variables (these are in the main memory). Without taking this into account, the examples will not make sense.

OTHER TIPS

The compiler can perform some analysis and optimization and end up with the following code for Thread 1:

y=42; // step 1
r3=x; // step 2
x=42; // step 3

For single-threaded execution, this code is equivalent to the original and is therefore a legal transformation. If Thread 2 then runs between step 1 and step 2 (which is entirely possible), it reads y = 42 and writes x = 42, so r3 is assigned 42 as well.
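
The interleaving just described can be replayed by hand. The following Java sketch is only a deterministic, single-threaded simulation of that one schedule (the class name InterleavingDemo and the method simulate are hypothetical). It does not start real threads and does not show that a given JVM will ever produce this schedule. It only shows that the schedule yields r1 = r2 = r3 = 42.

```java
class InterleavingDemo {
    // Shared variables from the example.
    static int x, y;

    // Replays one specific schedule and returns {r1, r2, r3}.
    static int[] simulate() {
        x = 0;
        y = 0;

        // Thread 1, step 1: the hoisted write (r1 has been folded to the constant 42).
        int r1 = 42;
        y = r1;

        // Thread 2 runs in full between step 1 and step 2.
        int r2 = y;
        x = r2;

        // Thread 1, step 2 and step 3.
        int r3 = x;
        x = 42;

        return new int[] { r1, r2, r3 };
    }

    public static void main(String[] args) {
        int[] r = simulate();
        System.out.println("r1=" + r[0] + " r2=" + r[1] + " r3=" + r[2]);
        // prints: r1=42 r2=42 r3=42
    }
}
```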

The whole point of this code sample is to demonstrate the need for proper synchronization.
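
As a rough illustration of what "proper synchronization" could mean here, the sketch below declares x and y volatile. This is my own variation, not something taken from the JSR-133 document, and the names VolatileVariant and runOnce are hypothetical. With both fields volatile the program has no data races, so only sequentially consistent executions are allowed. In every such execution Thread 1's first read of x happens before anyone could have written 42 to it, so r3 must be 0.

```java
class VolatileVariant {
    // Assumption for this sketch: both shared variables are volatile.
    static volatile int x, y;

    // Runs Thread 1 and Thread 2 once and returns the r3 that Thread 1 observed.
    static int runOnce() {
        x = 0;
        y = 0;
        int[] r3Holder = new int[1];

        Thread t1 = new Thread(() -> {
            int r3 = x;
            if (r3 == 0) {
                x = 42;
            }
            int r1 = x;
            y = r1;
            r3Holder[0] = r3; // made visible to the caller by join()
        });
        Thread t2 = new Thread(() -> {
            int r2 = y;
            x = r2;
        });

        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return r3Holder[0];
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            if (runOnce() != 0) {
                throw new AssertionError("r3 was not 0 in run " + i);
            }
        }
        System.out.println("r3 was 0 in every run");
    }
}
```

Note that r1 can still be either 0 or 42 in this variant, because Thread 2 may write 0 to x after Thread 1's own write. Only the r3 == 42 outcome is ruled out.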

It is worth noting that javac does not optimise the code to any significant degree. The JIT does optimise it, but is fairly conservative about reordering code. The CPU can also reorder execution, and it does so, in small ways, quite a lot.

Forcing the CPU to not do instruction level optimisation is fairly expensive e.g. it can slow it down by a factor of 10 or more. AFAIK, the Java designers wanted to specify the minimum of guarantees needed which would work efficiently on most CPUs.

Licensed under: CC-BY-SA with attribution