Multi-thread state visibility in Java: is there a way to turn the JVM into the worst case scenario?

Question 1

jcstress maintainer here. There are multiple ways to answer that question.

The easiest solution would be wrapping the getter in the loop, and let JIT hoist it. This is allowed for non-volatile field reads, and simulates the visibility failure with compiler optimization.
More sophisticated trick involves getting the debug build of OpenJDK, and using -XX:+StressLCM -XX:+StressGCM, effectively doing the instruction scheduling fuzzing. Chances are the load in question will float somewhere you can detect with the regular tests your product has.
I am not sure if there is practical hardware holding the written value long enough opaque to cache coherency, but it is somewhat easy to build the testcase with jcstress. You have to keep in mind that the optimization in (1) can also happen, so we need to employ a trick to prevent that. I think something like this should work.

Question 2

It would be great to have a Java compiler that would intentionally perform as many weird (but allowed) transfirmations as possible to be able to break thread unsafe code more easily, like Csmith for C. Unfortunately, such a compiler does not exist (as far as I know).

In the meantime, you can try the jcstress library* and exercise your code on several architectures, if possible with weaker memory models (i.e. not x86) to try and break your code:

The Java Concurrency Stress tests (jcstress) is an experimental harness and a suite of tests aid research in the correctness of concurrency support in the JVM, class libraries, and hardware.

But in the end, unfortunately, the only way to prove that a piece of code is 100% correct is code inspection (and I don't know of a static code analysis tool able to detect all race conditions).

_{*I have not used it and I am unclear which of jcstress and the java-concurrency-torture library is more up to date (I would suspect jcstress).}

Question 3

Not on a real machine, sadly testing multi-threaded code will remain difficult.

As you say, the hardware will clear the second level cache and the JVM has no control over that. The JSL only specifies what must happen and this is a case where B might never see the updated value of value.

The only way to force this to happen on a real machine is to alter the code in such a way to void your testing strategy i.e. you end up testing different code.

However, you might be able to run this on a simulator that simulates hardware that doesn't clear the second level cache. Sounds like a lot of effort though!

Question 4

I think you are refering to the principle called "false sharing" where different CPUs must synchronize their caches or else face the possibility that data such as you describe could become mismatched. There is a very good article on false sharing on Intel's website. Intel describes some useful tools in their article for diagnosing this problem. This is a relevant quote:

The primary means of avoiding false sharing is through code inspection. Instances where threads access global or dynamically allocated shared data structures are potential sources of false sharing. Note that false sharing can be obscured by the fact that threads may be accessing completely different global variables that happen to be relatively close together in memory. Thread-local storage or local variables can be ruled out as sources of false sharing.

Although methods described in the article are not what you have asked for (forcing worst-case behavior from the JVM), as already stated this isn't really possible. The methods described in this article are the best way I know to try to diagnose and avoid false sharing.

There are other resources addressing this problem around the web. For example, this article has a suggestion for a way to avoid false sharing in Java. I have not tried this method, so I cannot vouch for it, but I think the author's idea is sound. You might consider trying out his suggestion.

Question 5

I have previously suggested a worst case behaving JVM for test purposes on the memory model list but the idea didn't seem popular.

So how to gain "worst case JVM behaviour" , with existing technology i.e how can I test the scenario in the question and get it to fail EVERY time. You could try to find the setup with the weakest memory model possible but that's unlikely to be perfect.

What I have often considered is using a distributed JVM something similar to how I believe Terracotta works under the cover so your application now runs on multiple JVM's (either remote or local) (threads in the same application run in different instances). In this setup inter JVM thread communication takes place at memory barriers e.g. the synchronized keywords you are missing in bugged code for instance (it conforms to the Java Memory Model) and the application is configured i.e. you say this class thread runs here . No code change required to your tests just configuration, any well ordered java application should run out of the box, however this setup would be very intolerant of a badly ordered application (normally a problem ... now an asset i.e. the Memory model exhibits very weak but legal behavior). In the example above loading the code onto a cluster, if two threads run on different nodes setValue has no effect visible to the other thread unless the code was changed and synchronized, volatile etc etc were used, then the code works as intended.

Now your test for the example above (configured correctly) would fail every time without correct "happens before ordering" which is potentially very useful for tests. The flaw in the plan for complete coverage you would need a potentially a node per application thread (can be same machine or multiple in a cluster) or multiple test runs. If you have 1000's of threads then that could be prohibitive though hopefully they would be pooled and scaled down for E2E test scenarios or run it in a cloud. If nothing else this kind of setup might be useful in demonstrating the issue.

inter thread communication across JVMs

Question 6

The example you have given is described as Incorrectly Synchronized in http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4. I think this is always incorrect and will lead to bugs sooner or later. Most of the times later :-).

To find such incorrectly synchronized code blocks, I use the following algorithm:

Record the threads for all field modifications using instrumentation. If a field is modified by more than one thread without synchronization, I have found a data race.

I implemented this algorithm inside http://vmlens.com, which is a tool to find data races inside java programs.

Question 7

Here's a simple way: just comment out the code for setValue. You can uncomment it after testing. Since in many cases like this a mechanism is needed to fake failures, it would be a good idea to build a general mechanism for all such cases.