Question

The following passage is taken from jetbrains.net. After reading this and some other articles on the web, I still don't understand how it is possible to return null after the first thread goes into the lock. Can someone who understands it please help me and explain it in a more human-friendly way?

"Consider the following piece of code:

public class Foo
{
  private static Foo instance;
  private static readonly object padlock = new object();

  public static Foo Get()
  {
    if (instance == null)
    {
      lock (padlock)
      {
        if (instance == null)
        {
          instance = new Foo();
        }
      }
    }
    return instance;
  }
};

Given the above code, writes that initialize the Foo instance could be delayed until after the write of the instance value, thus creating the possibility that the instance returns an object in an uninitialized state.

In order to avoid this, the instance value must be made volatile. "


Solution

Returning null is not the issue. The issue is that the new instance may be in a partially constructed state as perceived by another thread. Consider this declaration of Foo.

class Foo
{
  public int variable1;
  public int variable2;

  public Foo()
  {
    variable1 = 1;
    variable2 = 2;
  }
}

Here is how the code could get optimized by the C# compiler, JIT compiler, or hardware. [1]

if (instance == null)
{
  lock (padlock)
  {
    if (instance == null)
    {
      instance = alloc Foo;
      instance.variable1 = 1; // inlined ctor
      instance.variable2 = 2; // inlined ctor
    }
  }
}
return instance;

First, notice that the constructor is inlined (because it was simple). Now, hopefully it is easy to see that instance gets assigned the reference before its constituent fields get initialized inside the constructor. This is a valid strategy because reads and writes are free to float up and down as long as they do not pass the boundaries of the lock or alter the logical flow, and here they do neither. So another thread could see instance != null and attempt to use it before it is fully initialized.

volatile fixes this issue because it treats reads as an acquire fence and writes as a release fence.

  • acquire-fence: A memory barrier in which other reads & writes are not allowed to move before the fence.
  • release-fence: A memory barrier in which other reads & writes are not allowed to move after the fence.

So if we mark instance as volatile then the release-fence will prevent the above optimization. Here is how the code would look with the barrier annotations. I used an ↑ arrow to indicate a release-fence and a ↓ arrow to indicate an acquire-fence. Notice that nothing is allowed to float down past an ↑ arrow or up past an ↓ arrow. Think of the arrow head as pushing everything away.

var local = instance;
↓ // volatile read barrier
if (local == null)
{
  var lockread = padlock;
  ↑ // lock full barrier
  lock (lockread)
  ↓ // lock full barrier
  {
    local = instance;
    ↓ // volatile read barrier
    if (local == null)
    {
      var ref = alloc Foo;
      ref.variable1 = 1; // inlined ctor
      ref.variable2 = 2; // inlined ctor
      ↑ // volatile write barrier
      instance = ref;
    }
    ↑ // lock full barrier
  }
  ↓ // lock full barrier
}
local = instance;
↓ // volatile read barrier
return local;

The writes to the constituent variables of Foo could still be reordered relative to each other, but notice that the memory barrier now prevents them from occurring after the assignment to instance. Using the arrows as a guide, imagine the various optimization strategies that are allowed and disallowed. Remember that no reads or writes are allowed to float down past an ↑ arrow or up past an ↓ arrow.
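For reference, here is a minimal sketch of the source code that the annotated listing corresponds to. It is the Get method from the question with one change, the volatile modifier on instance, combined with the example Foo fields used in this answer.

```csharp
public class Foo
{
  // volatile: reads of this field act as acquire fences,
  // writes to it act as release fences.
  private static volatile Foo instance;
  private static readonly object padlock = new object();

  public int variable1;
  public int variable2;

  public Foo()
  {
    variable1 = 1;
    variable2 = 2;
  }

  public static Foo Get()
  {
    if (instance == null)        // volatile read (acquire)
    {
      lock (padlock)
      {
        if (instance == null)    // volatile read (acquire)
        {
          // The release fence on this write keeps the constructor's
          // field writes from moving after the reference is published.
          instance = new Foo();
        }
      }
    }
    return instance;             // volatile read (acquire)
  }
}
```

Any thread that sees a non-null instance through Get is then guaranteed to also see variable1 and variable2 initialized.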

Thread.VolatileWrite would have solved this problem as well and could be used in languages without a volatile keyword like VB.NET. If you take a look at how VolatileWrite is implemented you would see this.

public static void VolatileWrite(ref object address, object value)
{
  Thread.MemoryBarrier();
  address = value;
}

Now this may seem counterintuitive at first. After all, the memory barrier is placed before the assignment. What about getting the assignment committed to main memory, you ask? Would it not be more correct to place the barrier after the assignment? If that is what your intuition is telling you, then it is wrong. You see, memory barriers are not strictly about getting a "fresh read" or a "committed write". They are all about instruction ordering. This is by far the biggest source of confusion I see.

It might also be important to mention that Thread.MemoryBarrier actually generates a full-fence barrier. So if I were to use my notation above with the arrows then it would look like this.

public static void VolatileWrite(ref object address, object value)
{
  ↑ // full barrier
  ↓ // full barrier
  address = value;
}

So technically, calling VolatileWrite does more than a write to a volatile field would do. Remember that volatile is not allowed in VB.NET, for example, but VolatileWrite is a part of the BCL, so it can be used in other languages.
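If the full fence is more than you want, the System.Threading.Volatile class provides Volatile.Read and Volatile.Write, which have acquire and release semantics respectively. Note the assumption: this class is a newer API than the ones discussed above (it requires .NET Framework 4.5 or later), so treat the following as a sketch of an alternative, not as the approach described in this answer.

```csharp
using System.Threading;

public class Foo
{
  // Not declared volatile; the fences come from the explicit calls below.
  private static Foo instance;
  private static readonly object padlock = new object();

  public int variable1 = 1;
  public int variable2 = 2;

  public static Foo Get()
  {
    Foo local = Volatile.Read(ref instance);    // acquire fence
    if (local == null)
    {
      lock (padlock)
      {
        local = Volatile.Read(ref instance);    // acquire fence
        if (local == null)
        {
          local = new Foo();
          // Release fence: the field initialization above cannot
          // move after this write.
          Volatile.Write(ref instance, local);
        }
      }
    }
    return local;
  }
}
```

Because both methods are ordinary static calls, the same pattern is available from VB.NET and other languages that lack a volatile keyword.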


[1] This optimization is mostly theoretical. The ECMA specification does technically allow for it, but the Microsoft CLI implementation of the ECMA specification treats all writes as if they had release fence semantics already. It is possible that another implementation of the CLI could still perform this optimization though.

OTHER TIPS

Bill Pugh has written several articles on the subject, and is a reference on the topic.

A notable reference is The "Double-Checked Locking is Broken" Declaration.

Roughly speaking, here is the problem:

In a multicore VM, writes by one thread might not be visible to other threads until a synchronization barrier (or memory fence) is reached. "Memory Barriers: a Hardware View for Software Hackers" is a really good article on the matter.

So, if a thread initializes an object A with one field a, and stores the reference to that object in the field ref of another object B, we have two "cells" in memory: a and ref. Changes to both memory locations might not become visible to other threads at the same time unless the writing thread forces the visibility of the changes with a memory fence.
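Since the question is about C#, here is a minimal C# sketch of that two-cell situation. The class and member names (A, B, Ref, Publish, TryRead) are hypothetical and chosen only to mirror the description; Volatile.Read and Volatile.Write from System.Threading stand in for the fence.

```csharp
using System.Threading;

class A
{
  public int a;                      // first memory cell
}

class B
{
  public A Ref;                      // second memory cell ("ref" is a C# keyword)

  // Writer thread: initialize the object, then publish the reference.
  public void Publish()
  {
    var obj = new A();
    obj.a = 42;                      // write to cell "a"
    Volatile.Write(ref Ref, obj);    // release: the write to "a" cannot move after this
  }

  // Reader thread: returns -1 if nothing has been published yet.
  public int TryRead()
  {
    A obj = Volatile.Read(ref Ref);  // acquire: the read of "a" cannot move before this
    return obj == null ? -1 : obj.a;
  }
}
```

With a plain assignment to Ref instead of Volatile.Write, the memory model would permit a reader to observe the reference before the write to a.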

In Java, synchronization can be forced with synchronized. This is expensive, and an alternative is to declare a field as volatile, in which case a change to this cell is always visible to all threads.

BUT the semantics of volatile changed between Java 4 and Java 5. In Java 4, you needed to declare both a and ref as volatile for the double check to work in the example I described.

That was not intuitive, and most people would only declare ref as volatile. So this was changed: in Java 5+, when a volatile field (ref) is modified, it triggers the synchronization of the other modified fields (a).

EDIT: I only now see that you asked about C#, not Java. I am leaving my answer because it may be useful nevertheless.

Licensed under: CC-BY-SA with attribution