Question

The idea is to get a value ("data") from

  • a variable
  • if it doesn't exist there, then from a file
  • if it doesn't exist there either, then from a server

public static string LoadData(int id)
{
    if (_isDataLoaded[id - 1])
    {
      return _data[id - 1];
    }

    if (File.Exists(_fileName))
    {
        string cachedData;
        if (GetCachedConstant(id, out cachedData)) //read from a file
        {
            lock (_locker)
            {
                _isDataLoaded[id - 1] = true;
                _data[id - 1] = cachedData;
            }

            return _data[id - 1];
        }
    }

    string remoteData = GetFromServer(id);
    lock (_locker)
    {
        CacheToFile(id, remoteData); //write to a file
        _data[id - 1] = remoteData;
        _isDataLoaded[id - 1] = true;
    }

    return _data[id - 1];
}

This code is used by many threads. Although it seems to be thread safe, it actually isn't; I tested it, and that is what convinced me. There should be a double check of the existing _data before writing, for example like the double check usually used in the Singleton pattern.

Could someone help me understand how to implement that here?

EDIT:

string[] _data;
bool[] _isDataLoaded;

EDIT2:

The code above might be used with .NET < 4.0, so I can't use Lazy<T> there. The only question I have for now is: should I use volatile if I go with double-checked locking?

volatile string[] _data;
volatile bool[] _isDataLoaded;

Solution

I can see two obvious sources of threading problems:

  • What type are _data and _isDataLoaded? If those types are not thread safe, then neither is your code. Arrays are only thread safe if no two threads access the same element, which can happen in the code above. Try using a ConcurrentDictionary instead (see the sketch below).
  • Depending on the exact implementation of GetCachedConstant and CacheToFile, there is a potential race condition whereby thread A could begin writing cached data to a file, causing thread B to take the File.Exists route when in fact the file contains only partially written data.

I'd guess the culprit is the first of these two. There might be other problems, but unless those types are thread safe, no amount of double-checked locking will save you; you also need to synchronize access to those objects.
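
For illustration only (not from the original answer), here's a minimal sketch of the ConcurrentDictionary approach; it requires .NET 4.0 (System.Collections.Concurrent), and LoadFromFileOrServer is a hypothetical helper wrapping the question's file-then-server fallback (same shape as the fetchData sketch further down):

private static readonly ConcurrentDictionary<int, string> _cache =
    new ConcurrentDictionary<int, string>();

public static string LoadData(int id)
{
    // GetOrAdd is atomic with respect to the dictionary itself, but the value
    // factory may run more than once under contention, so it must tolerate
    // duplicate calls (e.g. a redundant CacheToFile write).
    return _cache.GetOrAdd(id, LoadFromFileOrServer);
}

If the factory must run exactly once per key, store Lazy<string> values in the dictionary instead, which leads to the next suggestion.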

OTHER TIPS

I'd replace _data with a Lazy<string>[], where the fetching from file & network happens in the delegate you pass in.

So in the static constructor do something like:

_data = new Lazy<string>[maxId + 1];
for (int i = 0; i < _data.Length; i++)
{
    int id = i; // copy the loop variable so each closure captures its own value
    _data[id] = new Lazy<string>(() => fetchData(id), LazyThreadSafetyMode.ExecutionAndPublication);
}

and then to get the value, simply use _data[i].Value.
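
As a rough sketch (not from the original answer), fetchData could simply reuse the question's existing file-then-server fallback, assuming GetCachedConstant, CacheToFile and GetFromServer keep their original signatures:

private static string fetchData(int id)
{
    string cachedData;
    if (File.Exists(_fileName) && GetCachedConstant(id, out cachedData))
        return cachedData;            // hit in the file cache

    string remoteData = GetFromServer(id);
    CacheToFile(id, remoteData);      // persist for next time
    return remoteData;
}

With LazyThreadSafetyMode.ExecutionAndPublication, only one thread runs the delegate per element; every other thread blocks until the value is published.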


If you really want double checked locking, it basically works like this:

if (!_isDataLoaded[id - 1])
{
    lock (_locker)
    {
        if(!_isDataLoaded[id - 1])
        {
            ...
            _data[id - 1] = ...;
            _isDataLoaded[id - 1] = true;
        }
    }
}
return _data[id - 1];

The problem with this code is that it's difficult to tell whether it's actually guaranteed to work. That depends on the memory model of the platform you're running on. AFAIK the .NET 2.0 memory model guarantees that this works, but the ECMA CLR model and the Java model don't. Memory-model issues are very subtle and easy to get wrong, so I strongly recommend not using this pattern.

Why don't you lock at the beginning of the method? That ensures your data (cache) will always be in a valid/consistent state.

How about this: keep a single lock to load/persist data from/to the cache.

public static string LoadData(int id)
{
    if (_isDataLoaded[id - 1])
      return _data[id - 1];

    lock (_locker)
    {
        if (_isDataLoaded[id - 1])
          return _data[id - 1];

        if (File.Exists(_fileName))
        {
            string cachedData;
            if (GetCachedConstant(id, out cachedData)) //read from a file
            {
                _data[id - 1] = cachedData;
                _isDataLoaded[id - 1] = true;
                return _data[id - 1];
            }
        }

        string remoteData = GetFromServer(id);
        CacheToFile(id, remoteData); //write to a file
        _data[id - 1] = remoteData;
        _isDataLoaded[id - 1] = true;
        return _data[id - 1];
    }
}

The question's lock block writes the flag before the data:

    lock (_locker)
    {
        _isDataLoaded[id - 1] = true;
        _data[id - 1] = cachedData;
    }

Here _isDataLoaded is set before _data, so there is a race: another thread can see _isDataLoaded as true and read _data before _data has been initialized.

But reversing the two assignments does not solve the problem. There's no guarantee that another thread will see them in the same order, because the reader uses neither a lock nor memory barriers.

See the Wikipedia article on double-checked locking for the proper way to do it in C#: http://en.wikipedia.org/wiki/Double-checked_locking
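
For reference, a minimal sketch of the classic C# shape of the pattern, reduced to a single cached value (LoadExpensiveValue is just a placeholder for the file/server lookup); note that marking an array field volatile only applies to the array reference, not to its elements, which is why the per-element version in the question is harder to get right:

private static volatile string _cachedValue;          // volatile so the write is published to other threads
private static readonly object _sync = new object();

public static string GetValue()
{
    if (_cachedValue == null)                          // first check, without the lock
    {
        lock (_sync)
        {
            if (_cachedValue == null)                  // second check, inside the lock
                _cachedValue = LoadExpensiveValue();   // hypothetical loader (file or server)
        }
    }
    return _cachedValue;
}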

The idea behind double-checked locking is in its name: you check your condition before the lock (as in your code) and again inside the lock block, to make sure another thread didn't change the state (i.e. the result of the condition) while the current thread was between the (successful) check and the lock statement.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow