Lazy initialized caching… how do I make it thread-safe?

https://stackoverflow.com/questions/8097439

27-02-2021

Question

This is what I have:

a Windows Service
- C#
- multithreaded
- the service uses a read-write lock (multiple readers at a time; a writer blocks all other reading and writing threads)
a simple, self-written DB
- C++
- small enough to fit into memory
- too big to load completely on startup (e.g. 10GB)
- read performance is very important
- write performance is less important
- tree structure
- information held in tree nodes is stored in files
- for faster performance, the files are loaded only the first time they are used and then cached
- lazy initialization for faster DB startup

As the DB will access this node information very often (on the order of several thousand times a second) and as I don't write very often, I'd like to use some kind of double-checked locking pattern.

I know there are many questions about the double-checked locking pattern here, but there seem to be so many different opinions that I don't know what's best for my case. What would you do with my setup?

Here's an example:

- a tree with 1 million nodes
- every node stores a list of key-value pairs (stored in a file for persistence; file size on the order of 10kB)
- when a node is accessed for the first time, the list is loaded and stored in a map (something like std::map)
- the next time this node is accessed, I don't have to load the file again; I just get it from the map
- the only problem: two threads may access the node for the first time simultaneously and both want to write to the cache map. This is very unlikely to happen, but it is not impossible. That's where I need thread safety, and it should not cost much time, as I usually don't need it (especially once the whole DB is in memory).

Solution

About double-checked locking:

#include <mutex>

class Foo
{
  Resource * resource;
  std::mutex mutex;
public:
  Foo() : resource(nullptr) { }

  Resource & GetResource()
  {
    if(resource == nullptr)  // unsynchronized read of a plain pointer
    {
      std::lock_guard<std::mutex> lock(mutex);
      if(resource == nullptr)
        resource = new Resource();
    }
    return *resource;
  }
};

This is not thread-safe, because the first check reads the resource pointer without any synchronization. There is a chance that the pointer is assigned a non-null value before the Resource object it points to has been fully initialized, so another thread can see a non-null pointer and use an object that is not constructed yet.
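To make the problem concrete, here is a rough sketch of one order in which the statement `resource = new Resource();` may be carried out. The helper name PublishBeforeConstruct and the Resource definition are made up for illustration; a real compiler would do this internally, not through such a function.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the Resource type used in the answer.
struct Resource {
    int value;
    Resource() : value(42) {}
};

// Spells out `resource = new Resource();` as three steps, with the
// pointer assignment moved in front of the constructor call.
Resource* PublishBeforeConstruct(Resource*& resource) {
    void* raw = ::operator new(sizeof(Resource));  // step 1: allocate memory
    resource = static_cast<Resource*>(raw);        // step 3 done early: publish pointer
    // At this point the pointer is non-null although no Resource exists yet.
    // A second thread running the unlocked `resource == nullptr` check would
    // skip the lock and use the unconstructed object.
    assert(resource != nullptr);
    return new (raw) Resource();                   // step 2: construct in place
}
```

Run on a single thread the result is the same as the ordinary statement, which is why this kind of bug rarely shows up in testing.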

But with the atomics feature of C++11, a working double-checked locking mechanism is possible:

#include <atomic>
#include <mutex>

class Foo
{
  Resource * resource;
  std::atomic<bool> isResourceNull;
  std::mutex mutex;
public:
  Foo() : resource(nullptr), isResourceNull(true) { }

  Resource & GetResource()
  {
    if(isResourceNull.load())
    {
      std::lock_guard<std::mutex> lock(mutex);
      if(isResourceNull.load())
      {
        resource = new Resource();
        isResourceNull.store(false);  // flag is cleared only after construction
      }
    }
    return *resource;
  }
};
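If you would rather keep a single variable, a variant of the same idea (a sketch, not part of the original answer) makes the pointer itself a std::atomic<Resource*> and uses acquire/release ordering in place of the default sequentially consistent one. The Resource struct here is only a placeholder.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the Resource type used in the answer.
struct Resource {
    int value = 42;
};

class Foo {
    std::atomic<Resource*> resource{nullptr};
    std::mutex mutex;

public:
    Foo() = default;
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
    ~Foo() { delete resource.load(); }

    Resource& GetResource() {
        // Fast path: the acquire load pairs with the release store below,
        // so a non-null pointer implies a fully constructed Resource.
        Resource* p = resource.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(mutex);
            // Second check under the lock; the mutex already orders
            // accesses between threads that reach this point.
            p = resource.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Resource();
                resource.store(p, std::memory_order_release);
            }
        }
        assert(p != nullptr);
        return *p;
    }
};
```

Once the resource exists, every call takes only the fast path: one atomic load and no lock.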

EDIT: Without atomics

#include <windows.h>  // MemoryBarrier() (winnt.h is pulled in by windows.h)
#include <mutex>

class Foo
{
  Resource * volatile resource;  // the pointer itself is volatile, not the Resource
  std::mutex mutex;
public:
  Foo() : resource(nullptr) { }

  Resource & GetResource()
  {
    if(resource == nullptr)
    {
      std::lock_guard<std::mutex> lock(mutex);
      if(resource == nullptr)
      {
        Resource * dummy = new Resource();
        MemoryBarrier(); // To keep the code order
        resource = dummy;  // pointer assignment
      }
    }
    return *resource;
  }
};

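MemoryBarrier() ensures that the object dummy points to is fully constructed before dummy is assigned to resource. According to this link, pointer assignments are atomic on x86 and x64 systems, and volatile ensures that the value of resource will not be cached. Keep in mind that this relies on Windows and x86/x64 behavior; standard C++ does not give volatile any inter-thread ordering guarantees.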

OTHER TIPS

Are you asking how to make reading the DB or reading the Nodes thread safe?

If you're trying to do the latter and you're not writing very often, then why not make your nodes immutable, period? If you need to write something, copy the data from the existing node, modify the copy, and create another node from it, which you can then put in your database.
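A minimal sketch of that copy-on-write idea, assuming a node's data is a std::map of strings. NodeSlot, Read and Write are hypothetical names, and the short mutex section could just as well be whatever lock you already use.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical node payload: the key-value list from the question.
using KeyValues = std::map<std::string, std::string>;

class NodeSlot {
    // Readers only ever see a const snapshot; it is never modified in place.
    std::shared_ptr<const KeyValues> current{std::make_shared<KeyValues>()};
    mutable std::mutex mutex;

public:
    // Returns the current snapshot. The lock is held only while the
    // shared_ptr is copied; the caller then reads without any lock.
    std::shared_ptr<const KeyValues> Read() const {
        std::lock_guard<std::mutex> lock(mutex);
        assert(current);  // a snapshot always exists
        return current;
    }

    // Copies the existing data, modifies the copy and swaps it in.
    // Snapshots handed out earlier stay valid and unchanged.
    void Write(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex);
        auto copy = std::make_shared<KeyValues>(*current);
        (*copy)[key] = value;
        current = std::move(copy);
    }
};
```

The price is one full copy of the node's data per write, which fits a setup where writes are rare and the per-node data is small.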

Licensed under: CC-BY-SA with attribution
