Question

This question made me question a practice I had been following for years.

For thread-safe initialization of function-local static const objects I protect the actual construction of the object, but not the initialization of the function-local reference referring to it. Something like this:

namespace {
  const some_type& create_const_thingy()
  {
     lock my_lock(some_mutex);
     static const some_type the_const_thingy;
     return the_const_thingy;
  }
}

void use_const_thingy()
{
  static const some_type& the_const_thingy = create_const_thingy();

  // use the_const_thingy

}

The idea is that locking takes time, and if the reference is overwritten by several threads, it won't matter.

I'd be interested if this is

  1. safe enough in practice?
  2. safe according to The Rules? (I know, the current standard doesn't even know what "concurrency" is, but what about trampling over an already initialized reference? And do other standards, like POSIX, have something to say that's relevant to this?)

The reason I want to know this is that I want to know whether I can leave the code as it is or whether I need to go back and fix this.


For the inquiring minds:

Many such function-local static const objects I used are maps which are initialized from const arrays upon first use and used for lookup. For example, I have a few XML parsers where tag name strings are mapped to enum values, so I could later switch over the tags' enum values.


Since I got some answers as to what to do instead, but haven't got an answer to my actual questions (see 1. and 2. above), I'll start a bounty on this. Again:
I am not interested in what I could do instead, I do really want to know about this.

Was it helpful?

Solution

This is my second attempt at an answer. I'll only answer the first of your questions:

  1. safe enough in practice?

No. As you're stating yourself you're only ensuring that the object creation is protected, not the initialization of the reference to the object.

In absence of a C++98 memory model and no explicit statements from the compiler vendor, there are no guarantees that writing to the memory representing the actual reference and the writing to the memory that holds the value of the initialization flag (if that is how it is implemented) for the reference are seen in the same order from multiple threads.

As you also say, overwriting the reference several times with the same value should make no semantic difference (even in the presence of word tearing, which is generally unlikely and perhaps even impossible on your processor architecture) but there's one case where it matters: When more than one thread races to call the function for the first time during program execution. In this case it is possible for one or more of these threads to see the initialization flag being set before the actual reference is initialized.

You have a latent bug in your program and you need to fix it. As for optimizations I'm sure there are many besides using the double-checked locking pattern.

OTHER TIPS

Here is my take (if really you can't initialize it before threads are launched):

I've seen (and used) something like this to protect static initialization, using boost::once

#include <boost/thread/once.hpp>

boost::once_flag flag;

// get thingy
const Thingy & get()
{
    static Thingy thingy;

    return thingy;
}

// create function
void create()
{
     get();
}

void use()
{
    // Ensure only one thread get to create first before all other
    boost::call_once( &create, flag );

    // get a constructed thingy
    const Thingy & thingy = get(); 

    // use it
    thingy.etc..()          
}

In my understanding, this way all threads wait on boost::call_once except one that will create the static variable. It will be created only once and then will never be called again. And then you have no lock any more.

So, the relevant part of the spec is 6.7/4:

An implementation is permitted to perform early initialization of other local objects with static storage duration under the same conditions that an implementation is permitted to statically initialize an object with static storage duration in namespace scope (3.6.2). Otherwise such an object is initialized the first time control passes through its declaration; such an object is considered initialized upon the completion of its initialization.

Assuming the second part holds (object is initialized the first time control passes through its declaration), your code can be considered thread safe.

Reading through 3.6.2, it appears the early initialization permitted is converting dynamic-initialization to static-initialization. Since static-initialization must happen before any dynamic-initialization and since I can't think of any way to create a thread until you get to dynamic-initialization, such an early initialization would also guarantee the constructor would get called a single time.

Update

So, in respect to calling the some_type constructor for the_const_thingy, your code is correct according to the rules.

This leaves the issue about overwriting the reference which is definitely not covered by the spec. That said, if you are willing to assume that references are implemented via pointers (which I believe is the most common way to do that), then all you are going to do is overwrite a pointer with the value that it already holds. So my take is that this should be safe in practice.

I am not standardista...

But for the use you mention, why not simply initialize them before any thread is created ? Many Singletons issues are caused because people use the idiomatic "single thread" lazy initialization while they could simply instantiate the value when the library is loaded (like a typical global).

The lazy fashion only makes sense if you use this value from another 'global'.

On the other hand, another method I've seen was to use some kind of coordination:

  • 'Singleton' to be register their initialization method in a 'GlobalInitializer' object during library load time
  • 'GlobalInitializer' to be called in 'main' before any thread is launched

though I may not be describing it accurately.

In brief, I think that:

  • The object initialization is thread-safe, assuming that "some_mutex" is fully constructed when entering "create_const_thingy".

  • The initialization of the object reference inside "use_const_thingy" is not guaranteed to be thread-safe; it might (as you say) be subject of getting initialized multiple times (which is less of a problem), but it might also be subject to word tearing which could result in undefined behaviour.

[I assume that the C++ reference is implemented as a reference to the actual object using a pointer value, which could in theory be read when partially written to].

So, to try and answer your question:

  1. Safe enough in practice: Very likely, but ultimately depends on pointer size, processor architecture and code generated by the compiler. The crux here is likely to be whether a pointer-sized write/read is atomic or not.

  2. Safe according to the rule: Well, there are no such rules in C++98, sorry (but you knew that already).


Update: After posting this answer I realized that it only focuses on a small, esoteric part of the real problem, and because of this decided to post another answer instead of editing the contents. I'm leaving the contents "as-is" as it has some relevance to the question (and also to humble myself, reminding me to think through things a bit more before answering).

I've programmed enough interprocess sockets to have nightmares. In order to make anything thread-safe on a CPU with DDR RAM, you have to cache-line-align the data structure and pack up all of your global variables contiguously into as few cache lines as possible.

The problem with unaligned interprocess data and loosely packed globals are that it causes aliasing from cache misses. In CPUs that use DDR RAM, there are a (usually) a bunch of 64-byte cache lines. When you load a cache line, the DDR RAM will automatically load up a bunch more cache lines, but the first cache line is always the hottest. What happens with interrupts that occur at high speeds is that the cache page will act as a low-pass filter, just like in analog signals, and will filter out the interrupt data leading to COMPLETELY baffling bugs if you're not aware of whats going on. That same thing goes for global variables that are not packed up tightly; if it takes up multiple cache lines it will get out of sync unless you to take a snapshot of the critical interprocess variables and pass them through on the stack and the registers to ensure the data is synced up right.

The .bss section (i.e. where the global variables are stored, will get initialized to all zeros, but the compiler will not cache-line-align the data for you, you will have to do that yourself, which may also a good place to use the C++ Construct in Place. To learn the math behind fastest way to align pointers read this article; I'm trying to figure out if I came up with that trick. Here is what the code will look like:

inline char* AlignCacheLine (char* buffer) {
  uintptr_t offset = ((~reinterpret_cast<uintptr_t> (buffer)) + 1) & (63);
  return buffer + offset;
}

char SomeTypeInit (char* buffer, int param_1, int param_2, int param_3) {
  SomeType type = SomeType<AlignCacheLine (buffer)> (1, 2, 3);
  return 0xff;
}

const SomeType* create_const_thingy () {
  static char interprocess_socket[sizeof (SomeType) + 63],
              dead_byte = SomeTypeInit (interprocess_socket, 1, 2, 3);
  return reinterpret_cast<SomeType*> (AlignCacheLine (interprocess_socket));
}

In my experience, you will have to use a pointer, not a reference.

Just call the function before you start creating threads, thus guaranteeing the reference and the object. Alternatively, don't use such a truly terrible design pattern. I mean, why on earth have a static reference to a static object? Why even have static objects? There's no benefit to this. Singletons are a terrible idea.

This seems to be the easiest/cleanest approach I can think of without needing all of the mutex shananigans:

static My_object My_object_instance()
{
    static My_object  object;
    return object;
}

// Ensures that the instance is created before main starts and creates any threads
// thereby guaranteeing serialization of static instance creation.
__attribute__((constructor))
void construct_my_object()
{
    My_object_instance();
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top