Question

This post may seem overly long for just the short question at the end of it. But I also need to describe a design pattern I just came up with. Maybe it's commonly used, but I've never seen it (or maybe it just doesn't work :).

First, here's some code that (to my understanding) has undefined behavior due to the "static initialization order fiasco". The initialization of Spanish::s_englishToSpanish depends on English::s_numberToStr. Both are statically initialized, in different files, so the relative order of those initializations is unspecified:

File: English.h

#pragma once

#include <vector>
#include <string>

using namespace std;

struct English {
    static vector<string>* s_numberToStr;
    string m_str;

    explicit English(int number)
    {
        m_str = (*s_numberToStr)[number];
    }
};

File: English.cpp

#include "English.h"

vector<string>* English::s_numberToStr = new vector<string>( /*split*/
[]() -> vector<string>
{
    vector<string> numberToStr;
    numberToStr.push_back("zero");
    numberToStr.push_back("one");
    numberToStr.push_back("two");
    return numberToStr;
}());

File: Spanish.h

#pragma once

#include <map>
#include <string>

#include "English.h"

using namespace std;

typedef map<string, string> MapType;

struct Spanish {
    static MapType* s_englishToSpanish;
    string m_str;

    explicit Spanish(const English& english)
    {
        m_str = (*s_englishToSpanish)[english.m_str];
    }
};

File: Spanish.cpp

#include "Spanish.h"

MapType* Spanish::s_englishToSpanish = new MapType( /*split*/
[]() -> MapType
{
    MapType englishToSpanish;
    englishToSpanish[ English(0).m_str ] = "cero";
    englishToSpanish[ English(1).m_str ] = "uno";
    englishToSpanish[ English(2).m_str ] = "dos";
    return englishToSpanish;
}());

File: StaticFiasco.cpp

#include <stdio.h>
#include <tchar.h>
#include <conio.h>

#include "Spanish.h"

int _tmain(int argc, _TCHAR* argv[])
{
    _cprintf( "%s", Spanish(English(1)).m_str.c_str() ); // may print "uno" or crash

    _getch();
    return 0;
}

To solve the static initialization order problem, we use the construct-on-first-use idiom, and make those static initializations function-local like so:

File: English.h

#pragma once

#include <vector>
#include <string>

using namespace std;

struct English {
    string m_str;

    explicit English(int number)
    {
        static vector<string>* numberToStr = new vector<string>( /*split*/
        []() -> vector<string>
        {
            vector<string> numberToStr_;
            numberToStr_.push_back("zero");
            numberToStr_.push_back("one");
            numberToStr_.push_back("two");
            return numberToStr_;
        }());

        m_str = (*numberToStr)[number];
    }
};

File: Spanish.h

#pragma once

#include <map>
#include <string>

#include "English.h"

using namespace std;

struct Spanish {
    string m_str;

    explicit Spanish(const English& english)
    {
        typedef map<string, string> MapT;

        static MapT* englishToSpanish = new MapT( /*split*/
        []() -> MapT
        {
            MapT englishToSpanish_;
            englishToSpanish_[ English(0).m_str ] = "cero";
            englishToSpanish_[ English(1).m_str ] = "uno";
            englishToSpanish_[ English(2).m_str ] = "dos";
            return englishToSpanish_;
        }());

        m_str = (*englishToSpanish)[english.m_str];
    }
};

But now we have another problem. Because of the function-local static data, neither of those classes is thread-safe (at least on compilers that do not implement C++11's thread-safe initialization of local statics). To solve this, we add to each class a static member variable and a function that initializes it. Inside that function we force the initialization of all the function-local static data by calling, once, every function that contains such data. In effect we initialize everything at program startup while still controlling the order of initialization. So now our classes should be thread-safe:

File: English.h

#pragma once

#include <vector>
#include <string>

using namespace std;

struct English {
    static bool s_areStaticsInitialized;
    string m_str;

    explicit English(int number)
    {
        static vector<string>* numberToStr = new vector<string>( /*split*/
        []() -> vector<string>
        {
            vector<string> numberToStr_;
            numberToStr_.push_back("zero");
            numberToStr_.push_back("one");
            numberToStr_.push_back("two");
            return numberToStr_;
        }());

        m_str = (*numberToStr)[number];
    }

    static bool initializeStatics()
    {
        // Call every member function that has local static data in it:
        English english(0); // Could the compiler ignore this line?
        return true;
    }
};
bool English::s_areStaticsInitialized = initializeStatics();

File: Spanish.h

#pragma once

#include <map>
#include <string>

#include "English.h"

using namespace std;

struct Spanish {
    static bool s_areStaticsInitialized;
    string m_str;

    explicit Spanish(const English& english)
    {
        typedef map<string, string> MapT;

        static MapT* englishToSpanish = new MapT( /*split*/
        []() -> MapT
        {
            MapT englishToSpanish_;
            englishToSpanish_[ English(0).m_str ] = "cero";
            englishToSpanish_[ English(1).m_str ] = "uno";
            englishToSpanish_[ English(2).m_str ] = "dos";
            return englishToSpanish_;
        }());

        m_str = (*englishToSpanish)[english.m_str];
    }

    static bool initializeStatics()
    {
        // Call every member function that has local static data in it:
        Spanish spanish( English(0) ); // Could the compiler ignore this line?
        return true;
    }
};

bool Spanish::s_areStaticsInitialized = initializeStatics();

And here's the question: Is it possible that some compiler might optimize away those calls to functions (constructors in this case) which have local static data? So the question is what exactly amounts to "having side-effects", which to my understanding means the compiler isn't allowed to optimize it away. Is having function-local static data enough to make the compiler think the function call can't be ignored?

Solution

Section 1.9 "Program execution" [intro.execution] of the C++11 standard says that

1 The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. ... conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.
...

5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.
...

8 The least requirements on a conforming implementation are:
— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.
— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.
These collectively are referred to as the observable behavior of the program.
...

12 Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

Also, in 3.7.2 "Automatic storage duration" [basic.stc.auto] it is said that

3 If a variable with automatic storage duration has initialization or a destructor with side effects, it shall not be destroyed before the end of its block, nor shall it be eliminated as an optimization even if it appears to be unused, except that a class object or its copy/move may be eliminated as specified in 12.8.

Paragraph 12.8/31 describes copy elision, which I believe is irrelevant here.

So the question is whether the initialization of your local variables has side effects that prevent it from being optimized away. The constructor initializes a static variable with the address of a dynamically allocated object, which modifies an object, so I think it has sufficient side effects. You can also add an operation on a volatile object there, which introduces observable behavior that cannot be eliminated.
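The volatile suggestion could be sketched roughly as follows. This is only an illustration: g_initSink is a hypothetical variable that does not appear in the question, and the table is built with an initializer list instead of the lambda to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sink (not in the original code). Accesses to volatile
// objects are observable behavior, so this write may not be removed.
volatile int g_initSink = 0;

struct English {
    std::string m_str;

    explicit English(int number)
    {
        // Function-local static: initialized the first time control passes here.
        static std::vector<std::string>* numberToStr =
            new std::vector<std::string>{"zero", "one", "two"};

        assert(number >= 0 &&
               static_cast<std::size_t>(number) < numberToStr->size());
        m_str = (*numberToStr)[number];
    }

    static bool initializeStatics()
    {
        English english(0); // forces initialization of the local static
        // The value written depends on the constructed object, so the
        // construction above feeds an observable volatile write.
        g_initSink = static_cast<int>(english.m_str.size());
        return true;
    }
};
```

Calling English::initializeStatics() once leaves g_initSink holding the length of "zero". Whether the extra write is needed at all is debatable, since the static initialization already modifies an object.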

OTHER TIPS

Ok, in a nutshell:

  1. I cannot see why the static members of the class need to be public. They are an implementation detail.

  2. Rather than making them private, make them file-scope variables in the compilation unit that implements your classes.

  3. Use boost::call_once to perform the static initialisation.

With initialisation on first use, the ordering is relatively easy to enforce. It is the destruction that is far harder to perform in order. Note, however, that the function used in call_once must not throw an exception. If it might fail, leave some kind of failed state behind and check for that after the call.
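A minimal sketch of the three points above, with one substitution stated plainly: it uses std::call_once from C++11 rather than boost::call_once. The names g_tableOnce, g_tableFailed, loadTable and englishToSpanish() are hypothetical. Note that std::call_once does let an exception propagate and leaves the flag unset, but the sketch follows the advice above and records a failed state instead.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

typedef std::map<std::string, std::string> MapType;

namespace {
// File-scope implementation details, invisible to other translation units.
std::once_flag g_tableOnce;
const MapType* g_englishToSpanish = nullptr;
bool g_tableFailed = false;

// Never throws: a failure is recorded in g_tableFailed instead.
void loadTable() noexcept
{
    try {
        std::unique_ptr<MapType> table(new MapType);
        (*table)["zero"] = "cero";
        (*table)["one"] = "uno";
        (*table)["two"] = "dos";
        g_englishToSpanish = table.release(); // intentionally never deleted
    } catch (...) {
        g_tableFailed = true;
    }
}
} // namespace

// Hypothetical accessor: returns nullptr if loading failed.
const MapType* englishToSpanish()
{
    std::call_once(g_tableOnce, loadTable); // runs loadTable exactly once
    assert(g_tableFailed || g_englishToSpanish != nullptr);
    return g_englishToSpanish;
}
```

Callers check the returned pointer for nullptr before using it. The table is deliberately leaked, which sidesteps the destruction-order problem mentioned above.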

(I will assume that in your real example, your load is not something you hard-code in but more likely you load some kind of dynamic table, so you can't just create an in-memory array).

Why don't you just hide English::s_numberToStr behind a public static function and skip the constructor syntax entirely? Use the double-checked locking pattern (DCLP) to ensure thread safety.
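One possible shape for that static function is sketched below with C++11 atomics. The names g_numberToStr and g_numberToStrMutex are hypothetical, and the sketch assumes both are constant-initialized (before any dynamic initialization runs), which holds where std::atomic and std::mutex have constexpr constructors as the standard specifies.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

namespace {
// Assumed to be constant-initialized, so they are usable even while other
// translation units are still running their dynamic initializers.
std::atomic<const std::vector<std::string>*> g_numberToStr(nullptr);
std::mutex g_numberToStrMutex;
} // namespace

struct English {
    std::string m_str;

    explicit English(int number)
        : m_str(numberToStr().at(static_cast<std::size_t>(number))) {}

    // Double-checked locking: the mutex is taken only until the table exists.
    static const std::vector<std::string>& numberToStr()
    {
        const std::vector<std::string>* table =
            g_numberToStr.load(std::memory_order_acquire);   // first check
        if (table == nullptr) {
            std::lock_guard<std::mutex> lock(g_numberToStrMutex);
            table = g_numberToStr.load(std::memory_order_relaxed); // second check
            if (table == nullptr) {
                table = new std::vector<std::string>{"zero", "one", "two"};
                g_numberToStr.store(table, std::memory_order_release);
            }
        }
        assert(table != nullptr);
        return *table;
    }
};
```

On a compiler that implements C++11's thread-safe initialization of function-local statics, a plain local static inside numberToStr() would likely achieve the same effect with less code.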

I strongly recommend avoiding class static variables whose initialization involves non-trivial side effects. As a general design pattern, they tend to cause more problems than they solve. Whatever performance problems you're concerned about here need justification, because I doubt they are measurable under real-world circumstances.

Maybe you need to do extra work to control the initialization order, for example:

class staticObjects
{
    private:
    // Members are initialized in declaration order.
    vector<string>* s_numberToStr;
    MapType* s_englishToSpanish;
};

// new returns a pointer, so the variable must be a pointer too.
static staticObjects* objects = new staticObjects();

Then define some interface to retrieve them.
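Such an interface might look like the sketch below. It is an assumption about what this answer intends: the class is renamed StaticObjects, it holds the tables by value rather than by pointer, and the accessors and the toSpanish() helper are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class StaticObjects {
public:
    StaticObjects() : m_numberToStr{"zero", "one", "two"}
    {
        // m_numberToStr is declared first, so it is fully built here.
        m_englishToSpanish[m_numberToStr[0]] = "cero";
        m_englishToSpanish[m_numberToStr[1]] = "uno";
        m_englishToSpanish[m_numberToStr[2]] = "dos";
    }

    const std::vector<std::string>& numberToStr() const { return m_numberToStr; }
    const std::map<std::string, std::string>& englishToSpanish() const
    {
        return m_englishToSpanish;
    }

private:
    std::vector<std::string> m_numberToStr;
    std::map<std::string, std::string> m_englishToSpanish;
};

// Single access point: constructed on first use and never destroyed.
const StaticObjects& staticObjects()
{
    static const StaticObjects* objects = new StaticObjects();
    return *objects;
}

// Hypothetical helper that goes from a number to its Spanish name.
std::string toSpanish(int number)
{
    const StaticObjects& s = staticObjects();
    assert(number >= 0 &&
           static_cast<std::size_t>(number) < s.numberToStr().size());
    return s.englishToSpanish().at(s.numberToStr()[number]);
}
```

Because both tables live in one object, their relative initialization order is fixed by the member declaration order rather than by the link order of separate files.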

Licensed under: CC-BY-SA with attribution