Question

There are several questions on Stack Overflow along the lines of "why can't I initialise static data members in-class in C++". Most answers quote from the standard telling you what you can do; those that attempt to answer why usually point to a link (now seemingly unavailable) [EDIT: actually it is available, see below] on Stroustrup's site where he states that allowing in-class initialisation of static members would violate the One Definition Rule (ODR).

However, these answers seem overly simplistic. The compiler is perfectly able to sort out ODR problems when it wants to. For example, consider the following in a C++ header:

#include <string>

struct SimpleExample
{
    static const std::string str;
};

// This must appear in exactly one TU, not a header, or else violate the ODR
// const std::string SimpleExample::str = "String 1";

template <int I>
struct TemplateExample
{
    static const std::string str;
};

// But this is fine in a header
template <int I>
const std::string TemplateExample<I>::str = "String 2";

If I instantiate TemplateExample<0> in multiple translation units, compiler/linker magic kicks in and I get exactly one copy of TemplateExample<0>::str in the final executable.

So my question is, given that it's obviously possible for the compiler to solve the ODR problem for static members of template classes, why can it not do this for non-template classes too?

EDIT: The Stroustrup FAQ response is available here. The relevant sentence is:

However, to avoid complicated linker rules, C++ requires that every object has a unique definition. That rule would be broken if C++ allowed in-class definition of entities that needed to be stored in memory as objects

It seems however that those "complicated linker rules" do exist and are used in the template case, so why not in the simple case too?


Solution

The following example code demonstrates the difference between a strong and a weak linker reference. Afterwards I will try to explain why switching between the two can alter the executable that the linker produces.

template.h

class CLASS
{
public:
    static const int global;
};
template <class T>
class TEMPLATE
{
public:
    static const int global;
};

void part1();
void part2();

file1.cpp

#include <iostream>
#include "template.h"
const int CLASS::global = 11;
template <class T>
const int TEMPLATE<T>::global = 21;
void part1()
{
    std::cout << TEMPLATE<int>::global << std::endl;
    std::cout << CLASS::global << std::endl;
}

file2.cpp

#include <iostream>
#include "template.h"
const int CLASS::global = 21;
template <class T>
const int TEMPLATE<T>::global = 22;
void part2()
{
    std::cout << TEMPLATE<int>::global << std::endl;
    std::cout << CLASS::global << std::endl;
}

main.cpp

#include "template.h"
int main()
{
    part1();
    part2();
    return 0;
}

I accept this example is totally contrived, but hopefully it demonstrates why 'Changing strong to weak linker references is a breaking change'.

Will this build? No. Each file compiles on its own, but the link fails because there are 2 strong references to CLASS::global.

If you remove one of the strong references to CLASS::global, will it link? Yes.

What is the value of TEMPLATE<int>::global?

What is the value of CLASS::global?

The value of the weak reference is not well defined, because it depends on the link order, which makes it obscure at best and, depending on the linker, uncontrollable. (Formally, two differing definitions of the same template member already violate the ODR; the linker simply does not diagnose it.) This is probably acceptable because it is uncommon to split a template across files: the prototype and the implementation are required together for compilation to work.

However, static data members of ordinary classes were historically strong references and could not be defined within the class declaration. It was therefore the rule, and is still at least common practice, to put the full definition, with its strong reference, in the implementation file.
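For comparison with the historical rule, C++17 added inline variables, which let a static data member of an ordinary class be defined inside the class. The sketch below assumes a C++17 compiler; the names InlineExample and inline_example_value are made up for illustration and do not come from the posts above.

```cpp
// Sketch only: requires C++17 or later (e.g. -std=c++17).
#include <string>

struct InlineExample  // hypothetical name
{
    // 'inline' permits this definition in a header that many translation
    // units include; the program still ends up with a single object.
    static inline const std::string str = "String 3";
};

// Hypothetical helper so other code can observe the member.
const std::string& inline_example_value()
{
    return InlineExample::str;
}
```

Older code bases, and code that must build as C++14 or earlier, still need the out-of-class definition in one implementation file.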

In fact, because the linker reports errors for ODR violations involving strong references, it was common practice to have multiple object files (compilation units to be linked) that were linked conditionally to alter behaviour for different hardware and software combinations, and sometimes for optimization benefits. If you made a mistake in your link parameters, you would get an error saying either that you had forgotten to select a specialization (no strong reference) or that you had selected multiple specializations (multiple strong references).
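A rough sketch of that link-time selection, shown as one file for brevity. The class Platform, its member word_bits, and the file names in the comments are all hypothetical; in a real project each commented section would be its own file, and the build would link exactly one variant.

```cpp
// platform.h (hypothetical): declaration only, shared by every variant.
struct Platform
{
    static const int word_bits;
};

// platform_16bit.cpp (hypothetical): one of the variant files.
const int Platform::word_bits = 16;

// platform_32bit.cpp (hypothetical) would instead contain:
//     const int Platform::word_bits = 32;
// Linking neither variant leaves the symbol undefined; linking both
// produces a multiple-definition error. Either mistake stops the build.

// Code elsewhere uses the value without knowing which variant was linked.
int bytes_per_word()
{
    return Platform::word_bits / 8;
}
```

With a weak reference the "linked both" mistake would pass silently, which is the breaking change described above.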

You need to remember that when C++ was introduced, 8-bit, 16-bit and 32-bit processors were all still valid targets, AMD and Intel had similar but different instruction sets, and hardware vendors preferred closed private interfaces to open standards. The build cycle could take hours, days, or even a week.

OTHER TIPS

The C++ Build structure used to be quite simple.

The compiler built object files which normally contained one class implementation. The linker then joined all of the object files together into the executable file.

The One Definition Rule refers to the requirement that each variable (and function) used in the executable is defined in only one object file created by the compiler. All other object files simply hold external references to that variable or function.
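As a minimal illustration of that split between references and the single definition (the names table_size and scaled are invented for this sketch, and the two halves would normally sit in a header and one implementation file):

```cpp
// Declarations: what every other object file sees (normally in a header).
extern const int table_size;  // hypothetical variable
int scaled(int value);        // hypothetical function

// Definitions: these appear in exactly one object file.
const int table_size = 8;

int scaled(int value)
{
    return value * table_size;
}
```

Repeating the two definitions in a second object file would give the linker two strong symbols for each name, and it would reject the link.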

Templates were a very late addition to C++, and require that all the template implementation details are available during each compilation of every object file, so that the compiler can do all of its optimizations. This involves lots of inlining and even more name mangling.

I hope this answers your question, because it is the reason for the ODR and why it doesn't affect templates: the linker has almost nothing to do with templates, which are all managed by the compiler. The exception is when you use explicit template instantiation to push an entire template expansion into one object file, so that it can be used from other object files that see only the template's declarations.
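That exception can be sketched roughly as follows, assuming C++11 or later for extern template. Counter, start, next and first_two_sum are hypothetical names, and the sections marked in comments would normally be separate files.

```cpp
// counter.h (hypothetical): other files see only these declarations.
template <class T>
struct Counter
{
    static const int start;
    static int next(int current);
};

// Explicit instantiation declaration: files that include the header
// do not instantiate Counter<int> themselves.
extern template struct Counter<int>;

// counter.cpp (hypothetical): the member definitions and the explicit
// instantiation definition live in exactly one object file.
template <class T>
const int Counter<T>::start = 100;

template <class T>
int Counter<T>::next(int current)
{
    return current + 1;
}

template struct Counter<int>;  // explicit instantiation definition

// User code: relies on the instantiation emitted above.
int first_two_sum()
{
    return Counter<int>::start + Counter<int>::next(Counter<int>::start);
}
```

In this arrangement Counter<int> behaves much like an ordinary class: its members are emitted in one object file and every other file just refers to them.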

Edit:

Back in the olden days, linkers frequently linked object files created from different languages. It was common to link ASM and C, and even after C++ arrived some of that code was still used, which absolutely necessitates the ODR. Just because your project links only C++ files doesn't mean that is all a linker can do, so it won't be changed merely because most projects are now solely C++. Even now, many device drivers use the linker according to its original intention.

Answer:

It seems however that those "complicated linker rules" do exist and are used in the template case, so why not in the simple case too?

The compiler manages the template cases, and just creates weak linker references.

The linker has nothing to do with templates; they are patterns the compiler uses to generate the code it passes to the linker.
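A single-file sketch of the template case, reusing the TEMPLATE class from the accepted answer. The two accessor functions are hypothetical stand-ins for code that would live in file1.cpp and file2.cpp. One file cannot show the linker merging copies, but the observable guarantee is the same: every user sees one object.

```cpp
// template.h: the declaration and the member definition both sit in the
// header, so every including file can instantiate TEMPLATE<int>::global.
template <class T>
struct TEMPLATE
{
    static const int global;
};

template <class T>
const int TEMPLATE<T>::global = 21;

// Stand-ins for code that would normally live in file1.cpp and file2.cpp.
const int* address_seen_by_part1() { return &TEMPLATE<int>::global; }
const int* address_seen_by_part2() { return &TEMPLATE<int>::global; }
```

When the two functions really are in separate object files, each file carries its own mergeable copy of the member and the linker keeps one, so the two addresses still compare equal.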

So the linker rules are not affected by templates, but they are still important because the ODR is a requirement of ASM and C, which the linker still links and which people other than you still actually use.

Licensed under: CC-BY-SA with attribution