Question

I've recently encountered what I think is a false-sharing problem in my application, and I've looked up Sutter's article on how to align my data to cache lines. He suggests the following C++ code:

// C++ (using C++0x alignment syntax)
template<typename T>
struct cache_line_storage {
   [[ align(CACHE_LINE_SIZE) ]] T data;
   char pad[ CACHE_LINE_SIZE > sizeof(T)
        ? CACHE_LINE_SIZE - sizeof(T)
        : 1 ];
};

I can see how this would work when CACHE_LINE_SIZE > sizeof(T) is true: the struct cache_line_storage just ends up taking up one full cache line of memory. However, when sizeof(T) is larger than a single cache line, I would think that we should pad the data by CACHE_LINE_SIZE - sizeof(T) % CACHE_LINE_SIZE bytes, so that the resulting struct has a size that is an integral multiple of the cache line size. What is wrong with my understanding? Why does padding with 1 byte suffice?

Solution

You can't have arrays of size 0, so 1 is required to make it compile. However, the current draft version of the spec says that such padding is unnecessary; the compiler must pad up to the struct's alignment.

Note also that this code is ill-formed if CACHE_LINE_SIZE is smaller than alignof(T). To fix this, you should probably use [[align(CACHE_LINE_SIZE), align(T)]], which will ensure that a smaller alignment is never picked.
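
In the syntax that was eventually standardized in C++11, the attribute became the alignas specifier, and several alignas specifiers on one declaration combine to the strictest of them. Here is a minimal sketch of the idea, assuming a 64-byte cache line; the CACHE_LINE_SIZE value and the Wide type are illustrative assumptions, not part of the original code:

```cpp
#include <cstddef>

// Assumed value for illustration; the real line size is platform-specific.
constexpr std::size_t CACHE_LINE_SIZE = 64;

template <typename T>
struct cache_line_storage {
    // With several alignas specifiers, the strictest one applies, so the
    // requested alignment is never weaker than alignof(T).
    alignas(CACHE_LINE_SIZE) alignas(T) T data;
    // No pad member: the compiler rounds sizeof up to a multiple of the alignment.
};

// Hypothetical over-aligned type, used only to exercise alignof(T) > CACHE_LINE_SIZE.
struct alignas(128) Wide { char bytes[128]; };

static_assert(alignof(cache_line_storage<int>) == 64, "aligned to the cache line");
static_assert(sizeof(cache_line_storage<int>) == 64, "padded to one full line");
static_assert(alignof(cache_line_storage<Wide>) == 128, "stricter alignment of T wins");
```

The static_asserts only hold for the assumed line size; with a different CACHE_LINE_SIZE the expected numbers change accordingly.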

OTHER TIPS

Imagine

#define CACHE_LINE_SIZE 32
sizeof(T) == 48

Now, consider how [[ align(CACHE_LINE_SIZE) ]] works, e.g.:

[[ align(32) ]] Foo foo;

When the aligned object is a struct member, as in cache_line_storage, this forces the size of the enclosing struct to be 32n for some n. In other words, align() will pad for you, if necessary, so that in an array of such structs every element is aligned as requested.

So, in our case, with sizeof(T) == 48, this means sizeof(cache_line_storage<T>) == 64.

So the alignment gives you the padding you were hoping for.
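
The 48-byte case can be checked at compile time. This is a sketch using the standardized alignas spelling instead of the draft [[ align(...) ]] attribute; Payload48 is a hypothetical stand-in for a T with sizeof(T) == 48:

```cpp
#include <cstddef>

constexpr std::size_t CACHE_LINE_SIZE = 32;  // value taken from the example above

template <typename T>
struct cache_line_storage {
    alignas(CACHE_LINE_SIZE) T data;  // standardized form of [[ align(CACHE_LINE_SIZE) ]]
    char pad[CACHE_LINE_SIZE > sizeof(T) ? CACHE_LINE_SIZE - sizeof(T) : 1];
};

// Hypothetical payload type with sizeof == 48.
struct Payload48 { char bytes[48]; };

static_assert(sizeof(Payload48) == 48, "payload is 48 bytes");
static_assert(alignof(cache_line_storage<Payload48>) == 32, "struct inherits the member alignment");
// 48 bytes of data + 1 byte of pad = 49, rounded up to the next multiple of 32.
static_assert(sizeof(cache_line_storage<Payload48>) == 64, "size is two full lines");
```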

However, there is one 'error' in the template. Consider this case:

#define CACHE_LINE_SIZE 32
sizeof(T) == 32

Here we end up with char pad[1];, which means sizeof(cache_line_storage<T>) == 64. Probably not what you want!

I think the template would need to be modified somewhat:

template <typename T, int padding>
struct pad_or_not
{
   T data;
   char pad[padding];
};

// specialize the 0 case so that no pad member is emitted
template <typename T>
struct pad_or_not<T, 0>
{
   T data;
};

template<typename T>
struct cache_line_storage {
   [[ align(CACHE_LINE_SIZE) ]] pad_or_not<T, (sizeof(T) > CACHE_LINE_SIZE ? 0 : CACHE_LINE_SIZE - sizeof(T) ) > data;
};

or something like that.
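
The sizeof(T) == 32 case, and the effect of simply leaving the pad member out, can be compared at compile time. This is only a sketch: padded_storage, unpadded_storage and Payload32 are hypothetical names, and alignas is used as the standardized replacement for the draft attribute.

```cpp
#include <cstddef>

constexpr std::size_t CACHE_LINE_SIZE = 32;  // value taken from the example above

// Original shape: a pad member is always present, at least 1 byte long.
template <typename T>
struct padded_storage {
    alignas(CACHE_LINE_SIZE) T data;
    char pad[CACHE_LINE_SIZE > sizeof(T) ? CACHE_LINE_SIZE - sizeof(T) : 1];
};

// Alternative shape: no pad member; the alignment alone rounds the size up.
template <typename T>
struct unpadded_storage {
    alignas(CACHE_LINE_SIZE) T data;
};

// Hypothetical payload type that fills exactly one line.
struct Payload32 { char bytes[32]; };

// 32 bytes of data + 1 byte of pad = 33, rounded up to 64: one line is wasted.
static_assert(sizeof(padded_storage<Payload32>) == 64, "pad[1] costs a full extra line");
static_assert(sizeof(unpadded_storage<Payload32>) == 32, "no pad: exactly one line");
static_assert(sizeof(unpadded_storage<char>) == 32, "small T is still rounded up");
```

If dropping the pad member is acceptable, the pad_or_not helper may not be needed at all, since the compiler already rounds the struct size up to a multiple of its alignment.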

"You can't have arrays of size 0, so 1 is required to make it compile" - GNU C does allow arrays dimensioned as zero. See also http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Zero-Length.html

Licensed under: CC-BY-SA with attribution