Updated
This is a complete rewrite. There was an error in the original post/my answer which made me benchmark the same allocator twice. Oops.
Well, I can see huge differences in performance. I have made the following test bed, which takes several precautions to ensure crucial stuff isn't completely optimized out. I then verified (with -O0 -fno-inline) that the allocator's `construct` and `destroy` calls are getting called the expected number of times (yes):
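    #include <vector>
    #include <memory>   // std::allocator
    #include <cstdlib>  // rand

    template<typename T>
    struct MyAllocator : public std::allocator<T> {
        typedef std::allocator<T> Alloc;
        MyAllocator() {}
        template<typename U> MyAllocator(const MyAllocator<U>&) {} // needed once rebind yields a different type
        //void destroy(typename Alloc::pointer p) {} // pre-C++11
        //void construct(typename Alloc::pointer p, typename Alloc::const_reference val) {} // pre-C++11
        // Deliberate no-ops: elements are never actually constructed or destroyed
        template< class U > void destroy(U* p) {}
        template< class U, class... Args > void construct(U* p, Args&&... args) {}
        // Rebind to MyAllocator<U>, not to the injected name MyAllocator (which means MyAllocator<T>)
        template<typename U> struct rebind { typedef MyAllocator<U> other; };
    };

    int main()
    {
        typedef char T;
    #ifdef OWN_ALLOCATOR
        std::vector<T, MyAllocator<T> > v;
    #else
        std::vector<T> v;
    #endif
        volatile unsigned long long x = 0; // volatile sink so the element read is not optimized away
        v.reserve(1000000); // or more. Make sure there is always enough allocated memory
        for(auto i=0ul; i < (1ul<<18); i++) {
            v.resize(1000000);
            x += v[rand()%v.size()];//._x;
            v.clear(); // or v.resize(0);
        }
    }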
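The call-count check mentioned above can be reproduced with a counting variant of the allocator. This is only a sketch of one way to do it, not the code that was used for the original verification: `CallCounts`, `CountingAllocator` and `verify_call_counts` are made-up names, and unlike `MyAllocator` this version really constructs and destroys the elements so that it is safe to run.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical global counters, good enough for a single-threaded check.
struct CallCounts {
    static std::size_t constructs;
    static std::size_t destroys;
};
std::size_t CallCounts::constructs = 0;
std::size_t CallCounts::destroys = 0;

// Hypothetical allocator: same shape as MyAllocator, but it counts the calls
// and still performs the real construction and destruction.
template <typename T>
struct CountingAllocator : public std::allocator<T> {
    CountingAllocator() {}
    template <typename U> CountingAllocator(const CountingAllocator<U>&) {}
    template <typename U> struct rebind { typedef CountingAllocator<U> other; };

    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        ++CallCounts::constructs;
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
    template <typename U>
    void destroy(U* p) {
        ++CallCounts::destroys;
        p->~U();
    }
};

// Runs `cycles` resize/clear rounds of `n` elements and asserts that each
// element was constructed and destroyed exactly once per round.
inline void verify_call_counts(std::size_t cycles, std::size_t n) {
    CallCounts::constructs = 0;
    CallCounts::destroys = 0;
    std::vector<char, CountingAllocator<char> > v;
    v.reserve(n); // no reallocation afterwards, so no extra constructs from moves
    for (std::size_t i = 0; i < cycles; ++i) {
        v.resize(n);
        v.clear();
    }
    assert(CallCounts::constructs == cycles * n);
    assert(CallCounts::destroys == cycles * n);
}
```

Calling `verify_call_counts(4, 1000)` should leave both counters at 4000. The counts rely on the container routing element construction and destruction through the allocator, which is what the no-op trick in `MyAllocator` depends on as well.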
The timing difference is marked:
    g++ -g -O3 -std=c++0x -I ~/custom/boost/ test.cpp -o test
    real 0m9.300s
    user 0m9.289s
    sys 0m0.000s

    g++ -g -O3 -std=c++0x -DOWN_ALLOCATOR -I ~/custom/boost/ test.cpp -o test
    real 0m0.004s
    user 0m0.000s
    sys 0m0.000s
I can only assume that what you are seeing is related to the standard library optimizing allocator operations for `char` (it being a POD type).
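To make the "POD type" point a bit more concrete, here is a small sketch that asks the type traits what the library is allowed to assume. It does not show what any particular standard library implementation actually does internally. `HasCtorAndVirtualDtor` and `can_bulk_handle` are illustrative names, not part of any library.

```cpp
#include <type_traits>

// Hypothetical stand-in for a type with a user-provided constructor
// and a virtual destructor.
struct HasCtorAndVirtualDtor {
    HasCtorAndVirtualDtor() : x(42) {}
    virtual ~HasCtorAndVirtualDtor() {}
    char x;
};

// Hypothetical helper: true when T has a trivial default constructor and a
// trivial destructor, so a library may fill or skip whole ranges instead of
// running per-element constructor and destructor calls.
template <typename T>
constexpr bool can_bulk_handle() {
    return std::is_trivially_default_constructible<T>::value
        && std::is_trivially_destructible<T>::value;
}

static_assert(can_bulk_handle<char>(),
              "char needs no per-element constructor or destructor code");
static_assert(!can_bulk_handle<HasCtorAndVirtualDtor>(),
              "user-provided ctor and virtual dtor must run for every element");
```

Both assertions are checked at compile time, so the snippet only needs to compile to confirm them.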
The timings get even farther apart when you use:

    struct NonTrivial
    {
        NonTrivial() { _x = 42; }
        virtual ~NonTrivial() {}
        char _x;
    };
    typedef NonTrivial T; // replaces `typedef char T;` in main
In this case, the default allocator takes in excess of 2 minutes (still running), whereas the 'dummy' MyAllocator spends ~0.006s. (Note that this invokes undefined behaviour, because it references elements that haven't been properly initialized.)
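If the goal is only to avoid the zero-fill for trivial types without the undefined behaviour, one commonly used alternative is an allocator adaptor that default-initializes elements instead of value-initializing them. The sketch below is not what was benchmarked above, and `DefaultInitAllocator`, `Tagged` and `default_init_demo` are names chosen here for illustration. Constructors of non-trivial types still run, so it will not reproduce the ~0.006s figure for `NonTrivial`.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical adaptor: construct(p) with no arguments default-initializes,
// which leaves trivial types such as char uninitialized but still runs the
// default constructor of class types.
template <typename T, typename A = std::allocator<T> >
struct DefaultInitAllocator : public A {
    using A::A; // reuse the base allocator's constructors

    template <typename U>
    struct rebind {
        typedef DefaultInitAllocator<
            U, typename std::allocator_traits<A>::template rebind_alloc<U> > other;
    };

    // No-argument construction: default-init instead of value-init.
    template <typename U>
    void construct(U* p) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(p)) U;
    }

    // Everything else is forwarded to the base allocator unchanged.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        std::allocator_traits<A>::construct(static_cast<A&>(*this), p,
                                            std::forward<Args>(args)...);
    }
};

// Hypothetical element type whose constructor must still run.
struct Tagged {
    Tagged() : x(42) {}
    char x;
};

inline void default_init_demo() {
    std::vector<Tagged, DefaultInitAllocator<Tagged> > v;
    v.resize(8);             // default constructor runs for each element
    assert(v[5].x == 42);

    std::vector<char, DefaultInitAllocator<char> > c;
    c.resize(8);             // chars are left indeterminate; do not read them
    c.push_back('a');        // goes through the forwarding overload
    assert(c.back() == 'a');
}
```

With this adaptor, `resize` on a `char` vector no longer has to write to every element, but reading an element before assigning to it is still not valid, so the `x += v[...]` line in the test bed would need a prior write.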