Memory allocation while insertion into a map

https://stackoverflow.com/questions/2408179

18-09-2019
|

Question

#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <vector>
#include <string>
#include <iostream>
#include <map>
#include <utility>
#include <algorithm>

void * GetMemory(size_t n) {
  void *ptr = malloc(n);
  printf("getMem n %d   ptr 0x%x\n", n, reinterpret_cast<unsigned int> (ptr));
  return ptr;
}

void FreeMemory(void *p) {
  free(p);
}

void* operator new (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void* operator new [] (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void operator delete (void *p) {
  FreeMemory(p);
}

void operator delete [] (void *p) {
  FreeMemory(p);
}

typedef std::vector<int> vec;

int main(int argc, char *argv[]) {
  std::map<int, vec> z;
  vec x;
  z.insert(std::pair<int,vec>(1,x));
}

Compile with g++ -Wall -ansi test.cpp -o test

Run test.

Why are there three calls to GetMemory with n = 0?

Solution

Stick some tracing in FreeMemory and change main to this:

int main(int argc, char *argv[]) {
  printf("map\n");
  std::map<int, vec> z;
  printf("vec\n");
  vec x;
  printf("pair\n");
  std::pair<int,vec> y(1,x);
  printf("insert\n");
  z.insert(y);
  printf("inserted 1\n");
  y.first = 2;
  printf("insert\n");
  z.insert(y);
  printf("inserted 2\n");

}

Output:

$ make mapinsert CXXFLAGS=-O3 -B && ./mapinsert
g++ -O3    mapinsert.cpp   -o mapinsert
map
vec
pair
getMem n 0   ptr 0x6b0258
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b0278
getMem n 0   ptr 0x6b02a0
FreeMemory ptr 0x6b0268
inserted 1
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b02b0
getMem n 0   ptr 0x6b02d8
FreeMemory ptr 0x6b0268
inserted 2
FreeMemory ptr 0x6b0258
FreeMemory ptr 0x6b02d8
FreeMemory ptr 0x6b02b0
FreeMemory ptr 0x6b02a0
FreeMemory ptr 0x6b0278

So of your 3 0-sized allocations:

One is to copy the empty vector into the pair.
One is to store a copy of the empty vector in the map.

These two are clearly necessary. What I'm not sure about is this:

One is to copy the vector somewhere in the call to insert, and this is also freed in the call to insert.

It's as if insert (or something it calls internally) is taking its parameter by value instead of by reference, or insert is explicitly taking a copy into an automatic variable some time before it allocates the new map node. Firing up a debugger is effort for me at the moment, I'll leave it to someone else.

Edit: mystery solved. insert takes a std::pair<const int, vec>, not a std::pair<int, vec>. The extra copy of an empty vector is because the pair you construct has to be converted to a(nother) temporary, then a reference to that temporary is passed to insert. std::pair has a constructor template that lets you get away with almost anything. 20.2.2/4:

template<class U, class V> pair(const pair<U,V> &p);

Effects: initializes members from the corresponding members of the argument, performing implicit conversions as needed.

I also observe that in my implementation, vec x; doesn't call getMem, but vec x(0); does. So actually:

z[1] = vec();

Is less code and denies you the opportunity to make the extra copy (although it calls operator= instead). It does still make 2 0-sized allocations, at least for me.

The C++ standard defines operator[] to return the result of a specified expression involving a call to insert. I'm not certain whether this means the effects of operator[] are "as if" make_pair and insert were called (that is, the standard is as good as specifying what the source must be for operator[]), or just that the value returned is the same value as the specified expression would yield. If the latter then perhaps an implementation could do it with a single 0-sized allocation. But certainly map has no guaranteed way to create an entry without first creating a pair that contains the mapped type, so 2 allocations should be expected. Or more properly, 2 copies of the desired mapped value: the fact that copying a 0-sized vector makes a 0-sized allocation is implementation-dependent.

So, if you had a case where the value was really expensive to copy, but really cheap to default-construct (like a container with lots of elements), then the following might be useful:

std::map<int, vec> z;
vec x(1000);
z[1] = x;
// i.e. (*(z.insert(std::pair<const int, vec>(1,vec())).first)).second = x;

makes 2 allocations of size 4000 and 2 of size 0, whereas:

std::map<int, vec> z;
vec x(1000);
z.insert(std::pair<const int, vec>(2, x));

makes 3 of size 4000 and none of size 0. Eventually the size is big enough that the extra allocation in the first code is cheaper than the extra copying in the second code.

It's possible that move-constructors in C++0x will help with this, I'm not sure.

OTHER TIPS

All 3 cases concerned with initialization of empty vector:

to init root element of tree (internal implementation of std::map) that would contain empty vector.
your own initialization of 'vec x'.
copy constructor for std::pair for element 'second' that invokes copying empty set of variable 'x'

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow