Question

I have two unordered maps (the code runs on Linux):

First unordered map:

It consists of at least 65536 entries. Each entry consists of:

int
unsigned char
unsigned char

Second unordered map:

It consists of fewer than 65536 entries. Each entry consists of:

int
int
int
vector <char>

Now I want to compare the two on the basis of the memory they occupy (in bytes), and then calculate the memory compression achieved. Please guide me: how can I find the memory occupied by the two unordered maps?

More details of the second unordered map:

typedef std::tuple<int, int> key_t;

struct KeyHasher
{
  std::size_t operator()(const key_t& k) const
  {
      using boost::hash_value;
      using boost::hash_combine;

      // Start with a hash value of 0.
      std::size_t seed = 0;

      // Modify 'seed' by XORing and bit-shifting in
      // one member of 'Key' after the other:
      hash_combine(seed,hash_value(std::get<0>(k)));
      hash_combine(seed,hash_value(std::get<1>(k)));

      // Return the result.
      return seed;
  }
};

struct Ndata
{
  int value;
  vector<char> accept;
};

typedef boost::unordered_map<const key_t, Ndata, KeyHasher> SecondMap;

Solution

I don't think it's possible to answer your question precisely without looking at the exact unordered_map implementation your standard library uses.

However, based on the unordered_map interface, you could make decent educated guesses:

An unordered_map needs to store:

  • one bucket container (probably a vector-like structure)

  • bucket_count() buckets (probably singly-linked-list-like structures)

  • one complete entry for each item (not only the value, but also the key, to handle key hash collisions)

After a quick look at the libc++ implementation, you also need space to store:

  • The hashing function object

  • The equality test function object

  • The allocator function object

With this in mind, my guesstimate would be something like:

template <typename K, typename V, typename... Rest>
size_t getMemoryUsage(const std::unordered_map<K, V, Rest...>& map) {
  auto entrySize = sizeof(K) + sizeof(V) + sizeof(void*);
  auto bucketSize = sizeof(void*);
  auto adminSize = 3 * sizeof(void*) + sizeof(size_t);

  auto totalSize = adminSize + map.size() * entrySize
                             + map.bucket_count() * bucketSize;
  return totalSize;
}

This only works for the first of your cases, since in the second case each entry can have a completely different memory usage depending on how big its vector is. For the second case, you would therefore have to add something like this:

template <typename K, typename V, typename... Rest>
size_t getMemoryUsage(const std::unordered_map<K, V, Rest...>& map) {
  auto entrySize = sizeof(K) + sizeof(V) + sizeof(void*);
  auto bucketSize = sizeof(void*);
  auto adminSize = 3 * sizeof(void*) + sizeof(size_t);
  auto totalSize = adminSize + map.size() * entrySize
                             + map.bucket_count() * bucketSize;

  for (const auto& kv : map) {
    // since accept is a vector<char>,
    // it owns capacity() bytes of additional heap memory
    totalSize += kv.second.accept.capacity();
  }

  return totalSize;
}

However, considering real-world allocation logic, the memory your map actually uses may differ a lot from this, due to e.g. external fragmentation. If you want to be 100% sure how much memory an unordered_map uses, you also need to take allocator behavior into account.

Other tips

One possible approach is to find out by experiment using custom allocators.

A straightforward solution could leverage C++11's "scoped allocator model" (see also The Scoped Allocator Model (Rev 2) and Stroustrup: Scoped allocators). A "scoped allocator" is applied to the container and recursively to its elements.

Providing such a custom allocator for the std::unordered_map and its elements would enable you to count precisely the number of allocated blocks, their sizes, and the total amount.

By the way, a scoped allocator would also let you optimize the memory consumption and performance, for example by providing an "arena allocator".

There's just one caveat: although the Scoped Allocator Model is required for C++11 conformance, it's likely not implemented in the standard library provided by your compiler vendor. For example, clang's standard library fully implements scoped allocators for all containers. Regarding GCC, please read here: GCC Status, and the GCC mailing list.

Edit:

A completely different approach is to use a tool that analyzes memory consumption. For example, the Xcode IDE provides an excellent tool, "Instruments", which offers various analysis methods, including one for tracking heap and object allocations. It would take just seconds to compare your two versions. The available tools will differ for your platform, of course ;)

I would suggest creating an array containing your maps. You will be able to create lots of them (e.g. 10 thousand). You can then divide the total memory (as indicated by the OS) by 10 thousand. Of course, you would need to take the overhead (memory consumption before allocating your maps) into account, including the array itself.

Doing lots of sizeof would not help, because each allocation on the heap carries an implementation-defined overhead. For example, in Visual Studio 2013 Release builds, allocations are rounded up to multiples of 8 bytes (or 16 bytes for large arrays).

License: CC-BY-SA with attribution