Question

As far as I know, there are three reasons why a std::bad_alloc can be thrown:

  1. The process requests more memory than what can be served
  2. The address space is too fragmented to serve a request for a large chunk of contiguous memory
  3. The heap management data structure is corrupted

We have code that throws a std::bad_alloc, but none of the above reasons seems to apply. The data structure is a graph stored as a std::list of vertices, where each vertex in turn stores a std::list of the edges it is part of, as well as some amount of contiguous data.
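To make the layout concrete, here is a minimal sketch of such a graph. The type and member names (Vertex, Edge, Graph, add_vertex, add_edge) are assumptions for illustration and not the original code; the contiguous per-vertex data is modelled as a std::vector<char>.

```cpp
#include <cstddef>
#include <list>
#include <vector>

struct Vertex;

// Hypothetical edge type: connects two vertices by pointer.
struct Edge {
    Vertex* from;
    Vertex* to;
};

// Hypothetical vertex type: a list of incident edges plus one contiguous payload.
struct Vertex {
    std::list<Edge*> edges;   // edges this vertex is part of
    std::vector<char> data;   // contiguous per-vertex data section
};

struct Graph {
    std::list<Vertex> vertices;  // std::list keeps element addresses stable
    std::list<Edge> edges;

    Vertex& add_vertex(std::size_t payload_bytes) {
        vertices.emplace_back();
        vertices.back().data.resize(payload_bytes);  // one large allocation
        return vertices.back();
    }

    Edge& add_edge(Vertex& a, Vertex& b) {
        edges.push_back(Edge{&a, &b});
        Edge& e = edges.back();
        a.edges.push_back(&e);  // each push_back is one small node allocation
        b.edges.push_back(&e);
        return e;
    }
};
```

In a layout like this, every vertex and every edge entry costs at least one small heap allocation for its list node, so the number of allocations grows with the number of vertices and edges rather than with the size of the data sections.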

For small graphs (<= 100'000 vertices), the program runs perfectly fine, irrespective of how large the data sections per vertex are (we can without problems allocate up to 40 GB in total). If the number of vertices grows larger, however, we get a std::bad_alloc exception thrown even on instances using "only" 8 GB of memory.

Since there are no problems when allocating more memory in larger blocks, reasons 1 and 2 above should be ruled out. There are sections where we handle pointers in a quite error-prone way, so it is possible that we corrupt the heap data structure. But when run on smaller instances, valgrind's memcheck reports no errors in our code, so that reason seems unlikely as well (on the instances that throw, valgrind itself runs out of memory, so we cannot check that case directly).

Are there any ideas on what else could be the reason for this behaviour, or what tests we might run to further pin down the problem?
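One such test, as a minimal sketch: allocate many small list nodes in isolation, without any of the pointer-heavy graph code, and count how many succeed before std::bad_alloc is thrown. The helper name count_small_allocations is hypothetical. If this alone fails at a similar node count, heap corruption by our own code becomes a less likely explanation.

```cpp
#include <cstddef>
#include <list>
#include <new>

// Hypothetical helper: appends up to `count` small nodes to a std::list and
// returns how many were appended before std::bad_alloc was thrown.
std::size_t count_small_allocations(std::size_t count) {
    std::list<int> nodes;
    std::size_t done = 0;
    try {
        for (; done < count; ++done) {
            nodes.push_back(0);  // one small heap allocation per node
        }
    } catch (const std::bad_alloc&) {
        // Stop at the first failed allocation; `done` holds the number
        // of nodes that were successfully appended.
    }
    return done;
}
```

Calling it with a node count comparable to the failing instances and comparing the return value against the request shows whether the small-allocation pattern by itself is enough to trigger the exception. Note that with memory overcommit enabled, Linux may kill the process instead of letting the allocation fail, so a clean return is not guaranteed.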

OS: Fedora 19
Build system: cmake with gcc 4.8.2


Solution

I cannot comment on your post, so I'll put it in reply.

I came across the same problem while using OpenFST with Kaldi (same system and gcc as yours). I didn't track down the exact origin of this issue, but it seems that kernel 3.12 is the culprit. I booted with one of the backup kernels (from the 3.11 series) and the problem was gone.

You can use:

yum list --showduplicates kernel

to find an available 3.11 kernel.

EDIT:

It seems that this bug is fixed in kernel 3.12.11-201 and in 3.13+.
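If you want the program itself to warn about a suspect kernel, a rough sketch could look like the following. It reads the release string via POSIX uname() and flags 3.12 kernels below 3.12.11. The function names are hypothetical, and treating every 3.12.x with x < 11 as affected is an assumption derived only from the version numbers above; the distribution suffix (such as -201) is ignored.

```cpp
#include <cstdio>
#include <string>
#include <sys/utsname.h>

// Numeric part of a kernel release string such as "3.12.11-201.fc19.x86_64".
struct KernelVersion {
    int version;
    int patchlevel;
    int sublevel;
};

// Parses "X.Y" or "X.Y.Z..." and returns false if fewer than two numbers are found.
bool parse_kernel_version(const std::string& release, KernelVersion& out) {
    out.version = 0;
    out.patchlevel = 0;
    out.sublevel = 0;
    int fields = std::sscanf(release.c_str(), "%d.%d.%d",
                             &out.version, &out.patchlevel, &out.sublevel);
    return fields >= 2;
}

// Assumption: any 3.12.x with x < 11 is treated as possibly affected.
bool in_suspect_range(const KernelVersion& v) {
    return v.version == 3 && v.patchlevel == 12 && v.sublevel < 11;
}

// Checks the kernel the process is currently running on.
bool running_kernel_is_suspect() {
    struct utsname info;
    if (uname(&info) != 0) {
        return false;  // cannot tell, so do not raise a warning
    }
    KernelVersion v;
    return parse_kernel_version(info.release, v) && in_suspect_range(v);
}
```

Calling running_kernel_is_suspect() at startup and printing a warning when it returns true is enough; the same information is available on the command line from uname -r.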

Source: Bugzilla

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow