Question

I implemented a Bloom filter (bit table) using a three-dimensional char array. It works well until it reaches a point where it can no longer allocate memory and throws bad_alloc. It gives me this error on the next expand request after allocating 600MB.

The Bloom filter (the array) is expected to grow as big as 8 to 10GB.

Here is the code I used to allocate (expand) the bit table.

#include <cstdio>

// ROWS and COLUMNS are the fixed inner dimensions, defined elsewhere.
extern const unsigned int ROWS;
extern const unsigned int COLUMNS;
void chunk_delete_bit_table();   // frees the old table, defined elsewhere

unsigned char ***bit_table_ = 0;
unsigned int ROWS_old = 5;
unsigned int EXPND_SIZE = 5;

void expand_bit_table()
{
    // Spill the old contents to a temporary file.
    FILE *temp = fopen("chunk_temp", "w+b");
    for (unsigned int i = 0; i < ROWS_old; ++i)
        for (unsigned int j = 0; j < ROWS; ++j)
            fwrite(bit_table_[i][j], COLUMNS, 1, temp);
    fclose(temp);

    // Delete the old table.
    chunk_delete_bit_table();

    // Create the expanded bit table ==> add EXPND_SIZE more rows.
    bit_table_ = new unsigned char**[ROWS_old + EXPND_SIZE];
    for (unsigned int i = 0; i < ROWS_old + EXPND_SIZE; ++i)
    {
        bit_table_[i] = new unsigned char*[ROWS];
        for (unsigned int k = 0; k < ROWS; ++k)
            bit_table_[i][k] = new unsigned char[COLUMNS];
    }

    // Copy back the old contents one row at a time; reading into
    // bit_table_[i] itself would clobber the row pointers.
    temp = fopen("chunk_temp", "r+b");
    for (unsigned int i = 0; i < ROWS_old; ++i)
        for (unsigned int j = 0; j < ROWS; ++j)
            fread(bit_table_[i][j], COLUMNS, 1, temp);
    fclose(temp);

    // Zero the newly added rows.
    for (unsigned int i = ROWS_old; i < ROWS_old + EXPND_SIZE; ++i)
        for (unsigned int j = 0; j < ROWS; ++j)
            for (unsigned int k = 0; k < COLUMNS; ++k)
                bit_table_[i][j][k] = 0;

    ROWS_old += EXPND_SIZE;
}

What is the maximum allowable size for an array, and if this is not the issue, what can I do about it?

EDIT: It is built for a 32-bit platform.

It is run on a 64-bit platform (a server) with 8GB of RAM.

Solution

A 32-bit program must allocate memory from its virtual memory address space, which also holds chunks of code and data; allocations come from the holes between them. So yes, the maximum you can hope for is around 650 megabytes, the size of the largest available hole, and it goes rapidly down from there. You can solve it by making your data structure smarter, like a tree or a list of chunks instead of one giant array.
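
A minimal sketch of that idea (a hypothetical ChunkedBitTable, assuming C++14 and an arbitrary 1MB chunk size): hold the table as a vector of independently allocated chunks, so no single allocation ever needs a giant contiguous hole, and growing the table just appends zeroed chunks, with no temp-file round trip like the code above.

    #include <cstddef>
    #include <memory>
    #include <vector>

    // Hypothetical chunked bit table: many small allocations instead of
    // one giant contiguous array, so growth never needs a huge hole.
    class ChunkedBitTable {
        static const std::size_t CHUNK_BYTES = 1 << 20;  // 1MB per chunk (assumed)
        std::vector<std::unique_ptr<unsigned char[]>> chunks_;
    public:
        // Grow the table by appending zero-initialized chunks.
        void expand(std::size_t extra_chunks) {
            for (std::size_t c = 0; c < extra_chunks; ++c)
                chunks_.push_back(std::make_unique<unsigned char[]>(CHUNK_BYTES));
        }
        // Address a single bit by its global index.
        void set_bit(std::size_t bit) {
            std::size_t byte = bit / 8;
            chunks_[byte / CHUNK_BYTES][byte % CHUNK_BYTES] |=
                static_cast<unsigned char>(1u << (bit % 8));
        }
        bool test_bit(std::size_t bit) const {
            std::size_t byte = bit / 8;
            return (chunks_[byte / CHUNK_BYTES][byte % CHUNK_BYTES] &
                    (1u << (bit % 8))) != 0;
        }
    };

Each chunk stays small enough to fit into whatever holes the address space still offers, which is exactly what a single 600MB-and-growing array cannot do.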

You can get more insight into the virtual memory map of your process with Sysinternals' VMMap utility. You might be able to change the base address of a DLL so it doesn't sit plumb in the middle of an otherwise empty region of the address space. The odds that you'll get much beyond 650 MB are poor, however.

There's a lot more breathing room on a 64-bit operating system: a 32-bit process gets a 4-gigabyte address space, since the operating system components run in 64-bit mode. You have to use the /LARGEADDRESSAWARE linker option to allow the process to use all of it. Still, that only works on a 64-bit OS; your program is still likely to bomb on a 32-bit OS. When you really need that much VM, the simplest approach is to make a 64-bit OS a prerequisite and build your program targeting x64.
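
For reference, with Microsoft's toolchain the flag can be set from source or on the command line (this is MSVC-specific; check the equivalent switch for your own toolchain):

    // MSVC only: embed the linker flag in a source file
    #pragma comment(linker, "/LARGEADDRESSAWARE")

or, when invoking the compiler directly:

    cl program.cpp /link /LARGEADDRESSAWARE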

OTHER TIPS

A 32-bit machine gives you a 4GB address space.

The OS reserves some of this (half of it by default on Windows, giving you 2GB to yourself; I'm not sure about Linux, but I believe it reserves 1GB).

This means your process has 2-3GB to itself.

Into this space, several things need to fit:

  • your executable (as well as all dynamically linked libraries) is memory-mapped into it
  • each thread needs a stack
  • the heap

and quite a few other nitty-gritty bits.

The point is that it doesn't really matter how much memory you end up actually using. A lot of different pieces have to fit into this memory space, and since they're not packed tightly into one end of it, they fragment the address space.

Imagine, for simplicity, that your executable is mapped into the middle of this memory space. That splits your 3GB into two 1.5GB chunks. Now say you load two dynamic libraries, and they subdivide those two chunks into four 750MB ones. Then you have a couple of threads, each needing further chunks of memory, splitting up the remaining areas further. Of course, in reality each of these won't be placed at the exact center of each contiguous block (that'd be a pretty stupid allocation strategy), but nevertheless, all these chunks of memory subdivide the available address space, cutting it up into many smaller pieces.

You might have 600MB of memory free, but you very likely won't have 600MB of contiguous memory available. So where a single 600MB allocation would almost certainly fail, six 100MB allocations may well succeed.
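
A small illustrative probe of that effect (not the asker's code, and the results depend entirely on how fragmented the process happens to be):

    #include <cstddef>
    #include <cstdio>
    #include <new>

    int main() {
        const std::size_t total = 600u * 1024 * 1024;        // 600MB
        try {
            // One allocation: needs a single contiguous 600MB hole.
            unsigned char *big = new unsigned char[total];
            std::puts("single 600MB allocation succeeded");
            delete[] big;
        } catch (const std::bad_alloc &) {
            std::puts("single 600MB allocation failed");
        }
        unsigned char *pieces[6] = {};
        std::size_t got = 0;
        try {
            // Same total, six pieces: six 100MB holes are enough.
            for (int i = 0; i < 6; ++i) {
                pieces[i] = new unsigned char[total / 6];
                ++got;
            }
            std::puts("six 100MB allocations succeeded");
        } catch (const std::bad_alloc &) {
            std::printf("failed after %zu pieces\n", got);
        }
        for (std::size_t i = 0; i < got; ++i)
            delete[] pieces[i];
        return 0;
    }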

So there's no fixed limit on how big a chunk of memory you can allocate; it depends on the precise layout of your process's memory space. But on a 32-bit machine, you're unlikely to be able to get 500MB or more in a single allocation.

The maximum amount of in-memory data a 32-bit process can address is 4GB in theory (in practice it will be somewhat smaller). So you cannot have 10GB of data in memory at once, even if the OS supports more. Also, even though you are allocating the memory dynamically, the free store available is further reduced by whatever address space is reserved for the stacks.

The actual memory available to the process depends on the compiler and linker settings used to generate the executable.

If you really do need that much, consider persisting (parts of) the data in the file system.
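
A minimal sketch of that, with a hypothetical set_bit_on_disk helper and no caching or error handling: the whole table lives in a pre-sized file opened with fopen(path, "r+b"), and each update touches a single byte.

    #include <cstdio>

    // Note: plain fseek takes a long, which caps out at 2GB on many
    // platforms; for an 8-10GB table you would swap in _fseeki64 on
    // Windows or fseeko (with _FILE_OFFSET_BITS=64) on POSIX.
    void set_bit_on_disk(std::FILE *f, unsigned long long bit) {
        long byte_pos = (long)(bit / 8);   // replace with a 64-bit offset
        unsigned char b = 0;
        std::fseek(f, byte_pos, SEEK_SET);
        std::fread(&b, 1, 1, f);
        b |= (unsigned char)(1u << (bit % 8));
        std::fseek(f, byte_pos, SEEK_SET); // reposition between read and write
        std::fwrite(&b, 1, 1, f);
    }

In practice you would cache recently used pages in memory (or memory-map a window of the file) so most bit operations never hit the disk.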

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow