Question

Our application continuously allocates arrays for large quantities of data (say tens to hundreds of megabytes) which live for a shortish amount of time before being discarded.

Done naively, this can cause large object heap fragmentation, eventually causing the application to crash with an OutOfMemoryException even though the total size of the currently live objects is not excessive.

One way we have successfully managed this in the past is to chunk up the arrays to ensure they don't end up on the LOH, the idea being to avoid fragmentation by allowing memory to be compacted by the garbage collector.
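As a rough illustration of the chunking arithmetic (not our actual implementation), the sketch below picks a chunk length that keeps each array under the commonly documented 85,000-byte large object threshold. The names `max_chunk_length` and `chunk_lengths` and the 16-byte header allowance are assumptions made for the example. The threshold is a parameter because some runtimes apply stricter rules to particular element types (the 32-bit CLR, for instance, is reported to treat `double[]` of 1,000 or more elements as large).

```python
LOH_THRESHOLD_BYTES = 85_000  # objects at or above this size are placed on the LOH
ARRAY_OVERHEAD_BYTES = 16     # assumed allowance for the array header, not measured


def max_chunk_length(element_size, threshold=LOH_THRESHOLD_BYTES):
    """Largest element count whose array (payload + header) stays below threshold."""
    return (threshold - ARRAY_OVERHEAD_BYTES - 1) // element_size


def chunk_lengths(total_elements, element_size, threshold=LOH_THRESHOLD_BYTES):
    """Split total_elements into chunk lengths that each stay off the LOH."""
    limit = max_chunk_length(element_size, threshold)
    full, rest = divmod(total_elements, limit)
    return [limit] * full + ([rest] if rest else [])


if __name__ == "__main__":
    # 100 MB of bytes, split into small-object-heap sized chunks.
    lengths = chunk_lengths(100 * 1024 * 1024, element_size=1)
    print(len(lengths), "chunks, largest has", max(lengths), "elements")
```

The cost of this approach is that every access has to map a logical index to a (chunk, offset) pair, and the number of live objects the GC has to track goes up sharply.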

Our latest application handles more data than before, and passes this serialized data very frequently between add-ins hosted in either separate AppDomains or separate processes. We adopted the same approach as before, ensuring our memory was always chunked and being very careful to avoid large object heap allocations.

However, we have one add-in that must be hosted in an external 32-bit process (because our main application is 64-bit and the add-in must use a 32-bit library). Under particularly heavy load, when a lot of SOH memory chunks are allocated and then discarded shortly afterwards, even our chunking approach hasn't been enough to save our 32-bit add-in, and it crashes with an OutOfMemoryException.

Using WinDbg at the moment when an OutOfMemoryException occurs, !heapstat -inclUnrooted shows this:

Heap             Gen0         Gen1         Gen2          LOH
Heap0           24612      4166452    228499692      9757136

Free space:                                                 Percentage
Heap0              12           12      4636044        12848  SOH:  1% LOH:  0%

Unrooted objects:                                           Percentage
Heap0              72            0         5488            0  SOH:  0% LOH:  0%

!dumpheap -stat shows this:

-- SNIP --

79b56c28     3085       435356 System.Object[]
79b8ebd4        1      1048592 System.UInt16[]
79b9f9ac    26880      1301812 System.String
002f7a60       34      4648916      Free
79ba4944     6128     87366192 System.Byte[]
79b8ef28    17195    145981324 System.Double[]
Total 97166 objects
Fragmented blocks larger than 0.5 MB:
    Addr     Size      Followed by
18c91000    3.7MB         19042c7c System.Threading.OverlappedData

These tell me that our memory usage isn't excessive, and our large object heap is very small as expected (so we're definitely not dealing with large object heap fragmentation here).

However, !eeheap -gc shows this:

Number of GC Heaps: 1
generation 0 starts at 0x7452b504
generation 1 starts at 0x741321d0
generation 2 starts at 0x01f91000
ephemeral segment allocation context: none
 segment     begin allocated  size
01f90000  01f91000  02c578d0  0xcc68d0(13396176)
3cb10000  3cb11000  3d5228b0  0xa118b0(10557616)
3ece0000  3ece1000  3fc2ef48  0xf4df48(16047944)
3db10000  3db11000  3e8fc8f8  0xdeb8f8(14596344)
42e20000  42e21000  4393e1f8  0xb1d1f8(11653624)
18c90000  18c91000  19c53210  0xfc2210(16523792)
14c90000  14c91000  15c85c78  0xff4c78(16731256)
15c90000  15c91000  168b2870  0xc21870(12720240)
16c90000  16c91000  17690744  0x9ff744(10483524)
5c0c0000  5c0c1000  5d05381c  0xf9281c(16328732)
69c80000  69c81000  6a88bc88  0xc0ac88(12627080)
6b2d0000  6b2d1000  6b83e8a0  0x56d8a0(5691552)
6c2d0000  6c2d1000  6d0f2608  0xe21608(14816776)
6d2d0000  6d2d1000  6defc67c  0xc2b67c(12760700)
6e2d0000  6e2d1000  6ee7f304  0xbae304(12247812)
70000000  70001000  70bfb41c  0xbfa41c(12559388)
71ca0000  71ca1000  72893440  0xbf2440(12526656)
73b40000  73b41000  74531528  0x9f0528(10421544)
Large object heap starts at 0x02f91000
 segment     begin allocated  size
02f90000  02f91000  038df1d0  0x94e1d0(9757136)
Total Size:              Size: 0xe737614 (242447892) bytes.
------------------------------
GC Heap Size:            Size: 0xe737614 (242447892) bytes.

The thing that strikes me here is that our final SOH heap segment starts at 0x73b41000 which is right at the limit of our available memory in our 32bit add-in.

So if I'm reading that correctly, our problem seems to be that our virtual address space has become fragmented with managed heap segments.
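One way to sanity-check that reading is to compare how many bytes the segments hold with how much address space they are spread across. The sketch below does this for the rows copied from the !eeheap -gc output above. The helper name `parse_segments` is made up for the example, and the script knows nothing about what sits between the segments (native heaps, DLLs, other reservations), so it can only show the spread, not the cause.

```python
import re

# Segment rows copied from the !eeheap -gc output (SOH segments, then the LOH segment).
EEHEAP_ROWS = """
01f90000  01f91000  02c578d0  0xcc68d0(13396176)
3cb10000  3cb11000  3d5228b0  0xa118b0(10557616)
3ece0000  3ece1000  3fc2ef48  0xf4df48(16047944)
3db10000  3db11000  3e8fc8f8  0xdeb8f8(14596344)
42e20000  42e21000  4393e1f8  0xb1d1f8(11653624)
18c90000  18c91000  19c53210  0xfc2210(16523792)
14c90000  14c91000  15c85c78  0xff4c78(16731256)
15c90000  15c91000  168b2870  0xc21870(12720240)
16c90000  16c91000  17690744  0x9ff744(10483524)
5c0c0000  5c0c1000  5d05381c  0xf9281c(16328732)
69c80000  69c81000  6a88bc88  0xc0ac88(12627080)
6b2d0000  6b2d1000  6b83e8a0  0x56d8a0(5691552)
6c2d0000  6c2d1000  6d0f2608  0xe21608(14816776)
6d2d0000  6d2d1000  6defc67c  0xc2b67c(12760700)
6e2d0000  6e2d1000  6ee7f304  0xbae304(12247812)
70000000  70001000  70bfb41c  0xbfa41c(12559388)
71ca0000  71ca1000  72893440  0xbf2440(12526656)
73b40000  73b41000  74531528  0x9f0528(10421544)
02f90000  02f91000  038df1d0  0x94e1d0(9757136)
"""

ROW = re.compile(
    r"^([0-9a-f]{8})\s+([0-9a-f]{8})\s+([0-9a-f]{8})\s+0x[0-9a-f]+\((\d+)\)$"
)


def parse_segments(text):
    """Return (segment, begin, allocated, size_in_bytes) for each matching row."""
    segments = []
    for line in text.strip().splitlines():
        match = ROW.match(line.strip())
        if match:
            seg, begin, allocated = (int(match.group(i), 16) for i in (1, 2, 3))
            segments.append((seg, begin, allocated, int(match.group(4))))
    return segments


segments = parse_segments(EEHEAP_ROWS)
total_bytes = sum(size for _, _, _, size in segments)
lowest = min(seg for seg, _, _, _ in segments)
highest = max(allocated for _, _, allocated, _ in segments)
span_bytes = highest - lowest

print(f"segments:               {len(segments)}")
print(f"bytes in segments:      {total_bytes:,}")
print(f"address span covered:   {span_bytes:,}")
print(f"share of span occupied: {total_bytes / span_bytes:.1%}")
```

The segment sizes add up to the 242,447,892 bytes that !eeheap reports, yet the segments are scattered over a span of about 1.8 GiB, which is most of what a 32-bit process gets if it is limited to the default 2 GB of user address space.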

I guess my questions here would be:

  • Is my analysis correct?
  • Is our approach to avoiding LOH fragmentation using chunking reasonable?
  • Is there a good strategy to avoid the memory fragmentation we now appear to be seeing?

The most obvious answer I can think of is to pool and re-use our memory chunks. This is potentially doable, but it is something I would rather avoid, as it means we would effectively be managing that part of our memory ourselves.
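To make the pooling idea concrete, here is a minimal, language-neutral sketch written in Python. It is an illustration under stated assumptions rather than the code we would ship: the class name `ChunkPool`, its `rent`/`give_back` methods and the `max_idle` cap are all invented for the example, and it is not thread-safe.

```python
class ChunkPool:
    """Hands out fixed-size buffers and keeps returned ones for re-use."""

    def __init__(self, chunk_size, max_idle):
        self._chunk_size = chunk_size
        self._max_idle = max_idle  # cap on idle buffers so the pool cannot grow forever
        self._idle = []

    def rent(self):
        """Return a pooled buffer if one is idle, otherwise allocate a new one."""
        if self._idle:
            return self._idle.pop()
        return bytearray(self._chunk_size)

    def give_back(self, chunk):
        """Return a buffer to the pool; surplus buffers are simply dropped."""
        if len(chunk) != self._chunk_size:
            raise ValueError("chunk does not belong to this pool")
        if len(self._idle) < self._max_idle:
            chunk[:] = bytes(self._chunk_size)  # zero the contents before re-use
            self._idle.append(chunk)

    @property
    def idle_count(self):
        return len(self._idle)


if __name__ == "__main__":
    pool = ChunkPool(chunk_size=64 * 1024, max_idle=4)
    first = pool.rent()
    first[0] = 0xFF
    pool.give_back(first)
    second = pool.rent()
    print(second is first, second[0])
```

The trade-off is the one described above: callers must remember to return every chunk, and a chunk that is used after being returned becomes a bug the garbage collector can no longer protect against.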


Solution

For those interested, here is an update on what I found out about this problem:

It appeared that the best solution was to implement pooling of our chunks to relieve pressure on the garbage collector, so I did this.

The result was that the add-in got slightly further in its task, but unfortunately it still ran out of memory fairly quickly.

Looking in WinDbg again, the only real difference I could see was that our combined managed heap size was consistently smaller, at around 200MB compared to around 250MB before pooling.

It was almost as if the amount of memory available to .NET was decreasing over time, and so implementing the pooling had simply delayed running out of memory.

If this were true, the obvious culprit was a COM component which we use to load the data into memory. We do some caching of COM objects to improve repeated access time to the data. I removed all the caching and ensured everything was released after every query of the data.

Now everything looks fine with regard to memory; it is just much slower (which I will have to solve next).

I guess in hindsight the COM component should have been the first suspect for the memory issues, but hey I learned something :) And on the plus side, the pooling will still be useful to decrease GC overhead, so that was worth doing as well.

Thanks for your comments everyone.

Licensed under: CC-BY-SA with attribution