Question

I have another active question HERE regarding some hopeless memory issues that possibly involve LOH fragmentation, among other unknowns.

My question now is: what is the accepted way of doing things? If my app needs to be written in Visual C#, and needs to deal with large arrays to the tune of int[4000000], how can I avoid being doomed by the garbage collector's refusal to deal with the LOH?

It would seem that I am forced to make all of my large arrays global, and never use the word "new" around any of them. So I'm left with ungraceful global arrays with "maxindex" variables instead of neatly sized arrays that get passed around by functions.

I've always been told that this was bad practice. What alternative is there?

Is there some kind of function to the tune of System.GC.CollectLOH("Seriously")? Is there perhaps some way to outsource garbage collection to something other than System.GC?

Anyway, what are the generally accepted rules for dealing with large (>85 KB) variables?


Solution

Firstly, the garbage collector does collect the LOH, so do not be immediately scared by its presence. The LOH gets collected when generation 2 gets collected.

The difference is that the LOH does not get compacted, which means that if you have an object in there with a long lifetime, you will effectively split the LOH into two sections: the area before and the area after this object. If this behaviour continues to happen, you can end up in a situation where the space between long-lived objects is not sufficiently large for subsequent allocations, and .NET has to allocate more and more memory in order to place your large objects, i.e. the LOH becomes fragmented.
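
For illustration, here is a minimal sketch of how that splitting arises (it assumes the usual 85,000-byte LOH threshold; exact behaviour varies by runtime). It allocates a row of large arrays, frees every other one, and then makes a request that none of the reclaimed gaps can satisfy:

using System;
using System.Collections.Generic;

class LohFragmentationSketch
{
    static void Main()
    {
        var blocks = new List<byte[]>();
        for (int i = 0; i < 1000; i++)
            blocks.Add(new byte[100000]);   // each > 85,000 bytes, so all land on the LOH

        for (int i = 0; i < blocks.Count; i += 2)
            blocks[i] = null;               // drop every other block

        GC.Collect();                       // the 100 KB gaps are reclaimed but never compacted

        var big = new byte[200000];         // too big for any single gap, so the LOH must grow

        GC.KeepAlive(blocks);
        GC.KeepAlive(big);
    }
}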

Now, having said that, the LOH can shrink in size if the area at its end is completely free of live objects, so the only problem is if you leave objects in there for a long time (e.g. the duration of the application).

Starting with .NET 4.5.1, the LOH can be compacted; see the GCSettings.LargeObjectHeapCompactionMode property.

Strategies to avoid LOH fragmentation are:

  • Avoid creating large objects that hang around. Basically this just means large arrays, or objects which wrap large arrays (such as MemoryStream, which wraps a byte array), as nothing else is that big (components of complex objects are stored separately on the heap, so they are rarely very big). Also watch out for large dictionaries and lists, as these use an array internally.
  • Watch out for arrays of double: the threshold for these going onto the LOH is much, much smaller. I can't remember the exact figure, but it's only a few thousand.
  • If you need a MemoryStream, consider making a chunked version that backs onto a number of smaller arrays rather than one huge array. You could also make custom versions of IList and IDictionary that use chunking to avoid things ending up in the LOH in the first place.
  • Avoid very long Remoting calls, as Remoting makes heavy use of MemoryStreams, which can fragment the LOH for the duration of the call.
  • Watch out for string interning: for some reason interned strings are stored as pages on the LOH and can cause serious fragmentation if your application keeps encountering new strings to intern, i.e. avoid using string.Intern unless the set of strings is known to be finite and the full set is encountered early in the application's life. (See my earlier question.)
  • Use Son of Strike (SOS) to see what exactly is using the LOH memory. Again, see this question for details on how to do this.
  • Consider pooling large arrays; a sketch using ArrayPool<T> follows below.

Edit: the LOH threshold for double arrays appears to be 8 KB, i.e. only about 1,000 elements.
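
On the pooling point: if you are on a runtime where System.Buffers is available (built into .NET Core and later, and available as the System.Buffers NuGet package for the .NET Framework), ArrayPool<T> gives you array pooling without writing your own pool. A minimal sketch, assuming the buffer is used and returned within a single method:

using System.Buffers;

class PooledWork
{
    static void Process()
    {
        // Rent may return an array longer than requested, so track the
        // logical length separately instead of relying on buffer.Length.
        int[] buffer = ArrayPool<int>.Shared.Rent(4000000);
        try
        {
            // ... fill and use buffer[0 .. 4000000) here ...
        }
        finally
        {
            ArrayPool<int>.Shared.Return(buffer); // hand the array back for reuse
        }
    }
}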

OTHER TIPS

It's an old question, but I figure it doesn't hurt to update answers with changes introduced in .NET. It is now possible to compact the Large Object Heap. Clearly the first choice should still be to make sure the best design choices were made, but it is nice to have this option.

https://msdn.microsoft.com/en-us/library/xe0c2357(v=vs.110).aspx

"Starting with the .NET Framework 4.5.1, you can compact the large object heap (LOH) by setting the GCSettings.LargeObjectHeapCompactionMode property to GCLargeObjectHeapCompactionMode.CompactOnce before calling the Collect method, as the following example illustrates."

GCSettings can be found in the System.Runtime namespace.

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect(); 

The first thing that comes to mind is to split the array up into smaller ones, so that none of them reaches the size that makes the GC put it on the LOH. You could split the array into smaller ones of, say, 10,000 elements, and build an object which would know which array to look in based on the index you pass; a sketch of this follows.
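
A minimal sketch of that idea (the class name and chunk size are illustrative, not from the original post):

class ChunkedIntArray
{
    private const int ChunkSize = 10000;   // 40 KB per chunk, comfortably under 85 KB
    private readonly int[][] chunks;

    public int Length { get; private set; }

    public ChunkedIntArray(int length)
    {
        Length = length;
        chunks = new int[(length + ChunkSize - 1) / ChunkSize][];
        for (int i = 0; i < chunks.Length; i++)
            chunks[i] = new int[ChunkSize];
    }

    public int this[int index]
    {
        get { return chunks[index / ChunkSize][index % ChunkSize]; }
        set { chunks[index / ChunkSize][index % ChunkSize] = value; }
    }
}

For int[4000000] the outer chunks array holds only 400 references (a few kilobytes), so neither it nor the 40 KB chunks ever reach the LOH.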

Now I haven't seen the code, but I would also question why you need an array that large. I would potentially look at refactoring the code so all of that information doesn't need to be stored in memory at once.

You've got it wrong. You do NOT need an array of size 4000000, and you definitely do not need to call the garbage collector.

  • Write your own IList implementation. Like "PagedList"
  • Store items in arrays of 65536 elements.
  • Create an array of arrays to hold the pages.

This allows you to access basically all your elements with ONE redirection only. And, as the individual arrays are smaller, fragmentation is not an issue...

...if it is... then REUSE pages. Don't throw them away on dispose; put them on a static "PageList" and pull them from there first (a sketch of this follows below). All of this can be done transparently within your class.

The really good thing is that this list is pretty dynamic in its memory usage. You may want to resize the holder array (the redirector). Even if you don't, it is only about 512 KB of data per page.

Second-level arrays hold 65536 elements each: that is 8 bytes per element for class references (512 KB per page, 256 KB on 32-bit), or 64 KB per page for each byte of struct size.

Technically:

Turn int[] into int[][]

Decide whether 32 or 64 bit suits you better ;) Both have advantages and disadvantages.

Dealing with ONE large array like that is unwieldy in any language. If you have to, then... basically... allocate it at program start and never recreate it. That is the only solution.
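
Here is a minimal sketch of the page-recycling idea above (the pool type and member names are illustrative). Note that a 65536-element int page is 256 KB and therefore itself lives on the LOH; the point is that uniformly sized, endlessly reused pages cannot fragment it:

using System;
using System.Collections.Concurrent;

static class PagePool
{
    public const int PageSize = 65536;
    private static readonly ConcurrentBag<int[]> Pages = new ConcurrentBag<int[]>();

    public static int[] Rent()
    {
        int[] page;
        return Pages.TryTake(out page) ? page : new int[PageSize];
    }

    public static void Return(int[] page)
    {
        Array.Clear(page, 0, page.Length);  // scrub stale data before reuse
        Pages.Add(page);
    }
}

A PagedList would call PagePool.Rent() whenever it needs a new page, and PagePool.Return() instead of dropping pages, so pages never become garbage at all.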

Am adding an elaboration to the answer above, in terms of how the issue can arise. Fragmentation of the LOH is not only dependent on the objects being long-lived. If you have multiple threads and each of them is creating big lists that go onto the LOH, you can end up in a situation where the first thread needs to grow its list, but the next contiguous bit of memory is already taken up by a list from the second thread; the runtime then allocates new memory for the first thread's list, leaving behind a rather big hole. This is what is happening on one project I've inherited: even though the LOH is only approximately 4.5 MB, the runtime has a total of 117 MB of free memory, but the largest free memory segment is 28 MB.

Another way this can happen without multiple threads is if you have more than one list being appended to in some kind of loop: as each list expands beyond the memory initially allocated to it, the lists leapfrog each other as they grow beyond their allocated spaces. A small illustration follows.
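
A minimal sketch of that leapfrog effect (sizes are chosen so the backing arrays are LOH-sized from the start; the details are illustrative):

using System.Collections.Generic;

class LeapfrogSketch
{
    static void Main()
    {
        var first = new List<int>(30000);    // 120 KB backing array: on the LOH immediately
        var second = new List<int>(30000);

        for (int i = 0; i < 1000000; i++)
        {
            first.Add(i);    // when a list outgrows its array, a doubled replacement is
            second.Add(i);   // allocated beyond the other list's array, stranding the old one
        }
    }
}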

A useful link is: https://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/

If you're still looking for a solution to this, one option may be to use some kind of pooled objects and request them from the pool when doing the work. If you're dealing with large arrays, another option is to develop a custom collection, e.g. a collection of collections, so that you don't have just one huge list but break it up into smaller lists, each of which avoids the LOH.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow