Question

Continuing the discussion from Understanding VS2010 C# parallel profiling results but more to the point:

I have many threads that work in parallel (using Parallel.For/Each), which use many memory allocations for small classes.

This creates contention on the global memory allocator thread.

Is there a way to instruct .NET to preallocate a memory pool for each thread and do all allocations from this pool?

Currently my solution is my own implementation of memory pools (globally allocated arrays of objects of type T, which are recycled among the threads). This helps a lot but is not efficient because:

  1. I can't instruct .NET to allocate from a specific memory slice.
  2. I still need to call new many times to allocate the memory for the pools.

Thanks,
Haggai

Solution

I searched for two days trying to find an answer to the same issue you had. The answer is that you need to set the garbage collection mode to Server mode. By default, the garbage collection mode is set to Workstation mode. Setting garbage collection to Server mode causes the managed heap to be split into separately managed sections, one per CPU. To do this, you need to add a config setting to your app.config file.

<configuration>
   <runtime>
      <gcServer enabled="true"/>
   </runtime>
</configuration>

The speed difference on my 12-core Opteron 6172 was dramatic!
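
To confirm that the setting actually took effect, the process can report its own GC mode at startup. The following is only a minimal sketch and not part of the original answer: the class name GcModeCheck and the method Describe are hypothetical, while GCSettings.IsServerGC and GCSettings.LatencyMode are existing members of System.Runtime.GCSettings.

```csharp
using System;
using System.Runtime;

// Hypothetical helper: reports which GC configuration the current process is running with.
static class GcModeCheck
{
    public static string Describe()
    {
        // IsServerGC is true only when the runtime really started in server mode,
        // so it reflects the effective setting rather than what the config file asked for.
        return string.Format(
            "Server GC: {0}, latency mode: {1}, logical processors: {2}",
            GCSettings.IsServerGC,
            GCSettings.LatencyMode,
            Environment.ProcessorCount);
    }
}
```

Calling Console.WriteLine(GcModeCheck.Describe()) at the start of Main should print "Server GC: True" once the config file is picked up. On a single-processor machine the runtime uses workstation GC regardless of the setting.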

OTHER TIPS

The garbage collector does not allocate memory.

It sounds more like you're allocating lots of small temporary objects and a few long-lived objects, and the garbage collector is spending a lot of time garbage-collecting the temporary objects so your app doesn't have to request more memory from the OS. From .NET Framework 4 Advanced Development - Garbage Collection:

"As long as address space is available in the managed heap, the runtime continues to allocate space for new objects. However, memory is not infinite. Eventually the garbage collector must perform a collection in order to free some memory."

The solution: Don't allocate lots of small temporary objects. The page on Garbage Collection and Performance might also be helpful.
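
One way to follow this advice, where the small classes are simple data carriers, is to turn them into value types so that temporaries never reach the managed heap. This is a sketch under that assumption only; Point2 and AllocationFreeSketch are hypothetical names, not taken from the question.

```csharp
using System;

// Hypothetical small data carrier. As a struct, a local instance is stored inline
// (on the stack or in registers), so creating one does not allocate on the GC heap.
struct Point2
{
    public double X;
    public double Y;

    public Point2(double x, double y)
    {
        X = x;
        Y = y;
    }
}

static class AllocationFreeSketch
{
    // Sums the lengths of the vectors (i, i) for i in [0, count).
    public static double SumOfLengths(int count)
    {
        double sum = 0.0;
        for (int i = 0; i < count; i++)
        {
            // 'new' on a struct only initializes the local; no heap allocation happens here.
            Point2 p = new Point2(i, i);
            sum += Math.Sqrt(p.X * p.X + p.Y * p.Y);
        }
        return sum;
    }
}
```

This only helps while the struct is not boxed (for example, by being stored in an object variable or a non-generic collection), and it suits small types best, because structs are copied on assignment.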

You could pre-allocate a bunch of objects, and keep them in groups intended for separate threads. However, it's likely that you won't get any better performance from this.

The garbage collector is specially designed to handle small short-lived objects efficiently. If you keep the objects in a pool, they are long-lived and will survive a garbage collection, which in turn means that they will be copied to the second generation heap. This copying will be more expensive than just allocating new objects.
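
For anyone who wants to measure this trade-off, the per-thread grouping described above might be sketched with the thread-local overload of Parallel.For, which gives each worker its own private state without any locking. This is an illustration under assumptions, not the asker's implementation: Node, WorkerState and PerThreadPoolSketch are hypothetical names, and the loop body is a stand-in for real work.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's "small classes".
sealed class Node
{
    public int Value;
}

// Private state of one worker: its own pool plus a partial result.
sealed class WorkerState
{
    public readonly Stack<Node> Pool = new Stack<Node>();
    public long Sum;
}

static class PerThreadPoolSketch
{
    // Sums 2*i over [0, count), reusing pooled Node instances instead of allocating per iteration.
    public static long Run(int count)
    {
        long total = 0;

        Parallel.For(0, count,
            // localInit: runs once per worker task and creates that worker's pool.
            () => new WorkerState(),
            // body: only touches the worker's own state, so no synchronization is needed.
            (i, loopState, state) =>
            {
                Node node = state.Pool.Count > 0 ? state.Pool.Pop() : new Node();
                node.Value = i * 2;
                state.Sum += node.Value;
                state.Pool.Push(node); // hand the object back for the next iteration
                return state;
            },
            // localFinally: runs once per worker task and merges its partial result.
            state => Interlocked.Add(ref total, state.Sum));

        return total;
    }
}
```

For example, PerThreadPoolSketch.Run(1000) returns 999000. Whether this beats plain allocation depends on the workload, for the reasons given above, so it is worth profiling both variants.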

Licensed under: CC-BY-SA with attribution