Question

Background

I have implemented the producer-consumer pattern, supporting multiple producers and multiple consumer threads.

The consumers wait on a pulse to compete for a job:

private void DistributedConsume()
{

    while (_continueConsuming)
    {
        try
        {
            Monitor.Wait(_workerLockObject);
            IModelJob job = _jobProvider.GetNextJob();

            if (job != null)
            {
                //we found and dequeued job for processing
                if (job.JobType != JobType.TerminateWorker)
                {
                    job.PrioritizationStatus = PrioritizationStatus.Dequeued;

                    try
                    {
                        job.Execute();
                    }
                    catch (ThreadAbortException)
                    {
                        LoggerFacade.Log(LogCategory.Debug, "WorkerController.DistributedConsume(): Thread {0} has been aborted", Thread.CurrentThread.Name);
                        throw;
                    }
                    catch (Exception ex)
                    {
                        LoggerFacade.Log(LogCategory.Debug, "WorkerController.DistributedConsume(): Unhandled exception on thread {0}", Thread.CurrentThread.Name);
                    }
                }
                else
                {
                    _continueConsuming = false;
                }
            }
        }
        catch (ThreadAbortException)
        {
            LoggerFacade.Log(LogCategory.Debug, "WorkerController.DistributedConsume(): Thread {0} has been aborted", Thread.CurrentThread.Name);
            throw;
        }
        catch (Exception ex)
        {
            ExceptionHandler.ProcessException(ex, "Unexpected Exception occurred attempting to fetch the next job");
        }
    }

    LoggerFacade.Log(LogCategory.OperationalInfo, "WorkerController.DistributedConsume(): Thread {0} is exiting", Thread.CurrentThread.Name);

}

The executing job can consume quite a large amount of memory, but as you can see the job is declared inside the loop body, so I expect it to go out of scope on each iteration, at which point it and all the data it references should qualify for garbage collection.

The Problem

A job executes, consumes large amounts of memory, and completes, and the consumer thread loops back to the Monitor.Wait and waits on a pulse. By this stage the job is out of scope and no longer referenced. The problem is that if no new job is queued and the thread is not pulsed, the memory usage stays very high, no matter how long I wait. Using WinDbg I can also see that the job is still on the heap, along with all the descendant objects it references.

This might make you think I have a memory leak, but WinDbg also verifies that the job object is not rooted. But as soon as we submit another job to the system which this same consumer thread picks up, the memory is released and objects are cleaned up. But if other consumers pick up this new job it appears the memory is not released.

I can't make sense of this. None of my paranoid theories of what might be happening match up with my understanding of how the GC works.

Theory 1. The job has made it to gen 1 or 2, and so isn't going to be collected unless the system is starved of memory. But then it doesn't make sense that it would collect it as soon as the thread is woken and begins executing another job as memory is far from being low.

Theory 2. I don't have a second theory worth sharing.

When the system is busy with many automated jobs, a consumer won't sit waiting for long, so the problem isn't visible. But on quiet days, when users manually submit only a few large jobs, questions are being raised about why the service is sitting idle with such high memory usage. Because of this it is not a critical issue, but it is perplexing.

Any pointers?

Thanks


Solution

The critical thing to remember when working with the GC, and when you're monitoring the memory usage of any managed application, is this:

Your objects' memory is not garbage-collected simply because they are out of scope.

Put another way: if you have no more references to an object, that does not mean that the object has been collected. It means that it is eligible for collection and will be collected the next time the GC runs. You can't guarantee when that will be, though.
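The difference between "eligible" and "collected" can be observed with a WeakReference. This is only a minimal sketch and is not taken from the question's code: the names GcEligibilityDemo and AllocateAndDrop are hypothetical, and the 50 MB size is arbitrary.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GcEligibilityDemo
{
    // Allocates a large array and returns only a weak reference to it.
    // NoInlining keeps the strong reference confined to this method's frame,
    // so nothing roots the array once the method has returned.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateAndDrop()
    {
        var payload = new byte[50 * 1024 * 1024];
        return new WeakReference(payload);
    }

    public static void Main()
    {
        WeakReference weak = AllocateAndDrop();

        // The array is unreferenced, but no collection has been requested yet.
        Console.WriteLine("Alive before collection: " + weak.IsAlive);

        // Force a full collection (for demonstration only).
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.WriteLine("Alive after collection: " + weak.IsAlive);
    }
}
```

In a typical run the array is still reported as alive before the forced collection and as gone after it, although the exact behaviour can vary with the runtime version and build configuration.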

In a .NET application, GC will run automatically as part of a memory allocation process: that is, if you try to create a new object and there is no room in Gen0 for that object, a GC will run in order to free up the required space. If you aren't allocating, no collection happens. (There are some rare exceptions; for example: minimising a WinForms app would GC and release unreferenced memory, though I haven't tested that behaviour since probably .NET 2.)

In your case (and assuming that your code is correctly-written and doesn't hold any references to relevant objects) this is therefore the expected behaviour: your code is waiting on the Pulse and will not make any allocations, therefore the GC is under no pressure and is unlikely to run a collection. When you start your next job you start allocating memory, and so a GC follows soon after and collects what is no longer referenced from the previous job.
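One way to see this for yourself is to count collections while a thread is idle and again while it allocates. The sketch below is an illustration under assumptions, not a measurement of your service: IdleGcDemo, Gen0CollectionsDuring and AllocateGarbage are hypothetical names, and Thread.Sleep stands in for a consumer blocked in Monitor.Wait.

```csharp
using System;
using System.Threading;

public static class IdleGcDemo
{
    // Static sink so each allocation escapes and cannot be optimized away.
    private static byte[] _sink = new byte[0];

    // Returns how many gen 0 collections happened while the action ran.
    public static int Gen0CollectionsDuring(Action action)
    {
        int before = GC.CollectionCount(0);
        action();
        return GC.CollectionCount(0) - before;
    }

    // Allocates 'count' short-lived 1 KB arrays, creating allocation pressure.
    public static void AllocateGarbage(int count)
    {
        for (int i = 0; i < count; i++)
        {
            _sink = new byte[1024];
        }
    }

    public static void Main()
    {
        // Idle: the thread sleeps and allocates nothing.
        int idle = Gen0CollectionsDuring(() => Thread.Sleep(2000));

        // Busy: roughly 1 GB of small allocations in total.
        int busy = Gen0CollectionsDuring(() => AllocateGarbage(1000000));

        Console.WriteLine("Gen 0 collections while idle: " + idle);
        Console.WriteLine("Gen 0 collections while allocating: " + busy);
    }
}
```

The idle count is normally zero or close to it, while the allocating phase triggers many collections. Other threads in the process can allocate too, so the idle figure is not guaranteed to be exactly zero.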

So I would say "Theory 2" is: the GC is doing exactly what it is expected to, and there's nothing to worry about.

OTHER TIPS

Are you using Task Manager to measure memory usage?
If so, test with a real memory profiler such as ANTS.
Task Manager tends to overstate memory usage.

Job out of scope?
This sure looks like a reference to me:

IModelJob job = _jobProvider.GetNextJob();

These two statements are false:
"inner scope, so I expect the job to go out of scope on each loop"
"By this stage the job is out of scope and no longer referenced."
In the current format, job only falls out of scope when you exit (not loop) the while (_continueConsuming).
The prior IModelJob job would only fall out of scope when you hit that line the next time,
which is exactly the behavior you are seeing.

Put IModelJob job in a using block (this requires IModelJob to implement IDisposable).
This way it will fall out of scope IN the loop.

using (IModelJob job = _jobProvider.GetNextJob()) 
{

}

Please try the following as a diagnostic,
but don't leave the GC.Collect() call in production.

if (job != null)
{
...
}
GC.Collect();


using (IModelJob job = _jobProvider.GetNextJob()) 
{

}
GC.Collect();
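If you run such an experiment, it may help to log how much managed memory the forced collection actually frees. The following is a rough sketch for diagnostic use only: GcDiagnostics, SimulateFinishedJob and ForceCollectAndMeasure are hypothetical helpers, and the figure returned by GC.GetTotalMemory is an approximation.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GcDiagnostics
{
    // Stands in for a completed job: allocates about 20 MB and then
    // returns, leaving the array unreferenced.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void SimulateFinishedJob()
    {
        var data = new byte[20 * 1024 * 1024];
        data[0] = 1;
        GC.KeepAlive(data);
    }

    // Forces a full collection and returns the approximate number of
    // managed heap bytes it reclaimed.
    public static long ForceCollectAndMeasure()
    {
        long before = GC.GetTotalMemory(false);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        long after = GC.GetTotalMemory(false);
        return before - after;
    }

    public static void Main()
    {
        SimulateFinishedJob();
        long reclaimed = ForceCollectAndMeasure();
        Console.WriteLine("Reclaimed by forced collection: {0:N0} bytes", reclaimed);
    }
}
```

A large reclaimed figure while the service is idle suggests the memory was merely awaiting collection rather than being held by a live reference.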

Old link:

My object is not rooted, why wasn't it garbage collected?

Licensed under: CC-BY-SA with attribution