سؤال

While I was using Parallel.ForEach in my program, I found that some threads never seemed to finish. In fact, it kept spawning new threads over and over, a behaviour that I wasn't expecting and definitely don't want.

I was able to reproduce this behaviour with the following code which, just like my 'real' program, both uses processor and memory a lot (.NET 4.0 code):

public class Node
{
    public Node Previous { get; private set; }

    public Node(Node previous)
    {
        Previous = previous;
    }
}

public class Program
{
    public static void Main(string[] args)
    {
        DateTime startMoment = DateTime.Now;
        int concurrentThreads = 0;

        var jobs = Enumerable.Range(0, 2000);
        Parallel.ForEach(jobs, delegate(int jobNr)
        {
            Interlocked.Increment(ref concurrentThreads);

            int heavyness = jobNr % 9;

            //Give the processor and the garbage collector something to do...
            List<Node> nodes = new List<Node>();
            Node current = null;
            for (int y = 0; y < 1024 * 1024 * heavyness; y++)
            {
                current = new Node(current);
                nodes.Add(current);
            }

            TimeSpan elapsed = DateTime.Now - startMoment;
            int threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
            Console.WriteLine("[{0:mm\\:ss}] Job {1,4} complete. {2} threads remaining.", elapsed, jobNr, threadsRemaining);
        });
    }
}

When run on my quad-core, it initially starts of with 4 concurrent threads, just as you would expect. However, over time more and more threads are being created. Eventually, this program then throws an OutOfMemoryException:

[00:00] Job    0 complete. 3 threads remaining.
[00:01] Job    1 complete. 4 threads remaining.
[00:01] Job    2 complete. 4 threads remaining.
[00:02] Job    3 complete. 4 threads remaining.
[00:05] Job    9 complete. 5 threads remaining.
[00:05] Job    4 complete. 5 threads remaining.
[00:05] Job    5 complete. 5 threads remaining.
[00:05] Job   10 complete. 5 threads remaining.
[00:08] Job   11 complete. 5 threads remaining.
[00:08] Job    6 complete. 5 threads remaining.
...
[00:55] Job   67 complete. 7 threads remaining.
[00:56] Job   81 complete. 8 threads remaining.
...
[01:54] Job  107 complete. 11 threads remaining.
[02:00] Job  121 complete. 12 threads remaining.
..
[02:55] Job  115 complete. 19 threads remaining.
[03:02] Job  166 complete. 21 threads remaining.
...
[03:41] Job  113 complete. 28 threads remaining.
<OutOfMemoryException>

The memory usage graph for the experiment above is as follows:

Processor and memory usage

(The screenshot is in Dutch; the top part represents processor usage, the bottom part memory usage.) As you can see, it looks like a new thread is being spawned almost every time the garbage collector gets in the way (as can be seen in the dips of memory usage).

Can anyone explain why this is happening, and what I can do about it? I just want .NET to stop spawning new threads, and finish the existing threads first...

هل كانت مفيدة؟

المحلول

You can limit the maximum number of threads that get created by specifying a ParallelOptions instance with the MaxDegreeOfParallelism property set:

var jobs = Enumerable.Range(0, 2000);
ParallelOptions po = new ParallelOptions
{ 
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.ForEach(jobs, po, jobNr =>
{
    // ...
});

As to why you're getting the behaviour you're observing: The TPL (which underlies PLINQ) is, by default, at liberty to guess the optimal number of threads to use. Whenever a parallel task blocks, the task scheduler may create a new thread in order to maintain progress. In your case, the blocking might be happening implicitly; for example, through the Console.WriteLine call, or (as you observed) during garbage collection.

From Concurrency Levels Tuning with Task Parallel Library (How Many Threads to Use?):

Since the TPL default policy is to use one thread per processor, we can conclude that TPL initially assumes that the workload of a task is ~100% working and 0% waiting, and if the initial assumption fails and the task enters a waiting state (i.e. starts blocking) - TPL with take the liberty to add threads as appropriate.

نصائح أخرى

You should probably read a bit about the how the task scheduler works.

http://msdn.microsoft.com/en-us/library/ff963549.aspx (latter half of the page)

"The .NET thread pool automatically manages the number of worker threads in the pool. It adds and removes threads according to built-in heuristics. The .NET thread pool has two main mechanisms for injecting threads: a starvation-avoidance mechanism that adds worker threads if it sees no progress being made on queued items and a hill-climbing heuristic that tries to maximize throughput while using as few threads as possible.

The goal of starvation avoidance is to prevent deadlock. This kind of deadlock can occur when a worker thread waits for a synchronization event that can only be satisfied by a work item that is still pending in the thread pool's global or local queues. If there were a fixed number of worker threads, and all of those threads were similarly blocked, the system would be unable to ever make further progress. Adding a new worker thread resolves the problem.

A goal of the hill-climbing heuristic is to improve the utilization of cores when threads are blocked by I/O or other wait conditions that stall the processor. By default, the managed thread pool has one worker thread per core. If one of these worker threads becomes blocked, there's a chance that a core might be underutilized, depending on the computer's overall workload. The thread injection logic doesn't distinguish between a thread that's blocked and a thread that's performing a lengthy, processor-intensive operation. Therefore, whenever the thread pool's global or local queues contain pending work items, active work items that take a long time to run (more than a half second) can trigger the creation of new thread pool worker threads."

You can mark a task as LongRunning but this has the side effect of allocating a thread for it from outside the thread pool which means that the task cannot be inlined.

Remember that the ParallelFor treats the work it is given as blocks so even if the work in one loop is fairly small the overall work done by the task invoked by the look may appear longer to the scheduler.

Most calls to the GC in and of them selves aren't blocking (it runs on a separate thread) but if you wait for GC to complete then this does block. Remember also that the GC is rearranging memory so this may have some side effects (and blocking) if you are trying to allocate memory while running GC. I don't have specifics here but I know the PPL has some memory allocation features specifically for concurrent memory management for this reason.

Looking at your code's output it seems that things are running for many seconds. So I'm not surprised that you are seeing thread injection. However I seem to remember that the default thread pool size is roughly 30 threads (probably depending on the number of cores on your system). A thread takes up roughly a MB of memory before your code allocates any more so I'm not clear why you could get an out of memory exception here.

I've posted the follow-up question "How to count the amount of concurrent threads in .NET application?"

If to count the threads directly, their number in Parallel.For() mostly ((very rarely and insignificantly decreasing) only increases and is not releleased after loop completion.

Checked this in both Release and Debug mode, with

ParallelOptions po = new ParallelOptions
{
  MaxDegreeOfParallelism = Environment.ProcessorCount
};

and without

The digits vary but conclusions are the same.

Here is the ready code I was using, if someone wants to play with:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

namespace Edit4Posting
{
public class Node
{

  public Node Previous { get; private set; }
  public Node(Node previous)
  {
    Previous = previous;
    }
  }
  public class Edit4Posting
  {

    public static void Main(string[] args)
    {
      int concurrentThreads = 0;
      int directThreadsCount = 0;
      int diagThreadCount = 0;

      var jobs = Enumerable.Range(0, 160);
      ParallelOptions po = new ParallelOptions
      {
        MaxDegreeOfParallelism = Environment.ProcessorCount
      };
      Parallel.ForEach(jobs, po, delegate(int jobNr)
      //Parallel.ForEach(jobs, delegate(int jobNr)
      {
        int threadsRemaining = Interlocked.Increment(ref concurrentThreads);

        int heavyness = jobNr % 9;

        //Give the processor and the garbage collector something to do...
        List<Node> nodes = new List<Node>();
        Node current = null;
        //for (int y = 0; y < 1024 * 1024 * heavyness; y++)
        for (int y = 0; y < 1024 * 24 * heavyness; y++)
        {
          current = new Node(current);
          nodes.Add(current);
        }
        //*******************************
        directThreadsCount = Process.GetCurrentProcess().Threads.Count;
        //*******************************
        threadsRemaining = Interlocked.Decrement(ref concurrentThreads);
        Console.WriteLine("[Job {0} complete. {1} threads remaining but directThreadsCount == {2}",
          jobNr, threadsRemaining, directThreadsCount);
      });
      Console.WriteLine("FINISHED");
      Console.ReadLine();
    }
  }
}
مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top