Question

I am using parallel tasks for the first time instead of a traditional thread pool. In my application I allow the user to input the number of tasks started to complete the job (jobs can be very big). I noticed that if I allow any more than 10 or so tasks, the application starts to hang and I actually get worse performance because of the resources used.

I am wondering if there is any correlation between the number of processors and the maximum number of tasks, so that I can limit the maximum number of tasks for the user's PC and not slow it down.


Solution 2

The TPL will automatically change how tasks are scheduled and add or remove ThreadPool threads over time. This means that, given enough time and similar work, the default behavior should converge on the best option.

By default, it will start by using more threads than cores, since many tasks are not "pure CPU". Given that you're seeing extra tasks cause a slowdown, you likely either have resource contention (via locking), or your tasks are CPU bound, in which case having more tasks than processor cores will cause slowdowns. If this is going to be problematic, you can make a custom TaskScheduler that limits the number of tasks allowed at once, such as the LimitedConcurrencyLevelTaskScheduler from Microsoft's documentation. This allows you to limit the number of tasks to the number of processors in pure CPU scenarios.
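As a rough sketch (assuming the LimitedConcurrencyLevelTaskScheduler sample class from Microsoft's "Create a Task Scheduler That Limits Concurrency" documentation has been copied into the project, and a hypothetical DoCpuBoundWork method standing in for your job), limiting concurrency to the processor count could look like this:

using System;
using System.Threading.Tasks;

class ThrottledWork
{
    static void Main()
    {
        // Cap concurrency at the number of logical processors (pure CPU scenario).
        var scheduler = new LimitedConcurrencyLevelTaskScheduler(Environment.ProcessorCount);
        var factory = new TaskFactory(scheduler);

        var tasks = new Task[100];
        for (int i = 0; i < tasks.Length; i++)
        {
            int jobId = i; // capture a copy for the closure
            // All 100 tasks are queued up front, but the scheduler only lets
            // ProcessorCount of them run on ThreadPool threads at a time.
            tasks[i] = factory.StartNew(() => DoCpuBoundWork(jobId));
        }

        Task.WaitAll(tasks);
    }

    // Hypothetical placeholder for the caller's CPU-bound job.
    static void DoCpuBoundWork(int jobId)
    {
        // ... heavy computation for jobId ...
    }
}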

If your tasks are bound by other factors, such as IO, then you may need to profile to determine the best balance between the number of concurrently scheduled tasks and throughput, though this will be system specific.

OTHER TIPS

No, mostly because there is no single definition of a task. A task can be CPU intensive (the limit is roughly cores * some factor), IO intensive (the limit can be very low), or network intensive against a limited resource (which does not like handling 1000 requests at the same time).

So it is up to you as a programmer to use your brain and come up with a concept, then validate it and put it into your program, depending on what the task actually IS and where the bottlenecks are foreseen. Such planning can be complex - very complex - but most of the time it is quite simple.

Assuming that your tasks are CPU-intensive (i.e. they don't do a lot of I/O blocking such as reading files), you probably want to limit the number of parallel tasks to the number of CPU cores available to your application. For example, if your application is running on a computer with a quad-core processor (i.e. 4 cores), limit it to 4 simultaneous tasks.
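For example, a minimal sketch of capping a data-parallel loop at one task per core (ProcessItem here is a hypothetical stand-in for your per-item work):

using System;
using System.Linq;
using System.Threading.Tasks;

class CoreBoundLoop
{
    static void Main()
    {
        var items = Enumerable.Range(0, 1000);
        var options = new ParallelOptions
        {
            // On a quad-core machine this evaluates to 4 simultaneous tasks.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(items, options, item => ProcessItem(item));
    }

    static void ProcessItem(int item)
    {
        // ... CPU-bound work per item ...
    }
}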

If your tasks are limited by something other than the CPU (e.g. disk access, network access, etc.), then you'll need to figure out what share of that resource each task uses on average, expressed as a percentage. If you know that average, then the number of tasks you should run to fully utilize the resource is 100 / average. For example, if each task uses about 20% of the disk bandwidth on average, run 100 / 20 = 5 tasks.

I came up with this:

Generally this is the guideline:

number-of-tasks = (task-total-run-time / task-cpu-bounded-run-time) * number-of-cores

For real calculation:

number-of-tasks = Ceil(Avg(task-total-run-time / task-cpu-bounded-run-time) * Max((number-of-cores - 1), 1))


Explanation:

number-of-tasks - the number of tasks to run in parallel

task-total-run-time - total run time of the async method in milliseconds

task-cpu-bounded-run-time - the time that the async method uses the core(s) cpu in milliseconds

number-of-cores - the number of real or virtual cores (in the case of a Docker container, for example) that are available. One core is subtracted because it serves the main thread rather than the worker threads; for single-core environments the minimum is 1


Notes:

  • How to know Avg(task-total-run-time / task-cpu-bounded-run-time): measurements need to be taken in a real environment to figure out what the average time ratio is
  • There should also be an upper limit on the maximum number of tasks (for example 128), even if they are almost entirely not CPU bound; this should be taken into consideration as well


Example:

For 8 cores, with an average run that looks like this in terms of time:

public async Task UpdateScoreAsync(string requestId)
{
    var res = await GetFromDB(requestId).ConfigureAwait(false); // not cpu bounded, 10 seconds

    int resScore = CalculateScore(res); // cpu bounded, 20 seconds

    await UpdateScoreInDB(requestId, resScore).ConfigureAwait(false); // not cpu bounded, 20 seconds
}

So the number of tasks you can run that do this (if this is the only kind of task you have) is 18:

18 = Ceil(((10 + 20 + 20) / 20) * (8 - 1))
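As a sketch, the same guideline can be wrapped in a small helper; MaxTasksCap (128) reflects the upper limit suggested in the notes above and is only an example value, and the names are hypothetical:

using System;

static class TaskCountEstimator
{
    const int MaxTasksCap = 128;

    // Both run times must use the same time unit; only their ratio matters.
    public static int EstimateTaskCount(double avgTotalRunTime, double avgCpuBoundRunTime)
    {
        // One core is reserved for the main thread; never go below 1 worker core.
        int workerCores = Math.Max(Environment.ProcessorCount - 1, 1);
        double ratio = avgTotalRunTime / avgCpuBoundRunTime;
        return Math.Min((int)Math.Ceiling(ratio * workerCores), MaxTasksCap);
    }
}

// For the UpdateScoreAsync example on an 8-core machine:
// EstimateTaskCount(50, 20) == Ceil(2.5 * 7) == 18, well under the 128 cap.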