Question

Currently, I have a large number of C# computations (method calls) residing in a queue that will be run sequentially. Each computation will use some high-latency service (network, disk...).

I was going to use Mono coroutines to allow the next computation in the computation queue to continue while a previous computation is waiting for the high latency service to return. However, I prefer to not depend on Mono coroutines.

Is there a design pattern that's implementable in pure C# that will enable me to process additional computations while waiting for high latency services to return?

Thanks

Update:

I need to execute a huge number (>10000) of tasks, and each task will be using some high-latency service. On Windows, you can't create that many threads.

Update:

Basically, I need a design pattern that emulates the advantages (as follows) of tasklets in Stackless Python (http://www.stackless.com/)

  1. Huge # of tasks
  2. If a task blocks the next task in the queue executes
  3. No wasted CPU cycles
  4. Minimal overhead switching between tasks

Solution

You can simulate cooperative microthreading using IEnumerable. Unfortunately this won't work with blocking APIs, so you need to find APIs that you can poll, or which have callbacks that you can use for signalling.

Consider a method

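// requires: using System; using System.Collections;
IEnumerable Thread ()
{
    //do some stuff
    Foo ();

    //co-operatively yield
    yield return null;

    //do some more stuff
    Bar ();

    //sleep 2 seconds (new TimeSpan (2000) would be 2000 ticks, i.e. 200 microseconds)
    yield return TimeSpan.FromSeconds (2);
}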

The C# compiler will compile this into a state machine - but the appearance is that of a co-operative microthread.

The pattern is quite straightforward. You implement a "scheduler" that keeps a list of all the active IEnumerators. As it cycles through the list, it "runs" each one using MoveNext (). If MoveNext returns false, the thread has ended, and the scheduler removes it from the list. If it returns true, the scheduler reads the Current property to determine the current state of the thread. If it's a TimeSpan, the thread wishes to sleep, and the scheduler moves it onto a separate queue whose entries are moved back into the main list once their sleep timespans have elapsed.
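A minimal sketch of such a scheduler might look like the following. It is not the author's implementation; the names MicroThreadScheduler, Start, HasWork, RunOnce and SchedulerDemo are invented here for illustration, and only the two signals described above (a plain yield and a TimeSpan sleep) are handled.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical scheduler; all names are illustrative assumptions.
public sealed class MicroThreadScheduler
{
    private readonly List<IEnumerator> active = new List<IEnumerator>();
    // Sleeping microthreads paired with the time at which they should resume.
    private readonly List<KeyValuePair<DateTime, IEnumerator>> sleeping =
        new List<KeyValuePair<DateTime, IEnumerator>>();

    public void Start(IEnumerable microThread)
    {
        active.Add(microThread.GetEnumerator());
    }

    public bool HasWork
    {
        get { return active.Count > 0 || sleeping.Count > 0; }
    }

    // One pass: wake expired sleepers, then advance each active microthread by one step.
    public void RunOnce()
    {
        DateTime now = DateTime.UtcNow;
        for (int i = sleeping.Count - 1; i >= 0; i--)
        {
            if (sleeping[i].Key <= now)
            {
                active.Add(sleeping[i].Value);
                sleeping.RemoveAt(i);
            }
        }

        int index = 0;
        while (index < active.Count)
        {
            IEnumerator thread = active[index];
            if (!thread.MoveNext())
            {
                active.RemoveAt(index);          // the microthread has finished
            }
            else if (thread.Current is TimeSpan)
            {
                active.RemoveAt(index);          // park it until its wake-up time
                sleeping.Add(new KeyValuePair<DateTime, IEnumerator>(
                    now + (TimeSpan)thread.Current, thread));
            }
            else
            {
                index++;                         // plain yield (e.g. null): stay active
            }
        }
    }
}

public static class SchedulerDemo
{
    private static IEnumerable Worker(string name, List<string> log)
    {
        log.Add(name + ":1");
        yield return null;                       // let the other microthreads run
        log.Add(name + ":2");
    }

    // Runs two workers to completion and returns the order in which their steps ran.
    public static List<string> Run()
    {
        var log = new List<string>();
        var scheduler = new MicroThreadScheduler();
        scheduler.Start(Worker("a", log));
        scheduler.Start(Worker("b", log));
        while (scheduler.HasWork) scheduler.RunOnce();
        return log;
    }
}
```

Note that the `while (scheduler.HasWork)` loop spins when every microthread is asleep; a scheduler that must not waste CPU cycles would instead block until the earliest wake-up time or until an external signal arrives.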

You can use other return objects to implement other signalling mechanisms. For example, define some kind of WaitHandle. If the thread yields one of these, it can be moved to a waiting queue until the handle is signalled. Or you could support WaitAll by yielding an array of wait handles. You could even implement priorities.

I did a simple implementation of this scheduler in about 150 lines of code but I haven't got round to blogging the code yet. It was for our PhyreSharp PhyreEngine wrapper (which won't be public), where it seems to work pretty well for controlling a couple of hundred characters in one of our demos. We borrowed the concept from the Unity3D engine -- they have some online docs that explain it from a user point of view.

OTHER TIPS

I'd recommend using the Thread Pool to execute multiple tasks from your queue at once in manageable batches using a list of active tasks that feeds off of the task queue.

In this scenario your main worker thread would initially pop N tasks from the queue into the active tasks list to be dispatched to the thread pool (most likely using QueueUserWorkItem), where N represents a manageable amount that won't overload the thread pool, bog your app down with thread scheduling and synchronization costs, or suck up available memory due to the combined I/O memory overhead of each task.

Whenever a task signals completion to the worker thread, you can remove it from the active tasks list and add the next one from your task queue to be executed.

This will allow you to have a rolling set of N tasks from your queue. You can manipulate N to affect the performance characteristics and find what is best in your particular circumstances.
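As a rough sketch of this rolling set of N tasks, assuming each task can be represented as an Action and that the caller is happy to block until the queue drains: the class name RollingBatchRunner and the parameter maxActive are hypothetical, and exceptions thrown by a task are deliberately not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper; names are illustrative assumptions.
public static class RollingBatchRunner
{
    // Runs every action in 'tasks' on the thread pool, keeping at most
    // 'maxActive' in flight, and blocks until all of them have completed.
    public static void Run(Queue<Action> tasks, int maxActive)
    {
        if (maxActive < 1) throw new ArgumentOutOfRangeException("maxActive");
        object gate = new object();
        int active = 0;

        lock (gate)
        {
            while (tasks.Count > 0 || active > 0)
            {
                // Top up the active set from the task queue.
                while (active < maxActive && tasks.Count > 0)
                {
                    Action task = tasks.Dequeue();
                    active++;
                    ThreadPool.QueueUserWorkItem(_ =>
                    {
                        // Note: an exception in task() is not caught in this sketch.
                        try { task(); }
                        finally
                        {
                            lock (gate)
                            {
                                active--;
                                Monitor.Pulse(gate);   // signal completion to the worker thread
                            }
                        }
                    });
                }

                // Release the lock and sleep until some task signals completion.
                if (active > 0) Monitor.Wait(gate);
            }
        }
    }
}
```

Tuning N then amounts to changing the maxActive argument, for example `RollingBatchRunner.Run(queue, 4)`.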

Since you are ultimately bottlenecked by hardware operations (disk I/O and network I/O, CPU) I imagine smaller is better. Two thread pool tasks working on disk I/O most likely won't execute faster than one.

You could also make the size and contents of the active task list flexible by restricting it to a set number of each type of task. For example, if you are running on a machine with 4 cores, you might find that the highest performing configuration is four CPU-bound tasks running concurrently along with one disk-bound task and a network task.

If you already have one task classified as a disk IO task, you may choose to wait until it is complete before adding another disk IO task, and you may choose to schedule a CPU-bound or network-bound task in the meanwhile.

Hope this makes sense!

PS: Do you have any dependencies on the order of tasks?

You should definitely check out the Concurrency and Coordination Runtime. One of their samples describes exactly what you're talking about: you call out to long-latency services, and the CCR efficiently allows some other task to run while you wait. It can handle a huge number of tasks because it doesn't need to spawn a thread for each one, though it will use all your cores if you ask it to.

Isn't this a conventional use of multi-threaded processing?

Have a look at patterns such as Reactor.

Writing it to use Async IO might be sufficient.

This can lead to nasty, hard-to-debug code without strong structure in the design.

You should take a look at this:

http://www.replicator.org/node/80

This should do exactly what you want. It is a hack, though.

Some more information about the "Reactive" pattern (as mentioned by another poster) with respect to an implementation in .NET; aka "Linq to Events"

http://themechanicalbride.blogspot.com/2009/07/introducing-rx-linq-to-events.html

-Oisin

In fact, if you use one thread per task, you will lose the game. Think about why Node.js can support a huge number of connections: it uses a small number of threads with async I/O. The async and await keywords can help with this.

foreach (var task in tasks)
{
    await SendAsync(task.value);
    await ReadAsync();
}

SendAsync() and ReadAsync() are placeholder functions standing in for async I/O calls.

Task parallelism is also a good choice. But I am not sure which one is faster. You can test both of them in your case.
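A loop that awaits inside its body still handles one item at a time. To overlap the waits, one option is to start every operation first and then await them together. The sketch below assumes a hypothetical CallServiceAsync that stands in for the high-latency service (simulated here with Task.Delay, which uses a timer rather than a blocked thread); AsyncFanOut and ProcessAllAsync are likewise invented names.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical example; names are illustrative assumptions.
public static class AsyncFanOut
{
    // Stand-in for a high-latency call. No thread is blocked while the delay is pending.
    private static async Task<int> CallServiceAsync(int input)
    {
        await Task.Delay(100);
        return input * 2;
    }

    // Starts 'count' calls without waiting in between, then awaits all of them.
    // Results come back in the same order as the inputs.
    public static async Task<int[]> ProcessAllAsync(int count)
    {
        Task<int>[] pending = Enumerable.Range(0, count)
            .Select(i => CallServiceAsync(i))
            .ToArray();
        return await Task.WhenAll(pending);
    }
}
```

With a real service you would probably also cap the number of outstanding calls so that the remote end is not flooded with more than 10000 simultaneous requests.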

Yes, of course you can. You just need to build a dispatcher mechanism that puts the lambdas you provide into a queue and calls them back later. All the code I write in Unity uses this approach and I never use coroutines. I wrap methods that use coroutines, such as the WWW stuff, just to get rid of them. In theory, coroutines can be faster because there is less overhead. In practice, they introduce new syntax to a language to do a fairly trivial task, and furthermore you can't follow the stack trace properly on an error in a coroutine because all you'll see is ->Next. You'll then have to implement the ability to run the tasks in the queue on another thread. However, there are parallel functions in the latest .NET, and you'd essentially be writing similar functionality. It wouldn't be many lines of code really.
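The poster's code is not shown, so the following is only a guess at the general shape of such a dispatcher: callbacks are posted from any thread (for example from an I/O completion) and run later by whichever thread owns the queue. Dispatcher, Post and Pump are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical dispatcher; names are illustrative assumptions.
public sealed class Dispatcher
{
    private readonly ConcurrentQueue<Action> callbacks = new ConcurrentQueue<Action>();

    // Safe to call from any thread.
    public void Post(Action callback)
    {
        if (callback == null) throw new ArgumentNullException("callback");
        callbacks.Enqueue(callback);
    }

    // Called by the owning thread. Runs callbacks until the queue is empty,
    // including any that are posted while pumping, and returns how many ran.
    public int Pump()
    {
        int ran = 0;
        Action next;
        while (callbacks.TryDequeue(out next))
        {
            next();
            ran++;
        }
        return ran;
    }
}
```

A computation is then written as a chain of callbacks: start the high-latency request, and have its completion handler Post the continuation back to the dispatcher.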

If anyone is interested I would send the code, don't have it on me.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow