C# design for handling a high rate of async network operations that complete with a callback (hundreds per second)

https://softwareengineering.stackexchange.com/questions/418900

18-03-2021

Question

I am working with a message broker technology, to which events will be published (following an "event carried state transfer" architecture) for consumption by other applications.

The vendor library works as follows: a message is handed to a Session object (defined by the vendor's client library), which sends it over the network to the broker. The vendor's client code later calls an OnSessionEvent method, whose event args indicate whether the message was successfully published to the broker. A message may be initially accepted by the Session object but still fail to be published if, for example, the broker's buffer is full (typically a temporary condition).

The original source of events could easily raise them at a rate of up to thousands of messages per second.

To further complicate matters, several different event sources may publish to the same Session, so each OnSessionEvent response needs to be routed back to the appropriate publisher.
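One way to picture the routing requirement is a table of pending sends keyed by a correlation value. The sketch below is only an illustration: `CallbackRouter`, the `long` correlation key, and the simplified `OnSessionEvent(key, published)` signature are all hypothetical, since the vendor's real event args are not shown here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: remembers which publisher is waiting on which message.
public sealed class CallbackRouter
{
    private readonly ConcurrentDictionary<long, Action<bool>> _pending =
        new ConcurrentDictionary<long, Action<bool>>();
    private long _nextKey;

    // A publisher calls this just before handing its message to the Session.
    // The returned key is assumed to travel with the message and come back
    // in the vendor's callback (an assumption about the vendor API).
    public long Register(Action<bool> onResult)
    {
        long key = Interlocked.Increment(ref _nextKey);
        _pending[key] = onResult;
        return key;
    }

    // Simplified stand-in for the vendor's OnSessionEvent handler.
    // Returns false if no publisher was waiting on this key.
    public bool OnSessionEvent(long correlationKey, bool published)
    {
        if (_pending.TryRemove(correlationKey, out Action<bool> onResult))
        {
            onResult(published);
            return true;
        }
        return false;
    }
}
```

Each entry is removed as soon as its callback arrives, so the table only holds messages that are still in flight.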

In any case, my struggle right now is trying to figure out an appropriate pattern to efficiently send messages and handle the callbacks. It would of course be less than ideal to send a single message and then wait for the callback result before sending the next message, since the network response may take several milliseconds.

I could create a Task for each message send attempt, collect up a bunch of these tasks, and wait on them as a batch of, say, 100. This is clearly faster than one-message-at-a-time. However it would mean generating hundreds or possibly up to thousands of Tasks every second. Note that the vendor code does not natively expose the network operation as an async operation using a Task. In order to use this pattern, I would add a TaskCompletionSource object to the (local) message object, and SetResult on that TaskCompletionSource when the (local) message object is made available via the callback argument. I am concerned about the rate of object construction that this could cause.
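The TaskCompletionSource idea described above might look roughly like the following. This is a sketch under assumptions, not the vendor's API: `OutboundMessage`, `ISession`, `LoopbackSession`, and `BatchPublisher` are hypothetical names, and the loopback session only simulates a callback arriving on another thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical local message type carrying its own TaskCompletionSource.
public sealed class OutboundMessage
{
    public OutboundMessage(string payload)
    {
        Payload = payload;
        // RunContinuationsAsynchronously keeps awaiting code from running
        // inline on whatever thread the vendor uses for its callback.
        Completion = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);
    }

    public string Payload { get; }
    public TaskCompletionSource<bool> Completion { get; }
}

// Hypothetical stand-in for the vendor's Session.
public interface ISession
{
    void Send(OutboundMessage message);
}

// Fake session for demonstration: "publishes" on a thread-pool thread and
// reports failure for empty payloads.
public sealed class LoopbackSession : ISession
{
    public void Send(OutboundMessage message)
    {
        Task.Run(() => BatchPublisher.Complete(message, message.Payload.Length > 0));
    }
}

public static class BatchPublisher
{
    // Sends every message without waiting, then awaits the whole batch.
    // Returns the number of messages the broker did not accept.
    public static async Task<int> PublishBatchAsync(
        ISession session, IReadOnlyList<OutboundMessage> batch)
    {
        var pending = new Task<bool>[batch.Count];
        for (int i = 0; i < batch.Count; i++)
        {
            pending[i] = batch[i].Completion.Task;
            session.Send(batch[i]);
        }

        bool[] results = await Task.WhenAll(pending);
        int failures = 0;
        foreach (bool ok in results)
        {
            if (!ok) failures++;
        }
        return failures;
    }

    // Call this from the OnSessionEvent handler once the message is known.
    public static void Complete(OutboundMessage message, bool published)
    {
        message.Completion.TrySetResult(published);
    }
}
```

The allocation cost per message here is one TaskCompletionSource plus its Task, which is the object construction rate the question is worried about.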

I am hoping for advice or articles that discuss situations like this. I am also curious about the threading implications of asynchronous callbacks, so a comment or article that covers both would be ideal.

Solution

I don't yet have enough information to post a full answer, but I will start and hopefully refine it with input.

I would start by not assuming that creating Tasks will be slow. They are explicitly designed to be fast: .NET has an entire mechanism devoted to making them fast, and the language designers and other domain experts have told us that Task is fast.

https://gist.github.com/mgravell/878e7fb19ad2378941f810820b9e90b5:

                                       Method |          Mean |     StdDev |       Op/s | Scaled | Scaled-StdDev |  Gen 0 | Allocated |
--------------------------------------------- |--------------:|-----------:|-----------:|-------:|--------------:|-------:|----------:|
                               int/await/task | 4,808.1011 ns | 34.5216 ns |  207982.32 |   6.70 |          0.05 | 3.4438 |  21.61 kB |
                  int/manual/task/iscompleted | 2,490.7947 ns | 24.4512 ns |  401478.29 |   3.47 |          0.04 | 3.4786 |  21.82 kB |
      int/manual/task/iscompletedsuccessfully | 2,987.2058 ns | 25.0176 ns |     334761 |   4.16 |          0.04 | 3.4781 |  21.82 kB |
                          int/await/valuetask | 5,395.7703 ns | 27.8437 ns |  185330.35 |   7.52 |          0.05 |      - |      0 kB |
             int/manual/valuetask/iscompleted |   702.5910 ns |  4.0544 ns | 1423303.24 |   0.98 |          0.01 |      - |      0 kB |
 int/manual/valuetask/iscompletedsuccessfully |   702.9448 ns |  2.4655 ns | 1422586.73 |   0.98 |          0.01 |      - |      0 kB |
                                     int/sync |   717.4125 ns |  3.0914 ns | 1393898.17 |   1.00 |          0.00 |      - |      0 kB |

https://jeremybytes.blogspot.com/2017/12/how-does-task-in-c-affect-performance.html:

This processed the same number of records: 5000. The "Manhattan Classifier" took 164 seconds to complete, so it was a little bit faster. But the "Null Classifier" took no time at all (well, less than a second). At first, I thought I may have taken out a bit too much code. Maybe the continuations weren't really running. But each record was processed, and we can see that the error count is the same as before: 4515. The error count is incremented in the continuation, so I knew that was still running.

I wrote a test program on my laptop that starts a million tasks and waits for each of them to finish a no-op. It took about 4.7 seconds, so my rough benchmark is around 200,000 tasks/sec, which is 2-3 orders of magnitude more than you need so far. So it could just be a case of "don't worry about it, and code what's easiest". The code is as follows (the 5-second delay for "booting" is there because my Visual Studio likes to fling its windows around as it starts up and I wanted it to finish that nonsense):

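using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Give the IDE time to settle before timing anything.
        Console.WriteLine("booting");
        Thread.Sleep(5000);
        Console.WriteLine("go");
        var sw = Stopwatch.StartNew();

        // Queue one million thread-pool tasks, each doing effectively nothing.
        var tasks = new List<Task>(
            Enumerable.Range(1, 1000000)
                .Select(i => Task.Run(() =>
                {
                    // GetResult() on a yield awaiter returns immediately (a no-op).
                    Task.Yield().GetAwaiter().GetResult();
                })));

        // Block until every task has completed, then print elapsed milliseconds.
        Task.WaitAll(tasks.ToArray());
        Console.WriteLine(sw.ElapsedMilliseconds);
    }
}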

booting
go
4726
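The program above times Task.Run, whereas the pattern in the question allocates a TaskCompletionSource per message. A comparable probe for that case could look like the sketch below; `TcsAllocationProbe` is a hypothetical name, and the result will depend on your machine and runtime, so no figure is given here.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TcsAllocationProbe
{
    // Creates and completes `count` TaskCompletionSource objects and returns
    // the elapsed time in milliseconds.
    public static long Measure(int count)
    {
        var tasks = new Task[count];
        var sw = Stopwatch.StartNew();

        for (int i = 0; i < count; i++)
        {
            var tcs = new TaskCompletionSource<bool>(
                TaskCreationOptions.RunContinuationsAsynchronously);
            tcs.SetResult(true);   // stands in for the vendor callback firing
            tasks[i] = tcs.Task;
        }

        // All tasks are already complete; this just mirrors the wait above.
        Task.WaitAll(tasks);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

Calling `TcsAllocationProbe.Measure(1000000)` and printing the result gives a number to compare against the Task.Run timing.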

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange