How to read from multiple queues in real-world?

https://stackoverflow.com/questions/10964933

13-06-2021

Question

Here's a theoretical question:

When I'm building an application using message queueing, I'm going to need multiple queues to support different data types for different purposes. Let's assume I have 20 queues (e.g. one to create new users, one to process new orders, one to edit user settings, etc.).

I'm going to deploy this to Windows Azure using the 'minimum' of 1 web role and 1 worker role.

How does one read from all those 20 queues in a proper way? This is what I had in mind, but I have little or no real-world practical experience with this:

Create a class that spawns 20 threads in the worker role 'main' class. Let each of these threads execute a method to poll a different queue, and let all those threads sleep between each poll (of course with a back-off mechanism that increases the sleep time).

This leads to having 20 threads (or 21?) and 20 queues that are being actively polled, resulting in a lot of wasted messages (each time you poll an empty queue, it's billed as a message).

How do you solve this problem?

Solution

I read the other answers (very good answers) and wanted to put my own spin on this.

Sticking with Windows Azure Queues, as @Lucifure was describing: I really don't see the need for multiple queues except for two scenarios:

You want different priorities. The last thing you want is a high-priority message getting stuck behind hundreds of low-priority messages. Create a hi-pri queue for these.
The number of message reads+deletes is going to exceed the target of 500 transactions per second. In this case, create multiple queues, to spread the transaction volume across storage partitions (and a storage account will handle upwards of 5K transactions per second).

If you stick with a single queue (storage-based, not service bus), you can read blocks of messages at one time (up to 32). You can easily work up a format that helps you differentiate message type (maybe with a simple prefix). Then, just hand off the message to an appropriate thread for processing. Service Bus queues don't have multi-message reads, although they do allow for prefetch (which results in buffered messages being downloaded into a cache).
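To illustrate the prefix idea, here is a minimal sketch of dispatching messages from one batch by type. The "type:payload" format, the handler names, and the in-memory batch are all assumptions made up for this example; no Azure SDK calls are involved, and a real worker would get the bodies from the queue read instead.

```csharp
using System;
using System.Collections.Generic;

static class MessageDispatcher
{
    // Hypothetical convention: "<type>:<payload>", e.g. "order:42".
    static readonly Dictionary<string, Action<string>> Handlers =
        new Dictionary<string, Action<string>>
        {
            { "user",  payload => Console.WriteLine("create user " + payload) },
            { "order", payload => Console.WriteLine("process order " + payload) },
        };

    // Splits a message body into type prefix and payload, then runs the matching handler.
    // Returns false for a missing or unknown prefix so the caller can decide what to do.
    public static bool Dispatch(string body)
    {
        int sep = body.IndexOf(':');
        if (sep <= 0) return false;

        Action<string> handler;
        if (!Handlers.TryGetValue(body.Substring(0, sep), out handler)) return false;

        handler(body.Substring(sep + 1));
        return true;
    }

    static void Main()
    {
        // Stand-in for one batch read (a storage queue read returns up to 32 messages).
        var batch = new[] { "user:alice", "order:42", "refund:7" };
        foreach (string body in batch)
        {
            if (!Dispatch(body)) Console.WriteLine("no handler for: " + body);
        }
    }
}
```

The handlers run inline here to keep the sketch short; handing each one to a separate thread or task, as described above, would only change the call inside Dispatch.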

An advantage of one queue over many: you remove (or greatly reduce) the problem of "many queues having no messages, resulting in empty reads."

If you need more throughput, you can always crank up the number of threads doing the queue-reading and dispatching.

Remember that each delete is atomic; no batching. And as far as queue-polling goes: you're right to think about backoff. You don't need to back off after successfully reading a message (or chunk of messages). Just back off when you don't get anything after an attempt to read.
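A rough sketch of that backoff rule, assuming a doubling delay with a 1-second floor and a 32-second ceiling (both values are picked for illustration, not recommendations). The message counts in Main simulate successive reads; in a worker role they would come from the actual queue read.

```csharp
using System;

static class PollingBackoff
{
    static readonly TimeSpan MinDelay = TimeSpan.FromSeconds(1);   // assumed floor
    static readonly TimeSpan MaxDelay = TimeSpan.FromSeconds(32);  // assumed ceiling

    // Returns how long to sleep before the next poll.
    // A non-empty read resets the delay to zero; an empty read doubles it, capped at MaxDelay.
    public static TimeSpan NextDelay(TimeSpan current, int messagesRead)
    {
        if (messagesRead > 0) return TimeSpan.Zero;
        if (current == TimeSpan.Zero) return MinDelay;
        return TimeSpan.FromTicks(Math.Min(current.Ticks * 2, MaxDelay.Ticks));
    }

    static void Main()
    {
        // Simulated number of messages returned by six successive reads.
        int[] reads = { 3, 0, 0, 0, 5, 0 };
        TimeSpan delay = TimeSpan.Zero;
        foreach (int count in reads)
        {
            delay = NextDelay(delay, count);
            Console.WriteLine("read " + count + " -> wait " + delay.TotalSeconds + "s");
        }
    }
}
```

With these simulated reads the waits come out as 0, 1, 2, 4, 0 and 1 seconds: the delay only grows while reads keep coming back empty, and drops straight back to zero once messages show up.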

One nice advantage over Service Bus queues: Windows Azure queues provide you with an approximate message count (which is really helpful when considering scale-out to multiple instances). Service Bus queues don't provide this.

OTHER TIPS

An alternate strategy would be to use a single queue, or fewer queues, such that a queue could support more than one type of message. This approach is easier to manage and cheaper if your system architecture can support it.

In the real world, I have successfully used multiple queues (for scalability purposes), each read on a separate thread triggered by a timer event. Depending on the load on the queue and the application's needs, the timer interval was adjusted dynamically to service the queue more or less often.

If a back-off mechanism on storage queues isn't sufficient for you I suggest you consider Service Bus Queues. With Service Bus Queues you won't have to do such aggressive polling.

You would still need to implement a loop for polling the queue, but the receive timeout makes it lighter than a constantly polling mechanism you'd have when using storage queues.

In the following example I try to receive a message from the queue. If no message is found it will keep the connection open for 30 seconds to see if anything new comes in. If no message arrived after 30 sec, the Receive method will return null (and I would have a loop trying to call Receive again). Note that the maximum timeout is 24 days.

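// Requires: using Microsoft.ServiceBus; using Microsoft.ServiceBus.Messaging;
MessagingFactory factory = MessagingFactory.Create(ServiceBusEnvironment.CreateServiceUri("sb", ServiceNamespace, string.Empty), credentials);
QueueClient myQueueClient = factory.CreateQueueClient("TestQueue");
// Waits up to 30 seconds for a message; returns null if none arrives in that time.
BrokeredMessage message = myQueueClient.Receive(TimeSpan.FromSeconds(30));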

Spinning up a thread for each queue you want to read from is a good idea, but given the capacity limitations of the CLR thread pool you should also consider receiving messages asynchronously (using TaskFactory.FromAsync, for example): http://msdn.microsoft.com/en-us/library/windowsazure/hh851744.aspx
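As a rough illustration of the FromAsync pattern, the sketch below wraps a Begin/End receive pair in a Task. FakeQueueClient is a hypothetical stand-in written for this example so that it runs without the Service Bus SDK; its BeginReceive/EndReceive shape is only assumed to resemble the real client's, so check the linked page for the actual signatures.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in exposing a Begin/End receive pair; not the real Service Bus client.
class FakeQueueClient
{
    public IAsyncResult BeginReceive(TimeSpan serverWaitTime, AsyncCallback callback, object state)
    {
        var tcs = new TaskCompletionSource<string>(state);
        // Pretend a message arrives shortly, without blocking a thread in the meantime.
        Task.Delay(50).ContinueWith(_ =>
        {
            tcs.SetResult("hello");
            if (callback != null) callback(tcs.Task);
        });
        return tcs.Task; // Task implements IAsyncResult
    }

    public string EndReceive(IAsyncResult result)
    {
        return ((Task<string>)result).Result;
    }
}

static class Program
{
    static void Main()
    {
        var client = new FakeQueueClient();

        // FromAsync turns the Begin/End pair into a Task; no thread waits while the receive is pending.
        Task<string> receive = Task<string>.Factory.FromAsync<TimeSpan>(
            client.BeginReceive, client.EndReceive, TimeSpan.FromSeconds(30), null);

        // Wait() is used here only so the console program does not exit before the message arrives.
        receive.ContinueWith(t => Console.WriteLine("received: " + t.Result)).Wait();
    }
}
```

With one such task per queue, 20 pending receives do not tie up 20 threads; a thread is only needed when a message actually arrives and its continuation runs.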

Licensed under: CC-BY-SA with attribution
