Question

I am trying to figure out how an AMQP broker, such as RabbitMQ, can be integrated into our architecture. This should allow us to scale horizontally with little effort in the future.

For the sake of simplicity, let me ask the question via an analogy.

Say you have:

  • A set of bank accounts. We assume that the number of accounts can grow and shrink at any time.
  • A fixed pool of workers/consumers. These workers are responsible for executing actions on a bank account, such as a withdrawal.
  • A user interface, which lets the user initiate an action, such as a withdrawal.
  • An AMQP exchange.

When the user initiates a withdrawal, a message encapsulating this action is given to the AMQP exchange. The exchange puts the message in a queue, and it will be picked up by one of the available workers consuming this queue.

But what happens if the user 'immediately' initiates another withdrawal? The corresponding message might be picked up by another worker, resulting in two workers processing the same bank account simultaneously.

How do you synchronize between the workers to guarantee a happens-before relation? How do you prevent one worker from working on the same bank account as another at the same time?

As I am fairly new to architectural design, this might be a generic question. I am happy with any external useful resource (book, ...) that might help me here.


Solution

So, you have two issues here:

1: Prevent/Resolve Simultaneous actions (withdrawals) on a single resource (bank account)

2: Enforce processing actions (withdrawals) in a particular order

In your example, you could resolve 1 in a number of ways, but let's take the simplest. Processing the withdrawal calls ReduceBalance(accountId, amount) on some bank account service which processes requests synchronously.

You don't care which withdrawal happens first, because if the balance cannot cover both, whichever happens second will fail due to lack of balance.
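As a rough illustration of that idea, here is a minimal in-memory sketch in Python. The `AccountService` class, `InsufficientFunds` exception, and `run_demo` helper are hypothetical names made up for this example, and a single lock stands in for "processes requests synchronously"; a real service would more likely rely on a database transaction or similar.

```python
import threading


class InsufficientFunds(Exception):
    """Raised when a withdrawal would overdraw the account."""


class AccountService:
    """Hypothetical in-memory stand-in for the bank account service."""

    def __init__(self, balances):
        self._balances = dict(balances)
        self._lock = threading.Lock()  # serializes all balance updates

    def reduce_balance(self, account_id, amount):
        # The check and the update happen under one lock, so two workers
        # can never both pass the balance check for the same funds.
        with self._lock:
            balance = self._balances[account_id]
            if amount > balance:
                raise InsufficientFunds(account_id)
            self._balances[account_id] = balance - amount
            return self._balances[account_id]

    def balance(self, account_id):
        with self._lock:
            return self._balances[account_id]


def withdraw(service, account_id, amount, results):
    """What a worker does with one withdrawal message."""
    try:
        service.reduce_balance(account_id, amount)
        results.append("ok")
    except InsufficientFunds:
        results.append("rejected")


def run_demo():
    """Two workers try to withdraw 80 from an account holding 100."""
    service = AccountService({"acct-1": 100})
    results = []
    workers = [
        threading.Thread(target=withdraw, args=(service, "acct-1", 80, results))
        for _ in range(2)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sorted(results), service.balance("acct-1")
```

Whichever worker gets the lock first succeeds; the other is rejected, and the balance never goes negative regardless of the order.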

Resolving 2 is more complex. Let's say that there is some business logic governing the order in which the withdrawals must take place, for example: withdrawals from account X occurring in the same time window as some from account Y must happen first.

First you have to determine the order by applying this logic, which involves some form of communication between workers. I think this is best achieved by a MasterWorker (MW) process whose only job is to manage the various worker processes (WP) and route jobs to their queues.

  1. WP -> contacts MW and registers availability for work
  2. MW -> pulls jobs from the queue and applies the ordering logic (in this case, collect the jobs that occur in a time box and order them; this can be done by shunting them to a holding queue)
  3. MW -> sends the first of the ordered jobs to a WP and associates that WP with the queue of ordered jobs (i.e. remembers who is working on them)
  4. WP -> processes the job
  5. WP -> sends a job-complete message to MW and re-registers for work
  6. MW -> because it remembers which WP it assigned that job to, it knows the job is complete and the second job becomes processable
  7. MW -> sends job 2 to a WP....
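The steps above can be sketched as a small single-process model in Python. This is only an illustration under assumptions: the `MasterWorker` class, the `run_time_box` helper, and the `order_key` parameter are hypothetical, plain in-memory deques stand in for the AMQP queues, and job processing is simulated by recording the assignment.

```python
from collections import deque


class MasterWorker:
    """Hypothetical single-process model of the MW routing steps."""

    def __init__(self, order_key):
        self.order_key = order_key   # business ordering rule (assumed)
        self.idle_workers = deque()  # step 1: workers registered for work
        self.holding = deque()       # step 2: ordered jobs of one time box
        self.in_flight = None        # step 3: (worker, job) being processed

    def register(self, worker):
        # Steps 1 and 5: a worker announces that it is available.
        self.idle_workers.append(worker)
        return self.dispatch()

    def load_time_box(self, jobs):
        # Step 2: collect the jobs of one time box and order them.
        self.holding = deque(sorted(jobs, key=self.order_key))
        return self.dispatch()

    def dispatch(self):
        # Steps 3 and 7: hand out the next job only when nothing is in flight.
        if self.in_flight or not self.holding or not self.idle_workers:
            return None
        self.in_flight = (self.idle_workers.popleft(), self.holding.popleft())
        return self.in_flight

    def complete(self, worker):
        # Step 6: only the remembered worker may report completion.
        if self.in_flight is None or self.in_flight[0] != worker:
            raise ValueError("completion from a worker that holds no job")
        self.in_flight = None
        return self.register(worker)


def run_time_box(jobs, workers, order_key):
    """Drive one time box to completion and return the processing order."""
    mw = MasterWorker(order_key)
    for worker in workers:
        mw.register(worker)
    processed = []
    assignment = mw.load_time_box(jobs)
    while assignment is not None:
        worker, job = assignment
        processed.append((worker, job))   # step 4: the worker processes the job
        assignment = mw.complete(worker)  # steps 5 to 7
    return processed
```

For example, with jobs given as `(account, sequence)` tuples and the rule "account X first", `run_time_box([("Y", 2), ("X", 1), ("X", 3)], ["wp-a", "wp-b"], lambda job: (job[0] != "X", job[1]))` processes both X jobs before the Y job, one at a time, alternating between the two workers.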

All the various messages and state can be handled via queues, and distributed async processing is still possible, i.e. you can have several sets of messages on the go at once. The only tricky bit is ensuring that you only have one MW running at a time. RabbitMQ has a method of taking an exclusive lock on a queue which can be used, but in my view it is a bit flaky.

OTHER TIPS

There may be another solution besides the one proposed by Ewan, though his is excellent. Suppose you have a pool of queues. At any moment, a queue is either free (not being used at all) or assigned to a specific bank account.

You still need a master worker to allocate a queue for use, assign it to a bank account, and make it available to the next free subordinate worker.

Now assume as well that a worker who pulls a work item out of a queue uses a synchronization mechanism to ensure that no other worker can access that queue while the work item is being processed. After the work item is processed, the queue is released and any worker can work on it again.

The queue remains assigned to one bank account until it is empty. It then becomes free and returns to the free queue pool.
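A minimal in-memory sketch of this queue-pool idea follows, in Python. The `QueuePool` class and its `publish`, `take`, and `done` methods are hypothetical names for illustration; the `locked` set stands in for whatever synchronization mechanism is used, and the master worker's allocation role is folded into `publish`.

```python
from collections import deque


class QueuePool:
    """Hypothetical in-memory model of a pool of per-account queues."""

    def __init__(self, size):
        self.free = deque(deque() for _ in range(size))  # unassigned queues
        self.assigned = {}   # account_id -> queue holding that account's work
        self.locked = set()  # accounts whose queue a worker is processing

    def publish(self, account_id, item):
        # Master-worker role: allocate a free queue on first use.
        if account_id not in self.assigned:
            if not self.free:
                raise RuntimeError("no free queue available")
            self.assigned[account_id] = self.free.popleft()
        self.assigned[account_id].append(item)

    def take(self):
        # A worker takes one item from any assigned queue that is not locked.
        for account_id, queue in self.assigned.items():
            if account_id not in self.locked and queue:
                self.locked.add(account_id)
                return account_id, queue.popleft()
        return None

    def done(self, account_id):
        # Release the lock; an empty queue goes back to the free pool.
        self.locked.discard(account_id)
        if not self.assigned[account_id]:
            self.free.append(self.assigned.pop(account_id))
```

While one worker holds an item for an account, `take()` skips that account's queue, so a second withdrawal on the same account waits until `done()` is called, whereas work on other accounts proceeds in parallel.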

This is very similar to the solution proposed by Ewan. The main difference is that you will need to allocate more queues, and you can have fewer workers.

Licensed under: CC-BY-SA with attribution