LMAX Replicator Design - How to support high availability?

Question

Your intuition is correct about folding the replication and journalling into the Raft component. But, the Raft protocol dictates exactly when things need to be stored on disk.

Here are two different ways to look at it.

I'm assuming there is no is hefty computation, such as a transaction processing, before the replication because you don't have any in your diagrams.

I, personally, would do the first because it separates concerns into different processes. If I was implementing Raft for myself I would take the first half of the second scenario and put it in its own process.

External Raft Replication

In which Raft is implement by an external process.

The replication component outsources to an external Raft process the business of replication. After some time, Raft responds to the replication component that it is, in fact, replicated. The replication component updates the items in the ring buffer, and moves its published cursor forward. The business logic sees the published cursor (via waitFor) and consumes the freshly replicated data.

In this scenario, the replication component probably has a lot of inflight events, so it's read cursor is far ahead of the cursor it publishes to the business logic.

There is no need for a journalling component in this scenario because the external raft system does the journalling for you.

Note, the replication may be the slowest component of the system!

Integrated Raft Replication

In which raft is implemented in the same process as the "Real Business Logic."

In terms of Raft, replication is the business logic. Actually, you have multiple levels of business logic, or equivalently, multiple stages of business logic.

I'm going to use two input disruptors and two output disruptors for this to emphasize the separate business logic. You can combine, split, or rearrange to your heart's content. Or your profiler's content.

The first stage, as I mentioned, is Raft replication. Client events go into the Replication Input Disruptor. The Raft logic picks it up, perhaps in batches, and sends out to the Followers on the Replication Output Disruptor. All Raft messages also go into the Replication Input Disruptor. The Raft logic also picks these up and sends the appropriate responses to the appropriate Followers/Master on the Replication Output Disruptor).

A journaller component hangs off the Input Ring Buffer; it only has to handle certain types of messages as dictated by Raft. This will likely be the slowest part of the system.

When the data is considered replicated, it is moved to the second stage, via the "Real Business Logic" Input Disruptor. There it is processed, sent to the Client Outbound Disruptor, and then sent to one of your millions of happy paying customers.