Question

The general question is what kind of mechanism can I use to transfer data to and from publishers and subscribers where publishers or subscribers can be permanently offline? Can message queues be used for this?

Possible approach

I am thinking of a message-queue-style approach where the online service (publisher) publishes messages to a queue online, the queue is copied offline, and the offline service (subscriber) then processes all incoming messages. I need to design a solution where the same applies with the roles reversed: the offline service is the publisher and the online service is the subscriber.

The data is physically transported between the networks by way of storage medium. Once the storage medium is plugged into a network a data transfer service reads the storage medium and moves all of the data from the storage medium to the queues.
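The "drain the storage medium into the queue" step can be sketched as a small transfer service. This is a minimal illustration, not the poster's actual implementation: it assumes messages are stored as one JSON file each on the medium, and the directory names are hypothetical.

```python
import shutil
from pathlib import Path

def drain_medium(medium_dir: Path, queue_dir: Path) -> int:
    """Move every message file from the plugged-in storage medium into
    the local queue directory; returns the number of messages moved."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    # Sorted by filename so messages are enqueued in the order written.
    for msg_file in sorted(medium_dir.glob("*.json")):
        shutil.move(str(msg_file), str(queue_dir / msg_file.name))
        moved += 1
    return moved

# Hypothetical mount points:
# drain_medium(Path("/mnt/usb/outbox"), Path("/var/offline-queue/inbox"))
```

Moving (rather than copying) the files makes re-plugging the same medium idempotent: already-transferred messages are no longer on it.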

Current approach

The way I currently do it is a simple table copy where

  1. The database has triggers that listen for inserts, updates, and deletes and record each event in a database event table (db_publish_events)
  2. Later, a data transfer service reads all published events from db_publish_events and copies each entire row to a JSON file tagged INSERT, UPDATE, or DELETE
  3. The JSON file is manually transported to the subscribing system and processed by a data transfer service there. Each record transferred in is recorded in a database event-receiving table (db_received_events).
  4. A JSON file of acknowledgment receipts for all events received by the subscriber is exported from the subscriber.
  5. The receipts file is sent back to the original publishing database, where each corresponding db_publish_events record is marked as received so the publisher stops sending it.
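The export in steps 1–2 can be sketched as follows. The table and column names (event_id, operation, payload, state) are illustrative guesses at the poster's schema, and SQLite stands in for whatever database is actually used:

```python
import json
import sqlite3

def export_pending_events(conn: sqlite3.Connection, out_path: str) -> int:
    """Write all unacknowledged rows from db_publish_events to a JSON
    file, tagging each record with its INSERT/UPDATE/DELETE operation."""
    rows = conn.execute(
        "SELECT event_id, operation, payload FROM db_publish_events "
        "WHERE state = 'PENDING' ORDER BY event_id"
    ).fetchall()
    events = [
        {"event_id": eid, "operation": op, "row": json.loads(payload)}
        for eid, op, payload in rows
    ]
    with open(out_path, "w") as f:
        json.dump(events, f)
    return len(events)
```

The import on the subscriber side would be the mirror image: read the file, apply each record, and append its event_id to db_received_events for the acknowledgment file.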

Pros

Simple table copy across a network

Cons

No data integrity across tables or business event boundaries since one record is transferred at a time.

Proposed solution

I am thinking of a solution where the entire business event is transferred as one message, captured before complex business-rule processing splits it across multiple tables.
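The difference between the two shapes can be shown with a hypothetical order event (the field names are made up for illustration). Row-level replication ships three unrelated records; event-level publishing ships one self-contained message that crosses the network as a unit:

```python
import json

# Row-level replication: three separate records with no transactional
# grouping -- the subscriber can observe the order without its lines.
row_messages = [
    {"table": "orders",      "op": "INSERT", "row": {"order_id": 42}},
    {"table": "order_lines", "op": "INSERT", "row": {"order_id": 42, "sku": "A1"}},
    {"table": "order_lines", "op": "INSERT", "row": {"order_id": 42, "sku": "B2"}},
]

# Event-level publishing: one business event; the subscriber's own
# business rules split it into tables atomically on arrival.
order_placed = {
    "event_type": "OrderPlaced",
    "event_id": "e-0001",
    "order": {
        "order_id": 42,
        "lines": [{"sku": "A1", "qty": 1}, {"sku": "B2", "qty": 3}],
    },
}

message = json.dumps(order_placed)  # a single unit to queue and transport
```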

Is there a way for commercial or open-source messaging software (e.g., RabbitMQ) to do a simple USB copy of messages, publishing them into an offline queue?


Solution

The general question is what kind of mechanism can I use to transfer data to and from publishers and subscribers where publishers or subscribers can be permanently offline? Can message queues be used for this?

There is a concept of durable queues in some messaging systems, e.g., in RabbitMQ. But if your subscriber is offline for quite a long time, these queues can obviously become overloaded. And if your application is data-intensive, that might happen pretty soon. Besides, queues get slower when they are overwhelmed (I guess that's a common trait of many messaging systems, not only RabbitMQ).

But the better option for you is Kafka. Kafka has the concept of a topic, which roughly maps to the concept of a queue in RabbitMQ. And it's perfectly fine to store data there. Here is a good intro to Kafka. Cited from there:

What makes Kafka unique is that Kafka treats each topic partition as a log (an ordered set of messages). Each message in a partition is assigned a unique offset. Kafka does not attempt to track which messages were read by each consumer and only retain unread messages; rather, Kafka retains all messages for a set amount of time, and consumers are responsible to track their location in each log. Consequently, Kafka can support a large number of consumers and retain large amounts of data with very little overhead.
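The log-plus-offsets idea in that quote, which is what makes Kafka suit long-offline consumers, can be modeled in a few lines. This is a toy illustration of the concept, not Kafka's API:

```python
class TopicPartition:
    """Toy model of a Kafka topic partition: an append-only log where
    each consumer tracks its own read offset (the broker tracks nothing)."""

    def __init__(self):
        self.log = []  # retained for all consumers, read or unread

    def append(self, message) -> int:
        """Append a message and return its offset in the log."""
        self.log.append(message)
        return len(self.log) - 1

    def read_from(self, offset: int):
        """Return all messages at or after `offset`, plus the next offset.
        The consumer itself is responsible for persisting that offset."""
        return self.log[offset:], len(self.log)
```

An offline subscriber just stores its last offset; when it reconnects (or when the log is carried over on a storage medium), it resumes from that offset while other consumers are unaffected.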

I'm not aware of the size and load of your system, but probably the simpler option could be plain HTTP callbacks. So if the subscriber is down, it's OK: the publisher gets an error (a connection failure or an HTTP 5xx response) and retries a couple of minutes later.
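That retry loop is simple to sketch. Here `callback` is a stand-in for whatever function actually POSTs the event to the subscriber and returns an HTTP status code; names and parameters are illustrative:

```python
import time

def deliver_with_retry(callback, payload, retries=3, backoff_seconds=0.0):
    """Try to deliver an event to the subscriber's HTTP callback; if the
    subscriber is down, wait and retry rather than losing the message."""
    for _ in range(retries):
        status = callback(payload)
        if 200 <= status < 300:
            return True
        time.sleep(backoff_seconds)  # e.g. a couple of minutes in production
    return False
```

Events that exhaust their retries would need to be parked somewhere durable (a dead-letter table or queue), which is where this approach stops being simpler than a real queue for a permanently offline subscriber.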

Licensed under: CC-BY-SA with attribution