Question

I work on a large and old application consisting of a server and a fat client. Part of what the application does is handle a largish (a few hundred MB) database of frequently changing data (about a dozen rows per second). Because of the size, there is a master copy of the database on the server and a local copy on every client, and they need to be kept in sync. Changes can be caused either by outside events arriving at the server or by user interactions. In the first case, the server updates the master DB and propagates the change to the clients. In the second case, the client sends a request-to-change to the server; the server updates the DB and propagates the change to all clients, including the original one. In addition, clients have an "undo" feature that allows the user to reverse the last few changes they caused.

We use JPA/Hibernate as an ORM layer between the database and our code, on both the client and the server side, but the two sides use different database backends.

At the moment, our solution is old, half-baked legacy code: diffs between objects are calculated based on string representations of their attributes. The corresponding old/new pairs of string values are distributed for syncing and stored in a separate table for later undos. Lots of things can, and sometimes do, go wrong the way it is now.

What is the preferred way of doing this? I've looked through some Hibernate docs and tutorials, but there seems to be no ready-made solution with JPA that does this out of the box. I could probably design something less half-baked (maybe three-quarters-baked) with @Audit and Entity Listeners. But I'm assuming that some smart people have already come up with a design pattern that realises this. Can someone please point me in the right direction?

No correct solution

Answers

Sounds like a nightmare.

In theory, if you publish the events to a message queue, you can have them processed against the master database and have the undo events work fine.

However, with multiple clients all processing transactions, I'm not sure I would trust it not to go out of sync. Really, each client would need to get a lock on all databases before writing to maintain atomicity (which, I assume, would kill any performance advantage).

I think I would go with making the local DBs read-only caches and having all writes processed by an API which writes to the master DB.

This will allow you to use simple one-way replication from the master, and undos will work fine.
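The single-write-path architecture above can be sketched roughly as follows. This is a minimal illustration with hypothetical names (`WriteApi`, `ReplicaCache`, `Change`), not the actual application's classes; the real system would replicate over the network and through Hibernate rather than in-memory maps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A change request, reduced to a key/value update for illustration.
class Change {
    final String key, newValue;
    Change(String key, String newValue) { this.key = key; this.newValue = newValue; }
}

// The client-side copy: receives replicated changes, but clients only read it.
class ReplicaCache {
    final Map<String, String> data = new HashMap<>();
    void apply(Change c) { data.put(c.key, c.newValue); }  // replication only
    String read(String key) { return data.get(key); }      // clients may only read
}

// The single write path: every change goes to the master first,
// then is pushed to all replicas (one-way replication).
class WriteApi {
    private final Map<String, String> master = new HashMap<>();
    private final List<ReplicaCache> replicas = new ArrayList<>();

    void register(ReplicaCache r) { replicas.add(r); }

    void submit(Change c) {
        master.put(c.key, c.newValue);            // master is updated first
        for (ReplicaCache r : replicas) r.apply(c); // then fan out to every client
    }
}
```

Because the master is the only writer, every client sees the same ordered stream of changes, which is what makes undo tractable.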

If the local DBs need to see local writes before getting a sync from the master, then you have the bigger problem that they do not see the updates from other clients, which they need, and your system is already broken.

This would be a step on the way to getting rid of the local DBs entirely and running everything through the API, which can cache, invalidate, and scale better than a central DB.

But I imagine that is the "Big Rewrite" option, which is never palatable.

OK, I'm going to add a second answer on the assumption that replicating the data isn't really the problem at all; it's applying an undo operation.

First off, you have to take into account that, given your "external world" changes, an undo may be impossible. If you have sent the email, it's sent; you can't undo it.

However, given that you can write some rules to handle those cases, the general case can be handled by variations of:

  • Command pattern
  • Event Sourcing
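A minimal sketch of the Command pattern for this use case, with hypothetical names (`Command`, `SetValue`, `UndoStack`): each change knows both how to apply itself and how to reverse itself, so the client keeps a stack of executed commands and pops it to undo.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// A reversible change to some keyed state.
interface Command {
    void apply(Map<String, String> state);
    void revert(Map<String, String> state);
}

class SetValue implements Command {
    private final String key, newValue;
    private String oldValue;  // captured on apply, needed by revert

    SetValue(String key, String newValue) { this.key = key; this.newValue = newValue; }

    public void apply(Map<String, String> state) {
        oldValue = state.get(key);
        state.put(key, newValue);
    }
    public void revert(Map<String, String> state) {
        if (oldValue == null) state.remove(key);
        else state.put(key, oldValue);
    }
}

// The per-client undo history: a stack of executed commands.
class UndoStack {
    private final Deque<Command> done = new ArrayDeque<>();
    void execute(Command c, Map<String, String> state) { c.apply(state); done.push(c); }
    void undo(Map<String, String> state) { if (!done.isEmpty()) done.pop().revert(state); }
}
```

Note that this replaces the string-diff table from the question: the old value is captured as part of executing the command, so the reverse operation is always consistent with what was actually applied.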

I won't go into too much detail, but essentially both of these patterns keep track of changes to the object rather than just the object's state.

Thus you can recreate the object at any point in time by simply replaying the changes, and hence "undo" changes.

For example:

  1. new order
  2. add PS5
  3. add plug
  4. set address
  5. amend card holder name
  6. remove plug
  7. add furbie
  8. make payment
  9. send order to delivery centre
  10. post order

So you can see that by storing each event or command, you can recreate the object by simply replaying the events from the start to the desired point.
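The replay idea can be sketched in a few lines. This is a toy event-sourcing example with hypothetical names (`OrderEvent`, `EventLog`), using the order list above; a real system would persist the log and snapshot periodically.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A recorded change: e.g. ADD/REMOVE plus an item name.
class OrderEvent {
    final String type, item;
    OrderEvent(String type, String item) { this.type = type; this.item = item; }
}

// The order's state is never stored directly; it is derived from events.
class Order {
    final Set<String> items = new HashSet<>();
    void apply(OrderEvent e) {
        if ("ADD".equals(e.type)) items.add(e.item);
        if ("REMOVE".equals(e.type)) items.remove(e.item);
    }
}

class EventLog {
    private final List<OrderEvent> events = new ArrayList<>();
    void record(OrderEvent e) { events.add(e); }

    // Rebuild the order as it was after the first `upTo` events.
    // "Undo" is just replaying to an earlier point in the log.
    Order replay(int upTo) {
        Order order = new Order();
        for (int i = 0; i < upTo && i < events.size(); i++) order.apply(events.get(i));
        return order;
    }
}
```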

Obviously, if your events had real-world consequences such as making a payment or posting a package, recreating the object as it was before those events isn't helpful. You have to apply a reverse event: a repayment or a returns process, etc.

But for things like undoing adding an item or editing a document, it works well.

But I'm assuming that some smart people have already come up with some design pattern that realises this.

And they have. The Memento pattern was one of the original design patterns in the GoF book of the same name. As stated in the Wikipedia article:

The memento pattern is a software design pattern that provides the ability to restore an object to its previous state (undo via rollback).

It's not specific to your situation but, conceptually, it provides both an explanation of the kinds of issues you are running into and strategies for solving them.
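For reference, here is a minimal Memento sketch with hypothetical names (`RecordEditor` as the originator, `History` as the caretaker): the originator snapshots its own state into an opaque memento, and the caretaker stores mementos without inspecting them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// The "originator": owns the state and knows how to snapshot/restore it.
class RecordEditor {
    private Map<String, String> fields = new HashMap<>();

    void set(String key, String value) { fields.put(key, value); }
    String get(String key) { return fields.get(key); }

    Memento save() { return new Memento(new HashMap<>(fields)); }
    void restore(Memento m) { fields = new HashMap<>(m.snapshot); }

    // Opaque snapshot: only RecordEditor can create or read it.
    static class Memento {
        private final Map<String, String> snapshot;
        private Memento(Map<String, String> snapshot) { this.snapshot = snapshot; }
    }
}

// The "caretaker": stores mementos but never looks inside them.
class History {
    private final Deque<RecordEditor.Memento> stack = new ArrayDeque<>();
    void push(RecordEditor.Memento m) { stack.push(m); }
    RecordEditor.Memento pop() { return stack.pop(); }
}
```

Unlike the diff-based approach, a memento captures whole-state snapshots, so restoring one cannot be corrupted by a mis-computed diff; the trade-off is the memory cost of the snapshots.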

The general approach of storing a series of diffs seems potentially aligned with the above. If there were only one client applying changes, I think this could be made to work. The big challenge is when you have other users making changes to the same or related entities. What does an undo mean for the other clients' changes, especially when their change depended on a change that another user has now undone?

What specific kinds of issues occur? Are they related to multiple users making modifications? Is there a flaw in the string representation of the object and/or the interpretation of the associated diffs? Something else?

Licensed under: CC-BY-SA with attribution