Question

I'm working with continuously updated ~20GB InterBase backups that I want to replicate over the internet. How can I minimise the data transferred?

I was considering using a binary diff tool, but I understand bsdiff requires at least O(7n) memory, and these InterBase backups only change incrementally anyway, produced over the LAN with InterBase's proprietary gbak tool. Is there any way I can hook into the Linux filesystem (ext*/Btrfs/...) to capture all changes made to this file, export them as a common diff format, and reassemble the file on a different (Windows) platform?


Solution

How about InterBase's incremental backup feature? You can run the incremental backup to a temporary dump location and then transfer only that incremental data to the offsite location. In any case, you will need to keep the initial full backup around, since the incremental backups are applied on top of it.

That keeps the amount of data that has to be transferred very small.
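
If your InterBase version supports online/incremental dumps via gbak, the workflow might look roughly like the sketch below. The -d (dump) switch, the credentials and the paths are illustrative assumptions only, so check the gbak documentation for your release before relying on them:

    # Sketch only -- the -d (dump) switch, credentials and paths are assumptions;
    # verify against your InterBase version's gbak documentation.

    # One-time full online dump into a local staging area:
    gbak -d -user SYSDBA -password masterkey /data/mydb.ib /staging/mydb.dmp

    # Later runs against the same dump file should write only the pages that
    # changed since the previous dump, producing a small increment that can be
    # shipped offsite on its own:
    gbak -d -user SYSDBA -password masterkey /data/mydb.ib /staging/mydb.dmp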

OTHER TIPS

You might be able to use rsync. If changes to the database happen to end up at the end of the backup file, it will work perfectly.

However, if the backup file gets heavily rewritten (i.e. many small chunks/rows inserted, deleted or modified at random positions), rsync will not do the job as well. It depends on how frequently you synchronise relative to the rate of insertions/deletions in your database.
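
For the rsync route, a minimal sketch might look like this; the host name and paths are placeholders, and the Windows end would need an rsync implementation such as cwRsync or rsync under Cygwin/WSL:

    # Sketch only -- host name and paths are placeholders, and the Windows side
    # needs its own rsync (e.g. cwRsync or rsync under Cygwin/WSL).
    # For a remote transfer rsync uses its rolling-checksum delta algorithm, so
    # only changed blocks plus metadata go over the wire; -z compresses the
    # literal data that is sent, --partial lets an interrupted transfer resume,
    # and --inplace updates the existing target file instead of rewriting it.
    rsync -avz --partial --inplace /backups/mydb.gbk backupuser@offsite-host:/backups/mydb.gbk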

There are tools such as xdelta which might help in this case: they use a windowed approach to the delta computation and can find common pieces much smaller than rsync's blocks, and thus preserve the common parts even with a higher density of changes. You'll need an 'old' copy and the latest backup to use this.
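
With xdelta3 the round trip could look like the following (file names are placeholders); xdelta3 binaries exist for both Linux and Windows, so the patch can be applied on the Windows side:

    # Sketch only -- file names are placeholders.

    # On the Linux side: encode the difference between the previous and the
    # newest backup into a (hopefully small) VCDIFF patch.
    xdelta3 -e -s mydb_old.gbk mydb_new.gbk mydb.vcdiff

    # Ship mydb.vcdiff offsite, then reconstruct the new backup on the Windows
    # side from the old copy plus the patch.
    xdelta3 -d -s mydb_old.gbk mydb.vcdiff mydb_new.gbk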

The good news is that the backup will probably be organised the same way each time it is produced (same table/row order), which helps both approaches.

Licensed under: CC-BY-SA with attribution