Question

I have a set of geographically remote nodes with heterogeneous operating systems that need to transfer files and updates to one another using a Java program I am writing. At present I have to send the entire file again whenever it changes. Is there a way to determine which sections of a file differ and send only those? (Note that these files are not necessarily text; they could be in any format.) The only way I can think of is to split the file into blocks, hash the blocks, and send the hashes back to the requester, which then requests only the blocks it needs. For small blocks and large files, however, this is a large overhead. Is there any way to send a single message describing my file that can be analysed to produce a list of the blocks that need to be transmitted?

Most digest functions are designed so that a small change to the data results in a large change across the whole hash output. I basically need the reverse of this, and it must work on all operating systems.


Solution

If I understand your question correctly, you need to keep files in sync between two systems. There is a tool called rsync that can synchronize two files (or whole directories) by sending only the parts of each file that have changed.

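You may also be interested in the rsync algorithm itself, which is close to the block-hash scheme you describe: the receiver sends a weak rolling checksum and a strong hash for each block of its copy, and the sender slides a window over its own copy to find matching blocks at any byte offset, so only the unmatched bytes are transmitted.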
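Since the program is in Java, here is a minimal sketch of the rsync idea, not rsync's actual implementation or wire format. Every name in it (`DeltaSketch`, `Op`, `index`, `delta`, `patch`) is hypothetical, the block size is fixed, and SHA-256 is swapped in as the strong hash purely for convenience. The node holding the old copy builds per-block signatures; the node holding the new copy scans its data with a rolling checksum and emits either "reuse block N" or literal bytes.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DeltaSketch {
    static final int MOD = 1 << 16;

    /** One delta instruction: reuse an old block, or insert literal bytes. */
    static final class Op {
        final int block;       // index of a block in the old file, or -1 for a literal
        final byte[] literal;  // bytes to insert, or null for a block reference
        Op(int block, byte[] literal) { this.block = block; this.literal = literal; }
    }

    /** Weak checksum of d[off, off+len): a = byte sum, b = position-weighted sum. */
    static int weak(byte[] d, int off, int len) {
        int a = 0, b = 0;
        for (int i = 0; i < len; i++) {
            a = (a + (d[off + i] & 0xff)) % MOD;
            b = (b + a) % MOD;
        }
        return (b << 16) | a;
    }

    /** Slides the window one byte to the right in O(1): drops 'out', appends 'in'. */
    static int roll(int sum, int len, byte out, byte in) {
        int a = sum & 0xffff, b = sum >>> 16;
        int o = out & 0xff, n = in & 0xff;
        a = Math.floorMod(a - o + n, MOD);
        b = Math.floorMod(b - len * o + a, MOD);
        return (b << 16) | a;
    }

    static byte[] strong(byte[] d, int off, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(d, off, len);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Receiver side: signatures of every full block of the old file. */
    static Map<Integer, List<Integer>> index(byte[] old, int bs, List<byte[]> strongOut) {
        Map<Integer, List<Integer>> byWeak = new HashMap<>();
        for (int i = 0; (i + 1) * bs <= old.length; i++) {
            byWeak.computeIfAbsent(weak(old, i * bs, bs), k -> new ArrayList<>()).add(i);
            strongOut.add(strong(old, i * bs, bs));
        }
        return byWeak;
    }

    /** Sender side: needs only the signatures and its own new data. */
    static List<Op> delta(Map<Integer, List<Integer>> byWeak, List<byte[]> strongs,
                          byte[] fresh, int bs) {
        List<Op> ops = new ArrayList<>();
        int litStart = 0, pos = 0, sum = 0;
        boolean haveSum = false;
        while (pos + bs <= fresh.length) {
            if (!haveSum) { sum = weak(fresh, pos, bs); haveSum = true; }
            int match = -1;
            List<Integer> candidates = byWeak.get(sum);
            if (candidates != null) {            // weak hit: confirm with the strong hash
                byte[] s = strong(fresh, pos, bs);
                for (int c : candidates) {
                    if (Arrays.equals(s, strongs.get(c))) { match = c; break; }
                }
            }
            if (match >= 0) {
                if (litStart < pos) ops.add(new Op(-1, Arrays.copyOfRange(fresh, litStart, pos)));
                ops.add(new Op(match, null));
                pos += bs;
                litStart = pos;
                haveSum = false;                 // recompute the checksum at the new position
            } else {
                if (pos + bs < fresh.length) sum = roll(sum, bs, fresh[pos], fresh[pos + bs]);
                pos++;
            }
        }
        if (litStart < fresh.length) {
            ops.add(new Op(-1, Arrays.copyOfRange(fresh, litStart, fresh.length)));
        }
        return ops;
    }

    /** Receiver side: rebuilds the new file from its old copy plus the delta. */
    static byte[] patch(byte[] old, List<Op> ops, int bs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : ops) {
            if (op.block >= 0) out.write(old, op.block * bs, bs);
            else out.write(op.literal, 0, op.literal.length);
        }
        return out.toByteArray();
    }
}
```

The cheap rolling checksum is what keeps the overhead down: the sender can test every byte offset without rehashing a whole block each time, and only computes the strong hash when the weak one hits. Because this uses only the standard library, it should behave the same on any operating system with a JVM. A real deployment would still need a wire format for the signatures and ops, streaming instead of whole-file byte arrays, and a tuned block size.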

Licensed under: CC-BY-SA with attribution