What can du see that rsync can't?
10-07-2019
Question
I want to copy an entire Linux server that is going to be decommissioned over the network, so we are sure nothing is lost.
I ran `du /` and was told there are 60 GB of data under /.
Then I ran `rsync -r / root@newserver:/old-server`, and when I ran `du` in the `old-server` dir I got 22 GB.
So why is there a difference? Is there something that `du` can see but `rsync` can't copy?
Solution
You probably have deleted files that can't yet be deallocated because there are open filehandles on them. (I didn't previously know that `du` would see the usage from those, but some testing showed that it does.) You can research this using `lsof`. The two main causes of this in my experience are deleting Apache logs without kicking the httpd, and deleting MySQL tables from the filesystem rather than by using DROP TABLE.
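As a rough illustration of the deleted-but-open case, here is a sketch that assumes Linux with /proc mounted; the temp file and descriptor number 3 are arbitrary choices for the demo. It deletes a file while a descriptor is still open and shows that the kernel still tracks it.

```shell
#!/bin/sh
# Demo: a deleted file stays allocated while a descriptor is open on it.
tmpfile=$(mktemp)
exec 3>"$tmpfile"          # hold the file open on descriptor 3
echo "still here" >&3
rm "$tmpfile"              # unlink it; the data is not freed yet

# readlink inherits descriptor 3, so /proc/self/fd/3 names the same file.
# On Linux the link target of an unlinked file ends in "(deleted)".
deleted_target=$(readlink /proc/self/fd/3)
echo "$deleted_target"

exec 3>&-                  # closing the last descriptor frees the space
```

On a real server, `lsof +L1` run as root should list every open file whose link count has dropped below one, which is the same condition found across all processes.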
OTHER TIPS
If you've got a bit of time on your hands, you could figure out exactly what the difference is: run `cd /; find . > /tmp/old` on the old server, `cd /old-server; find . > /tmp/new` on the new server, then `vimdiff` the two files to see what's changed.
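A minimal sketch of that comparison follows. The helper name `list_tree` is made up for illustration, and sorting is an addition here: `find` does not guarantee any output order, so unsorted lists can show differences that are only ordering noise.

```shell
#!/bin/sh
# Print every path under directory $1, relative to it, in a stable order.
list_tree() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# Hypothetical usage, matching the paths in the tip above:
#   list_tree /           > /tmp/old    # on the old server
#   list_tree /old-server > /tmp/new    # on the new server
# After copying both lists to one machine:
#   LC_ALL=C comm -23 /tmp/old /tmp/new   # paths present only on the old server
```

`comm -23` prints only the lines unique to the first file, which is often quicker to read than a full side-by-side diff.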
There are some special filesystems which you should avoid copying with `rsync`, for example `/proc`, `/sys` and `/tmp`. They may account for part of the difference you see, although the difference seems too big to be explained by them alone.
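One way to check how much of the 60 GB comes from other mounts is to measure the root filesystem alone. This is a sketch, not the answerer's method: `usage_kib` is a hypothetical helper, and it relies on the `-x` option of `du`, which skips directories on other filesystems.

```shell
#!/bin/sh
# Report the size of directory $1 in KiB, staying on one filesystem (-x)
# so that separate mounts such as /proc and /sys are not counted.
usage_kib() {
    du -xsk "$1" | cut -f1
}

# Hypothetical usage:
#   usage_kib /             # on the old server
#   usage_kib /old-server   # on the new server
```

If the two numbers are still far apart, the special filesystems are probably not the main cause.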
There could be some unreadable directories (for example, without `r` or `x` on them). I don't remember whether a process running with `root` rights can access such directories without fixing permissions first.
Better to generate and compare lists of files and their MD5 sums.
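A possible way to build such a list, assuming GNU `md5sum` is installed; `checksum_tree` is a name invented for this sketch, and `-xdev` is added so that `find` stays on one filesystem.

```shell
#!/bin/sh
# Print "checksum  path" for every regular file under $1, sorted by path.
checksum_tree() {
    (cd "$1" && find . -xdev -type f -exec md5sum {} + | LC_ALL=C sort -k 2)
}

# Hypothetical usage:
#   checksum_tree /           > /tmp/old.md5   # on the old server, as root
#   checksum_tree /old-server > /tmp/new.md5   # on the new server
#   diff /tmp/old.md5 /tmp/new.md5             # after copying both to one machine
```

Because paths are relative to the starting directory, matching files produce identical lines on both machines, so any `diff` output points at a missing or altered file.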
Some suggestions:
- sparse files (use `-S`)
- hard links (use `-H`)
- `/proc` and `/sys` (use `--exclude` or, better, `-x` and back up each filesystem separately)
I tend to use `rsync -axHSW --numeric-ids` in similar circumstances.
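For reference, here is that command wrapped in a small function with each flag annotated. The wrapper name `mirror_fs` is hypothetical; the flags are the ones given above.

```shell
#!/bin/sh
# Copy tree $1 to destination $2 with the flags suggested above:
#   -a  archive mode (recursive; keeps permissions, owners, times, symlinks, devices)
#   -x  do not cross filesystem boundaries
#   -H  preserve hard links
#   -S  handle sparse files efficiently
#   -W  copy whole files, skipping the delta-transfer algorithm
#   --numeric-ids  keep numeric uid/gid values instead of mapping by name
mirror_fs() {
    rsync -axHSW --numeric-ids "$1" "$2"
}

# Hypothetical usage with the host from the question:
#   mirror_fs / root@newserver:/old-server
```

Since `-x` stops at mount points, a server with separate `/home` or `/var` filesystems would need one run per filesystem.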