Is there a method of quickly determining whether a chunk read from a (sparse) file is all zeros?
26-10-2019
Question
Is there a method of quickly determining whether a (4KB-16MB) chunk read from a file is all zeros?
You can iterate over the chunk, checking each byte. There are obvious optimisations, but it remains O(N).
My use case is for sparse files. I would be perfectly happy for a partial solution, such that if the chunk I've just read is not backed by any disk storage (i.e. it is a hole) then return true.
Any hints?
Solution
This depends on the operating system and sometimes the filesystem. Linux since 2.6.28 has implemented the FIEMAP ioctl(), and ZFS on Solaris implements SEEK_HOLE and SEEK_DATA in lseek().
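As a rough sketch of the lseek() approach: Linux has also supported SEEK_DATA and SEEK_HOLE since kernel 3.1, so the same code can work there. The helper name range_is_hole is made up for illustration, and the sketch assumes glibc (hence _GNU_SOURCE). On a filesystem that does not track holes, the kernel reports the whole file as data, so the function returns 0 and you fall back to scanning the bytes.

```c
#define _GNU_SOURCE            /* glibc only exposes SEEK_DATA with this defined */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reports whether the byte range [off, off + len)
 * of fd lies entirely inside a hole.
 * Returns 1 if no data extent overlaps the range, 0 if one does (or the
 * filesystem cannot tell), and -1 on error with errno set by lseek().
 */
static int range_is_hole(int fd, off_t off, off_t len)
{
    assert(off >= 0 && len > 0);

    /* Find the first offset >= off that the filesystem reports as data. */
    off_t data = lseek(fd, off, SEEK_DATA);
    if (data == (off_t)-1) {
        /* ENXIO: no data at or after off (trailing hole, or off is past EOF). */
        return errno == ENXIO ? 1 : -1;
    }

    /* The range is a hole only if the next data starts at or beyond its end. */
    return data >= off + len ? 1 : 0;
}
```

Note that lseek() moves the file offset, so this pairs best with pread(), which takes an explicit offset. A result of 0 only means the range is allocated; the data may still be all zeros, so this is the partial solution asked for, not a full zero test.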
Other tips
My first thought was, "How does rsync do it?"
It turns out that rsync simply checks the data for blocks of zeroes, and writes them as sparse files. See fileio.c in the rsync source code if you want the gory details.
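For comparison, a plain scan of a chunk can be written quite compactly. This is not rsync's code, just a minimal sketch with a made-up function name (chunk_is_zero). It is still O(N), but it delegates the loop to memcmp(), which the C library usually implements with wide or vectorised compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if all n bytes at buf are zero.
 * An empty chunk counts as all zeros.
 */
static bool chunk_is_zero(const unsigned char *buf, size_t n)
{
    assert(buf != NULL || n == 0);

    if (n == 0)
        return true;

    /*
     * If the first byte is zero and every byte equals the one after it,
     * then every byte is zero. memcmp() only reads, so comparing the
     * buffer against itself shifted by one byte is safe.
     */
    return buf[0] == 0 && memcmp(buf, buf + 1, n - 1) == 0;
}
```

The scan stops at the first mismatch, so chunks that contain data early on are rejected quickly; only genuinely zero chunks cost a full pass.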