Is there a method of quickly determining whether a chunk read from a (sparse) file is all zeros?

StackOverflow https://stackoverflow.com/questions/7818439

  •  26-10-2019

Question

Is there a method of quickly determining whether a (4KB-16MB) chunk read from a file is all zeros?

You can iterate over the chunk, checking each byte. There are obvious optimisations, but it remains O(N).
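For reference, a minimal sketch of that straightforward scan, assuming Python and a chunk that has already been read into memory. The name `is_all_zeros` and the 64 KiB comparison size are illustrative choices, not anything from a library. It compares against a preallocated zero block so the work happens in C, and it stops at the first mismatch, but it is still O(N) in the worst case.

```python
# Preallocated block of zero bytes used as the comparison reference.
_ZERO_VIEW = memoryview(bytes(64 * 1024))


def is_all_zeros(chunk: bytes) -> bool:
    """Return True if every byte in chunk is zero (an empty chunk counts as True)."""
    view = memoryview(chunk)
    step = len(_ZERO_VIEW)
    for start in range(0, len(view), step):
        piece = view[start:start + step]
        # Compare one slice at a time so a non-zero byte near the start
        # ends the scan early.
        if piece != _ZERO_VIEW[:len(piece)]:
            return False
    return True
```

Usage would be something like `is_all_zeros(f.read(1 << 20))` on a file opened in binary mode.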

My use case is sparse files. I would be perfectly happy with a partial solution: one that returns true if the chunk I've just read is not backed by any disk storage (i.e. it is a hole).

Any hints?

Solution

This depends on the operating system and sometimes on the filesystem. Linux has implemented the FIEMAP ioctl() since 2.6.28, and ZFS on Solaris implements SEEK_HOLE and SEEK_DATA in lseek(). Linux has also supported SEEK_HOLE and SEEK_DATA since 3.1.
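As a rough illustration of the lseek() route, here is a sketch in Python, assuming a platform where `os.SEEK_DATA` is available (for example a reasonably recent Linux). The function name `chunk_is_hole` is made up for this example. It gives only the partial answer asked for: `True` means the range has no data extent, while `False` means "might contain data", which is also what you get on filesystems that do not track holes and report the whole file as data.

```python
import errno
import os


def chunk_is_hole(fd: int, offset: int, length: int) -> bool:
    """Return True only if [offset, offset + length) contains no data extent.

    Note: an offset at or past end-of-file also yields True, so callers
    should only ask about ranges inside the file.
    """
    saved = os.lseek(fd, 0, os.SEEK_CUR)  # lseek moves the offset; remember it
    try:
        # Position of the first data at or after `offset`.
        next_data = os.lseek(fd, offset, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            # No data at or after `offset`: the rest of the file is a hole.
            return True
        raise
    finally:
        os.lseek(fd, saved, os.SEEK_SET)  # restore the caller's position
    return next_data >= offset + length
```

If this returns `False`, you still have to fall back to scanning the bytes, since a chunk can be backed by storage and contain only zeros.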

OTHER TIPS

My first thought was, "How does rsync do it?"

It turns out that rsync simply scans the data for runs of zero bytes and seeks past them instead of writing them, so the output file ends up sparse. See fileio.c in the rsync source code if you want the gory details.
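The general idea can be sketched in a few lines of Python. This is not rsync's actual code, just an illustration of the scan-and-seek approach; `copy_sparse` and its `block_size` parameter are hypothetical names, and the destination is assumed to be a freshly created, seekable file opened in binary mode.

```python
import os


def copy_sparse(src, dst, block_size=64 * 1024):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zero_block = bytes(block_size)
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == zero_block[:len(block)]:
            # Skip the zeros; the filesystem may leave a hole here.
            dst.seek(len(block), os.SEEK_CUR)
        else:
            dst.write(block)
    # If the file ends in skipped zeros, nothing was written there, so
    # set the length explicitly to the current position.
    dst.truncate(dst.tell())
```

Whether the skipped regions actually become holes on disk depends on the filesystem; the file contents read back the same either way.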

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow