I'm doing some research into NTFS on Windows and I'm a little confused about how I should handle NTFS sparse files. I'm currently looking at $UsnJrnl, the file that holds the NTFS change journal (USN journal).

It is my understanding that a sparse file is just like any other file within the file system, except that it contains large sections of zeros; rather than writing those zeros to the disk and essentially wasting space, only a count of the number of zero-filled clusters is stored.

As an example, the data runs for the $UsnJrnl on my test system (obtained using WinHex) are:

Cluster start: 0
Number of clusters: 1408
(Sparse)

Cluster start: 510119
Number of clusters: 128

Cluster start: 256
Number of clusters: 2448

This means that the $UsnJrnl file is occupying a total of 3984 clusters on the disk; however, 1408 of those are sparse, so they aren't actually present on the disk.

So does this mean that the 1408 zero-filled clusters are immediately before the 128 clusters starting at 510119?

Essentially, what I'm trying to do is determine the exact start and end offset of the file on the disk, e.g. it runs from cluster x to cluster 512822. However, I'm not sure whether the sparse clusters are actually allocated directly before the second cluster run, making it one contiguous block, or whether they could be allocated anywhere.

I hope that makes sense, and any information or advice would be greatly appreciated!


Solution

No, it means that $UsnJrnl occupies 2576 clusters on disk. Sparse clusters don't occupy any space on disk. If you try to read a sparse cluster, e.g. cluster 10 of the file in your example, NTFS just returns zeros.
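
The arithmetic behind that figure can be checked with a small sketch. It assumes the three runs from the question are written as (start cluster, cluster count) pairs, with `None` standing in for the sparse run that WinHex displays as cluster start 0; the names `RUNS` and `cluster_totals` are made up for this illustration.

```python
# Runs from the question as (start_cluster, cluster_count).
# Assumption: None marks the sparse run (shown by WinHex as "Cluster start: 0").
RUNS = [(None, 1408), (510119, 128), (256, 2448)]

def cluster_totals(runs):
    """Return (logical_clusters, allocated_clusters) for a list of runs."""
    # Every run, sparse or not, contributes to the file's logical length.
    logical = sum(count for _, count in runs)
    # Only runs with a real start cluster take up space on disk.
    allocated = sum(count for start, count in runs if start is not None)
    return logical, allocated

logical, allocated = cluster_totals(RUNS)
print(logical, allocated)  # 3984 2576
```

So the file is 3984 clusters long logically, but only 128 + 2448 = 2576 of them are backed by disk clusters.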

In general, you can't determine a single start and end cluster for a file, because files can be fragmented. Your example says that the first 1408 clusters of the file are not allocated on disk at all, the next 128 clusters of the file occupy disk clusters 510119 - 510246, and the final 2448 clusters of the file occupy disk clusters 256 - 2703. In this case you can't say that the file begins at cluster X (on disk) and ends at cluster Y (on disk); that is possible only if the file is not fragmented (when it uses a single cluster run).
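
To make the mapping concrete, here is a rough sketch that translates a cluster number within the file into a cluster number on disk, using the runs from the question. It is only an illustration under the same assumption as before (`None` marks the sparse run), and `vcn_to_lcn` and `disk_extents` are hypothetical helper names, not part of any NTFS API. Ranges are inclusive, so the last disk cluster of a run is start + count - 1.

```python
# Runs from the question as (start_cluster, cluster_count).
# Assumption: None marks the sparse run, which has no location on disk.
RUNS = [(None, 1408), (510119, 128), (256, 2448)]

def vcn_to_lcn(runs, vcn):
    """Map a cluster number within the file to a disk cluster (None if sparse)."""
    if vcn < 0:
        raise ValueError("cluster number must not be negative")
    first_vcn = 0  # file-relative cluster number where the current run begins
    for start, count in runs:
        if vcn < first_vcn + count:
            # Sparse clusters have no disk location; reads just return zeros.
            return None if start is None else start + (vcn - first_vcn)
        first_vcn += count
    raise ValueError("cluster is past the end of the file")

def disk_extents(runs):
    """List the inclusive (first, last) disk cluster range of each allocated run."""
    return [(start, start + count - 1) for start, count in runs if start is not None]

print(vcn_to_lcn(RUNS, 10))    # None (inside the sparse run)
print(vcn_to_lcn(RUNS, 1408))  # 510119 (first allocated cluster of the file)
print(vcn_to_lcn(RUNS, 1536))  # 256 (first cluster of the third run)
print(disk_extents(RUNS))      # [(510119, 510246), (256, 2703)]
```

The two extents are far apart and out of order on disk, which is why there is no single "start" and "end" cluster for this file.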

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow