Question

I'd like to know how to compare two files to determine whether they are exactly the same. I know how to compare the filename, the creation/modification dates, and even a hash if required.

However, I don't know how to compare the metadata of a file (I actually don't know how it is stored): security configuration, compatibility settings, a possible antivirus timestamp, and so on.

My final goal is to deep-compare two file systems on separate computers.

Thanks, Steve

[edit] To clarify, I have reworded the title of the question.

Solution

What constitutes a file? On a modern filesystem (say, NTFS) you have:

  • file attributes (timestamps, FAT-style attribute flags)
  • the unnamed (default) data stream, i.e. the file contents
  • zero or more alternate data streams (ADS)
  • extended attributes (EAs)
  • the NTFS security descriptor (owner and ACLs; NTFS keeps it apart from the file's own data streams, so it is read and compared separately)

The rest (configuration, an antivirus timestamp, whatever that refers to, and so on) is stored outside the file and is not part of it.

So you need to read each of the parts listed above and compare them.

Each part is read through a different API, so you have to use all of them to collect the complete picture of each file before comparing the two.

OTHER TIPS

Just work through all the getters on System.IO.File.

GetAccessControl
GetAttributes
GetCreationTime
...
ReadAllBytes

If there's anything else that your definition of "same file" depends on (like the absolute path if on different machines), then get that as well, but you haven't made clear what that is.
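As a rough illustration of walking through those getters, here is a minimal sketch that compares a few of the values for two paths. The class name FileMetadataDiff and the method Compare are made up for this example, and it assumes both files are reachable from the same process (for instance through a network share). It deliberately leaves out access control, alternate data streams, and extended attributes, which need their own APIs. Creation times often change when a file is copied, so you may not want to treat that difference as significant.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: the names are illustrative, not part of any library.
public static class FileMetadataDiff
{
    // Returns one line per differing field; an empty list means the
    // checked fields (and only those) are equal.
    public static List<string> Compare(string pathA, string pathB)
    {
        var differences = new List<string>();
        var a = new FileInfo(pathA);
        var b = new FileInfo(pathB);

        if (a.Length != b.Length)
            differences.Add($"Length: {a.Length} vs {b.Length}");

        if (a.Attributes != b.Attributes)
            differences.Add($"Attributes: {a.Attributes} vs {b.Attributes}");

        // Use UTC so machines in different time zones compare consistently.
        if (a.CreationTimeUtc != b.CreationTimeUtc)
            differences.Add($"CreationTimeUtc: {a.CreationTimeUtc:o} vs {b.CreationTimeUtc:o}");

        if (a.LastWriteTimeUtc != b.LastWriteTimeUtc)
            differences.Add($"LastWriteTimeUtc: {a.LastWriteTimeUtc:o} vs {b.LastWriteTimeUtc:o}");

        return differences;
    }
}
```

Returning a list of differences rather than a single bool makes it easier to decide afterwards which mismatches matter for your definition of "same file".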

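To compare the contents, compute a hash (MD5 or a SHA variant) of both files and check whether the two digests are equal. Note that this only covers the file data, not its metadata.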

Check the MD5CryptoServiceProvider and SHA512CryptoServiceProvider in System.Security.Cryptography.

It's something like this:

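// Requires: using System; using System.IO; using System.Security.Cryptography;

private string ComputeHashAsText(byte[] fileBytes)
{
    using (SHA512CryptoServiceProvider cryptoService = new SHA512CryptoServiceProvider())
    {
        // Base64 keeps every byte of the digest. Decoding the raw hash as ASCII
        // would turn all bytes above 127 into '?' and could make different
        // hashes compare as equal.
        return Convert.ToBase64String(cryptoService.ComputeHash(fileBytes));
    }
}

public bool CompareFiles(string pathA, string pathB)
{
    // Reads each file fully into memory, so this is only suitable for small files.
    string hashPathA = ComputeHashAsText(File.ReadAllBytes(pathA));
    string hashPathB = ComputeHashAsText(File.ReadAllBytes(pathB));

    return hashPathA == hashPathB;
}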

In a real solution you may want to compute the hash in chunks, because the files to compare may be too large to read into memory in one go.
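One way to do that, sketched under a few assumptions: HashAlgorithm.ComputeHash has an overload that takes a Stream and reads it in small blocks, so the whole file never has to sit in memory. The class name StreamedFileHash and its method names are invented for this example, and SHA512.Create() is used here in place of SHA512CryptoServiceProvider.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper: the names are illustrative, not part of any library.
public static class StreamedFileHash
{
    // Hashes the file through a stream, so memory use stays small
    // regardless of file size.
    public static string ComputeHashHex(string path)
    {
        using (SHA512 sha = SHA512.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            byte[] hash = sha.ComputeHash(stream);
            // Hex text is easy to log or to send to another machine.
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    public static bool SameContent(string pathA, string pathB)
    {
        // Cheap early exit: files of different length cannot be identical.
        if (new FileInfo(pathA).Length != new FileInfo(pathB).Length)
            return false;

        return ComputeHashHex(pathA) == ComputeHashHex(pathB);
    }
}
```

For two separate computers, you would probably run ComputeHashHex on each machine and exchange only the resulting strings instead of copying the files.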

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow