Command line utility or C# code to compute 200,000 file changes in a directory hierarchy quickly?

StackOverflow https://stackoverflow.com/questions/2142975

  •  22-09-2019

Question

Using Microsoft FCIV, which computes SHA-1 file checksums, I created a text file of checksums and file names:

"8697c58c606122c30e2a20f1eabd6919" "g:\00258\99481\99481.eps"
"b77a6b392c002bb9cc51f48170487dea" "g:\00258\99481\99481.eps"

My intent is to create a JPEG thumbnail for any image that changes. However, this utility takes hours to generate the list. I wanted to use SHA-1 because the Git folks find it useful (1 in 2^52 chance of collision, 5 commas). MD5 produces several collisions with that sample size. I want to use the SHA-1 as a unique identifier too.

I need to quickly identify file changes and regenerate thumbnails only for changed files. I would like to get these values into SQL. Any suggestions? (For that matter, I also need to read the image loading keywords into SQL.) Timestamps are difficult because twice a year, when the clocks change, the file creation and modification times reported on Windows shift by an hour.


Solution

Why not check the file modification time as a first step, and compute the hash only if it differs? That way you won't be doing the (expensive) hash for every file.

You could also look at the file size as an additional check.

You could also regenerate all the hashes twice a year when the clocks change.
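
A minimal sketch of this approach, written in Python for brevity (the same structure should carry over to C#). The names `find_changed` and `sha1_of`, and the in-memory `manifest` dictionary standing in for a SQL table, are assumptions made for illustration and are not part of the original answer. A file is hashed only when its size or modification time differs from the stored values, and `force_rehash=True` covers the twice-a-year full pass.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed(root, manifest, force_rehash=False):
    """Return paths under root whose content changed; update manifest in place.

    manifest maps path -> (size, mtime_ns, sha1). It is a hypothetical
    stand-in for whatever table you keep in SQL.
    """
    changed = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            old = manifest.get(path)
            # Cheap check first: identical size and mtime means no hashing.
            if (not force_rehash and old is not None
                    and old[0] == info.st_size
                    and old[1] == info.st_mtime_ns):
                continue
            digest = sha1_of(path)
            # Only a differing hash counts as a real change, so a shifted
            # timestamp alone does not trigger a new thumbnail.
            if old is None or old[2] != digest:
                changed.append(path)
            manifest[path] = (info.st_size, info.st_mtime_ns, digest)
    return changed
```

With this structure, a clock change costs one full re-hash at worst, but no thumbnails are regenerated unless the content hash actually differs.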

Licensed under: CC-BY-SA with attribution