I was planning to compare the MD5 hashes of ZIP files to decide whether I needed to upload an OSGi bundle JAR during deployment. I assumed that if the files inside two bundle JARs were the same, the bundle JARs themselves would be the same. I found that the timestamps you described were the only in-file differences between builds. After using a shell script to strip those timestamps, which made all the contained files identical, the bundle JARs still differed because of the per-file timestamps recorded in the archive.
I ended up comparing the unzip -lv output for the two bundle JARs to determine equality:
# Hash the unzip -lv listing with the Date and Time columns (fields 5 and 6) removed.
lhash=$(unzip -lv "$HOME/staging/$bundle" | sed -ne '/---/,/---/p' | sed -e '1d;$d;' | awk '{L="";for(i=1;i<NF;i++){if(i<5 || i>6){L = L " " $i}}print L}' | md5)
rhash=$(ssh -i "$HOME/.ssh/keys/keyfile.pem" "user@$host" "unzip -lv ~ubuntu/bundles/$bundle | sed -ne '/---/,/---/p' | sed -e '1d;\$d;' | awk '{L=\"\";for(i=1;i<NF;i++){if(i<5 || i>6){L = L \" \" \$i}}print L}' | md5sum | awk '{print \$1}'")
if [ "$lhash" = "$rhash" ]
then
    different=f
else
    different=t
fi
My local machine is a Mac and the remote machine is running Ubuntu, hence md5 on one side and md5sum on the other. The awk step removes the timestamps (the Date and Time columns) from the unzip -lv output. After the code finishes, if different is t, the files are different; otherwise, the files are the same.
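The same idea can be wrapped in small functions so the local and remote sides share one code path. This is only a sketch: the function names strip_listing_timestamps, md5_stdin and listing_hash are hypothetical, and it assumes the eight-column unzip -lv layout (Length, Method, Size, Cmpr, Date, Time, CRC-32, Name). Unlike the loop above, it also keeps the last field, so the entry name takes part in the hash.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the original script.

# Read `unzip -lv` output on stdin and print each entry line without
# its Date and Time columns (fields 5 and 6).
strip_listing_timestamps() {
    sed -ne '/---/,/---/p' | sed -e '1d;$d' |
        awk '{
            L = ""
            for (i = 1; i <= NF; i++) {
                if (i < 5 || i > 6) { L = L " " $i }
            }
            print L
        }'
}

# Hash stdin with md5sum (Linux) or md5 (macOS), printing only the digest.
md5_stdin() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum | awk '{print $1}'
    else
        md5
    fi
}

# Print a timestamp-independent hash for the archive named by $1.
listing_hash() {
    unzip -lv "$1" | strip_listing_timestamps | md5_stdin
}
```

With these defined, two local archives could be compared with something like [ "$(listing_hash a.jar)" = "$(listing_hash b.jar)" ], where a.jar and b.jar are placeholder names. Entry names that contain spaces or the text --- would need extra care.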