Computing file HASH returns different values
Question
Does anybody know why the following code returns different results on some machines?
    Imports System.Security.Cryptography
    Imports System.Text

    Private Shared Function ComputeHashValue(ByVal Data As String) As String
        Dim HashAlgorithm As SHA512 = SHA512.Create
        Dim HashValue() As Byte = HashAlgorithm.ComputeHash(Encoding.ASCII.GetBytes(Data))
        ' Looping over the array and ANDing each byte with 01111111
        For i As Integer = 0 To HashValue.Length - 1
            HashValue(i) = HashValue(i) And Convert.ToByte(127)
        Next
        Return Encoding.ASCII.GetString(HashValue)
    End Function

    Private Shared Function AreByteArraysEqual(ByVal array1 As Byte(), ByVal array2 As Byte()) As Boolean
        If array1.Length <> array2.Length Then Return False
        For i As Integer = 0 To array1.Length - 1
            If array1(i) <> array2(i) Then Return False
        Next
        Return True
    End Function

    Private Shared Sub SomeMethod()
        Dim t_prvBytes() As Byte = New Byte() {SOME VALUES} 'Previously computed HASH
        Dim t_dllStream As New IO.FileStream("C:\myfile.txt", IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read)
        Dim t_reader As New IO.StreamReader(t_dllStream)
        Dim t_dllHash() As Byte = System.Text.Encoding.Unicode.GetBytes(ComputeHashValue(t_reader.ReadToEnd))
        MsgBox(AreByteArraysEqual(t_dllHash, t_prvBytes))
        t_dllStream.Close()
    End Sub
Solution
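You shouldn't be converting the hash into text via Encoding.ASCII. It's not ASCII text. (It's not text at all.) You're also hashing the result of ASCII-encoding the original text, which the StreamReader has already decoded from the file's bytes (as UTF-8 by default, since no encoding is specified), and then turning the hash string back into bytes with Encoding.Unicode. Why?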
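To illustrate why a hash is not ASCII text, here is a minimal sketch of what happens to arbitrary bytes on a round trip through Encoding.ASCII. The module and function names are hypothetical, made up for this example. Bytes above 127 have no ASCII meaning, so the decoder substitutes a question mark and the original value is lost.

```vbnet
Imports System.Text

Module AsciiRoundTripSketch
    ' Hypothetical helper: decodes bytes as ASCII text, then encodes that text again.
    ' Any byte above 127 is replaced by "?" (byte value 63) during decoding.
    Function AsciiRoundTrip(ByVal data As Byte()) As Byte()
        Dim asText As String = Encoding.ASCII.GetString(data)
        Return Encoding.ASCII.GetBytes(asText)
    End Function
End Module
```

Calling AsciiRoundTrip(New Byte() {65, 200, 255}) gives back {65, 63, 63}. Masking each byte with 127, as the question does, avoids the substitution but discards one bit of every hash byte instead.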
You're doing all kinds of conversions between text and binary forms - and you probably shouldn't be doing any. Just hash the binary data (using HashAlgorithm.ComputeHash(Stream)), and keep the result in binary too. If you really need to convert the binary data into text, use Convert.ToBase64String.
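The advice above could be sketched roughly as follows. This is only an illustration under assumptions: the module name, function names and the filePath parameter are hypothetical, and SHA-512 is kept simply because the question uses it.

```vbnet
Imports System.IO
Imports System.Security.Cryptography

Module FileHashSketch
    ' Hashes the file's raw bytes; no text decoding or encoding is involved.
    Function ComputeFileHash(ByVal filePath As String) As Byte()
        Using sha As SHA512 = SHA512.Create()
            Using stream As New FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read)
                Return sha.ComputeHash(stream)
            End Using
        End Using
    End Function

    ' Text form, only for when the hash has to be stored or displayed as a string.
    Function ComputeFileHashText(ByVal filePath As String) As String
        Return Convert.ToBase64String(ComputeFileHash(filePath))
    End Function
End Module
```

The byte array returned by ComputeFileHash (64 bytes for SHA-512) can be passed straight to AreByteArraysEqual from the question, provided the stored value was produced the same way.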
In addition, you're comparing the data with a previously computed value - but you haven't explained where that previously computed value came from to start with.