Question

After reading many related posts on this site about image comparison, I'm thinking of running a PCA on each image to see whether two images are 'similar' or not, but I'm not sure how to get the data out of my images. Is there a VB function I can use to convert an image into an array of bytes (or something similar) so that I can compare images? Or is there a simpler way to compare two images? They should be black and white, but they'll be scanned, very small images.

Thanks very much, Becky


Solution

Also, here is a useful article: the author took two images, compared them, then created a third image that graphically represents the difference between the two. It appears to be a nice visual way to depict similarity.
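The article's own code isn't reproduced here, but the idea of a difference image can be sketched roughly as follows. This is only a minimal illustration, assuming both images have the same dimensions and are small enough that the slow Bitmap.GetPixel / SetPixel calls don't matter; the module name ImageDiffSketch and the function name BuildDifferenceImage are made up for this example.

```vbnet
Imports System
Imports System.Drawing

Public Module ImageDiffSketch

    ' Builds a grayscale image whose brightness at each pixel is the
    ' absolute difference between the two inputs (black = identical).
    ' Assumption: both bitmaps have the same dimensions.
    Public Function BuildDifferenceImage(ByVal first As Bitmap, ByVal second As Bitmap) As Bitmap
        If first.Size <> second.Size Then
            Throw New ArgumentException("Images must have the same dimensions.")
        End If

        Dim result As New Bitmap(first.Width, first.Height)

        For y As Integer = 0 To first.Height - 1
            For x As Integer = 0 To first.Width - 1
                Dim a As Color = first.GetPixel(x, y)
                Dim b As Color = second.GetPixel(x, y)

                ' Average the per-channel differences into one gray level (0-255).
                Dim diff As Integer = (Math.Abs(CInt(a.R) - CInt(b.R)) +
                                       Math.Abs(CInt(a.G) - CInt(b.G)) +
                                       Math.Abs(CInt(a.B) - CInt(b.B))) \ 3

                result.SetPixel(x, y, Color.FromArgb(diff, diff, diff))
            Next
        Next

        Return result
    End Function

End Module
```

A completely black result means the two images are pixel-for-pixel identical; for scanned images you would expect some speckle, so the output is better suited to eyeballing differences than to a strict yes/no test.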

OTHER TIPS

Checking whether they're identical is quite easy using roygbiv's answer. Measuring how similar they are is much more complicated. If these are scanned documents, they're never really going to be identical. It may be worthwhile to invest in a third-party option. We use products from Accusoft and TiS in our scanning process.

That said, there are a couple of potential duplicate questions.

You can use something like this:

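Imports System.IO

' MyClass is a reserved keyword in VB, so the class needs a different name.
Public Class ImageLoader
    Shared Sub Main(ByVal args() As String)
        ' Read the entire image file into a byte array.
        Dim mydata As Byte() = File.ReadAllBytes("C:\MyFile.jpg")
    End Sub
End Class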

So, this is what I came up with. Rather than compare the pixels individually, I feed the contents of each file to a hashing algorithm and then compare the bytes of the two resulting hashes. In my tests, it came back twice as fast as comparing the individual pixels for a gray-scale bitmap image of 1152 x 720 pixels and 101 KB. Note that this only tells you whether the two files are byte-for-byte identical, not how similar the images are.

Here's the code:

(editing because the first time I posted the code everything looked strange. removed comments.)

' Requires at the top of the file:
'   Imports System.IO, System.Linq, System.Security.Cryptography
Public Shared Function CompareTwoImageHashes(ByVal pathToFirstImage As String, ByVal pathToSecondImage As String) As Boolean

    ' A missing file is not a null argument, so report it as FileNotFoundException.
    If Not File.Exists(pathToFirstImage) Then
        Throw New FileNotFoundException("The file referenced by the path does not exist!", pathToFirstImage)
    End If

    If Not File.Exists(pathToSecondImage) Then
        Throw New FileNotFoundException("The file referenced by the path does not exist!", pathToSecondImage)
    End If

    Try

        ' Open both files read-only; Using disposes the streams and the hasher.
        Using firstImageStream As FileStream = File.OpenRead(pathToFirstImage)
            Using secondImageStream As FileStream = File.OpenRead(pathToSecondImage)
                Using hashingTool As SHA256 = SHA256.Create()

                    Dim imageOneHash As Byte() = hashingTool.ComputeHash(firstImageStream)
                    Dim imageTwoHash As Byte() = hashingTool.ComputeHash(secondImageStream)

                    ' Matching SHA-256 hashes mean the files have identical bytes (collisions aside).
                    Return imageOneHash.SequenceEqual(imageTwoHash)

                End Using
            End Using
        End Using

    Catch ex As Exception

        ' Any failure while reading or hashing is logged and treated as "no match".
        Console.WriteLine("Error during compare: {0}", ex.Message)
        Return False

    End Try

End Function

To retrieve the pixel data of an image, you can use either Bitmap.GetPixel, which reads one pixel at a time, or Bitmap.LockBits, which returns a BitmapData object (link has example code) giving access to the raw pixel buffer.
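Since the linked example isn't included here, the following is a rough sketch of the LockBits approach. It assumes the image can be locked as 24-bit RGB and that the stride is positive (top-down rows); the module name PixelDataSketch and the function name GetPixelBytes are invented for illustration.

```vbnet
Imports System
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.Runtime.InteropServices

Public Module PixelDataSketch

    ' Copies the raw pixel bytes of a bitmap into a managed array.
    ' The bitmap is locked as 24bpp RGB, so each pixel is 3 bytes (B, G, R),
    ' and each row is padded to a multiple of 4 bytes (see BitmapData.Stride).
    ' Assumption: Stride is positive, i.e. rows are stored top-down.
    Public Function GetPixelBytes(ByVal image As Bitmap) As Byte()
        Dim bounds As New Rectangle(0, 0, image.Width, image.Height)
        Dim data As BitmapData = image.LockBits(bounds, ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb)

        Try
            ' VB array bounds are inclusive, hence the "- 1".
            Dim pixels(data.Stride * data.Height - 1) As Byte
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length)
            Return pixels
        Finally
            ' Always unlock, even if the copy throws.
            image.UnlockBits(data)
        End Try
    End Function

End Module
```

The resulting array could then be compared byte by byte against the array from a second image of the same size, or used as the input vector for something like PCA. Remember that the row padding bytes are included in the array, so index pixels via the stride rather than via Width * 3.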

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow