Question

I have trouble understanding some pHash results.

Two images that are really similar have only 75% in common according to pHash:

first image & second image

These other two images that should have nothing in common have a 78% similarity:

first image & second image

Is there something that I could have done wrong? I am using Ruby libraries (Similie, pHash, Phashion), but they simply map the pHash functions. On some other examples the results were satisfactory.


Solution

I think the results can be explained:

For the first pair of images, the main "attack" on the image is a re-framing, which significantly alters the frequency data that the DCT hash is created from. This is a known weakness of the DCT hash approach, and is documented on the pHash website.
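
For reference, the Ruby gems listed in the question are thin wrappers around libpHash's DCT hash, so the percentage figures presumably come from the Hamming distance between two 64-bit hashes: 75% would correspond to roughly 16 differing bits out of 64. Below is a minimal sketch of that comparison against pHash's C++ API; the signatures are from pHash 0.9.x as I recall them, so check pHash.h in your installed version.

    // Sketch: DCT perceptual hash comparison with libpHash (assumed 0.9.x API).
    // Typical build: g++ dct_compare.cpp -o dct_compare -lpHash
    #include <cstdio>
    #include <pHash.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            std::fprintf(stderr, "usage: %s image1 image2\n", argv[0]);
            return 1;
        }

        ulong64 hash1 = 0, hash2 = 0;   // 64-bit DCT hashes
        if (ph_dct_imagehash(argv[1], hash1) < 0 ||
            ph_dct_imagehash(argv[2], hash2) < 0) {
            std::fprintf(stderr, "hashing failed\n");
            return 1;
        }

        int dist = ph_hamming_distance(hash1, hash2);   // 0..64 differing bits
        // A "percent similarity" like the one in the question is presumably
        // just the share of matching bits: 16 differing bits -> 75%.
        double similarity = 100.0 * (64 - dist) / 64.0;

        std::printf("hamming distance: %d (~%.0f%% of bits in common)\n",
                    dist, similarity);
        // Rule of thumb: distances up to ~10 usually mean the same image;
        // Phashion's default duplicate threshold is 15 bits, if memory serves.
        return 0;
    }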

The "similarity" of the second pair of images is a probably result of the small file size, and the large block of a single color in one of the images. In my subjective experience, these types of files often lead to weird "similarities" popping up. (Images of brand names were problematic for me). Unfortunately, I can't really explain this unexpected behavior.

Using multiple hashing methods (such as the Mexican hat or radial variance hashes, sketched below) and larger source files, if available, can help reduce the "false match" rate.
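
To make that concrete, here is a rough sketch that checks all three hash types libpHash ships (DCT, Marr-Hildreth / Mexican hat, and radial variance) and only reports a match when they agree. The Ruby wrappers generally expose only the DCT hash, so this goes through the C++ API directly; the signatures are my recollection of pHash 0.9.x, and the thresholds are common starting points rather than documented values.

    // Sketch: combining pHash's DCT, Marr-Hildreth and radial variance hashes
    // (assumed pHash 0.9.x API; verify against your pHash.h).
    #include <cstdio>
    #include <cstdlib>
    #include <pHash.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            std::fprintf(stderr, "usage: %s image1 image2\n", argv[0]);
            return 1;
        }
        const char *a = argv[1], *b = argv[2];

        // 1. DCT hash: Hamming distance over 64 bits (small distance = similar).
        ulong64 d1 = 0, d2 = 0;
        ph_dct_imagehash(a, d1);
        ph_dct_imagehash(b, d2);
        int dct_dist = ph_hamming_distance(d1, d2);

        // 2. Marr-Hildreth ("Mexican hat") hash: byte array compared with a
        //    normalized Hamming distance in [0, 1] (small = similar).
        int len1 = 0, len2 = 0;
        uint8_t *mh1 = ph_mh_imagehash(a, len1);
        uint8_t *mh2 = ph_mh_imagehash(b, len2);
        if (!mh1 || !mh2) {
            std::fprintf(stderr, "mh hashing failed\n");
            return 1;
        }
        double mh_dist = ph_hammingdistance2(mh1, len1, mh2, len2);

        // 3. Radial variance hash: peak cross-correlation (close to 1.0 = similar).
        Digest dig1 = {}, dig2 = {};
        double pcc = 0.0;
        ph_image_digest(a, 1.0, 1.0, dig1);
        ph_image_digest(b, 1.0, 1.0, dig2);
        ph_crosscorr(dig1, dig2, pcc);

        std::printf("dct: %d bits  mh: %.3f  radial pcc: %.3f\n",
                    dct_dist, mh_dist, pcc);

        // Only call it a match when the methods agree; these cut-offs are
        // rules of thumb, not values from the pHash documentation.
        bool match = (dct_dist <= 10) && (mh_dist <= 0.4) && (pcc >= 0.85);
        std::printf("match: %s\n", match ? "yes" : "no");

        free(mh1);
        free(mh2);
        return 0;
    }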

Update:

I have since experimented with the newly released phash functionality in ImageMagick. It allows you to diff two images with the command-line call compare -metric phash image1 image2 diffimage.

Using this tool, the score for the first pair of (similar) images is 19.78, while the score for the obviously dissimilar pair is 258.58; lower scores indicate greater similarity. The suggested "match threshold" is 21. This pHash metric incorporates color information, unlike the DCT hash.
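
If you need the same metric from code rather than the command line, the MagickWand C API exposes it as well. A minimal sketch, assuming ImageMagick 7 headers (on 6.8.9+ the include is <wand/MagickWand.h>, with the same metric name):

    // Sketch: ImageMagick's perceptual hash metric via MagickWand (ImageMagick 7).
    // Typical build: g++ im_phash.cpp -o im_phash $(pkg-config --cflags --libs MagickWand)
    #include <cstdio>
    #include <MagickWand/MagickWand.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            std::fprintf(stderr, "usage: %s image1 image2\n", argv[0]);
            return 1;
        }

        MagickWandGenesis();
        MagickWand *img = NewMagickWand();
        MagickWand *ref = NewMagickWand();

        if (MagickReadImage(img, argv[1]) == MagickFalse ||
            MagickReadImage(ref, argv[2]) == MagickFalse) {
            std::fprintf(stderr, "could not read images\n");
            return 1;
        }

        // Lower distortion means more similar; ~21 is the match threshold
        // suggested for this metric above.
        double distortion = 0.0;
        MagickWand *diff = MagickCompareImages(img, ref, PerceptualHashErrorMetric,
                                               &distortion);
        std::printf("phash distortion: %.2f\n", distortion);

        if (diff != NULL)
            DestroyMagickWand(diff);
        DestroyMagickWand(img);
        DestroyMagickWand(ref);
        MagickWandTerminus();
        return 0;
    }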

More information about the new functionality is available here: http://www.fmwconcepts.com/misc_tests/perceptual_hash_test_results_510/index.html
