Question

I'm having problems understanding how input to the md5 function in MySQL 4.1.22 is handled. Basically I'm not able to recreate the md5sum of a specific value combination for comparison. I guess it has something to do with the format of the input data.

I have set up a table with two columns of type double(direction and elevation) + a third for storing a md5 sum.

With a setup script i add data to the direction and elevation fields + create a checksum using the following syntax:

insert into polygons (
  direction, 
  elevation, 
  md5sum
) 
values ( 
  (select radians(20.0)), 
  (select radians(30.0)), 
  ( md5( (select radians(20.0)) + (select radians(20.0)) ) )
)

which ends up as: 0.349065850398866, 0.523598775598299, '0c2cd2c2a9fe40305c4e3bd991812df5'

Later I compare the stored md5sum with a newly calculated, the newly calculated is created using md5('0.349065850398866' + '0.523598775598299') and I get the following checksum: '8c958bcf912664d6d27c50f1034bdf34'

If I modify the last decimal in the passed "string" from a 9 to a 8 0.523598775598298 i get the same checksum as previously stored, '0c2cd2c2a9fe40305c4e3bd991812df5'. Any value of the last decimal from 8 down to 0 gives the same checksum.

Using BINARY, (md5( (select BINARY(radians(20.0))) + (select BINARY(radians(20.0))) ) in the setup script creates the same checksum as the my original "runtime calculation"

Worth mentioning is that the original method works for all other rows I have (55)

I guess I'm using the function in a somewhat strange way, but I'm not sure of a better way, so in order to find a better way I feel I need to understand why the current is failing.

Was it helpful?

Solution

The two numbers you are adding are stored in binary form, but displayed in decimal form. There is no guarantee that you will get exactly the same number back if you give the decimal form back to the machine.

In this case, this causes the addition to give a slightly different result, which gives an entirely different MD5 sum:

mysql> select radians(20.0) + radians(30.0), '0.349065850398866' + '0.523598775598299';
+-------------------------------+-------------------------------------------+
| radians(20.0) + radians(30.0) | '0.349065850398866' + '0.523598775598299' |
+-------------------------------+-------------------------------------------+
|              0.87266462599716 |                          0.87266462599717 | 
+-------------------------------+-------------------------------------------+
1 row in set (0.00 sec)

If you want to consistently get the same result, you need to store the results of radians(20.0) and radians(30.0) in variables somewhere, never relying on their printed representations.

OTHER TIPS

The output of radians(20.0) is computed with many more digits than are shown in the printed output. When the result is passed to the md5 function, the full non-truncated value is used, whereas the printed value only will show a limited number of digits. Thus, it is not the same value being passed into the md5 function in the two cases.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top