Question

I have a database with these binary strings:

record no 1: 1111111111111011000100110001100100010000000000000011000000000000
record no 2: 1111111111111111111111100001100000010000000000000011000000000000
record no 3: 1110000011110000111010001110111011110000111100001100000011000000
...

So, I want to find out which record has a binary string most similar to this one: 1111111111111011000100110001100100010000000000000011000000001100

As you can see, record number 1 is roughly 98% relevant, record number 2 is roughly 70% relevant, and record number 3 is only about 45% relevant.

This is a huge database (200,000 records)...


Solution

SELECT *
FROM MY_TABLE
ORDER BY BIT_COUNT(CAST(CONV(record,2,10) AS UNSIGNED INTEGER) ^ CAST(b'11...0' AS UNSIGNED INTEGER))
LIMIT 1;

The above query will return the most similar record.

You can also SELECT the BIT_COUNT itself. Its minimum of 0 means the two strings are identical (record = input), i.e. 100%. Its maximum of 64 means that all bits differ (record = ~input), i.e. 0%.
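
To make the arithmetic concrete, here is a rough Python sketch of the same idea: XOR the two values, count the set bits, and turn that count into a percentage. It is only an illustration, not part of the query. The function names and the WIDTH constant are made up for this example, and a fixed width of 64 bits is assumed because that is what the BIT_COUNT range above implies.

```python
WIDTH = 64  # assumption: every record is a 64-bit string, as in the question


def hamming_distance(a: str, b: str) -> int:
    """Count differing bits: XOR the two integers, then count the 1 bits."""
    return bin(int(a, 2) ^ int(b, 2)).count("1")


def similarity_percent(a: str, b: str, width: int = WIDTH) -> float:
    """Map a distance of 0..width onto 100%..0%."""
    return 100.0 * (width - hamming_distance(a, b)) / width


def most_similar(records, query):
    """Return the record with the smallest distance (first one wins on ties)."""
    return min(records, key=lambda r: hamming_distance(r, query))


# The three sample records and the search string from the question.
records = [
    "1111111111111011000100110001100100010000000000000011000000000000",
    "1111111111111111111111100001100000010000000000000011000000000000",
    "1110000011110000111010001110111011110000111100001100000011000000",
]
query = "1111111111111011000100110001100100010000000000000011000000001100"

for record in records:
    distance = hamming_distance(record, query)
    print(record, distance, round(similarity_percent(record, query), 1))
```

The percentage here is simply (64 - BIT_COUNT) / 64, so it will not necessarily match the relevance figures quoted in the question, which may have been computed differently.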

Licensed under: CC-BY-SA with attribution