Question

I'm wondering why all databases seem to store hashed passwords and similar hexadecimal data as char(x)/varchar(x).

MySQL uses latin1 (with the latin1_swedish_ci collation) as its default character set, where each character occupies 8 bits.

When storing hexadecimal data as strings, you're only using 4 of the 8 bits in each character, since there are only 16 possible characters (0-9 and a-f).

So what am I missing here? Is there a character encoding specifically for this kind of data that uses 4 bits per character? Or do some companies actually make use of all that reserved space when storing hexadecimal data, in whatever way?
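To illustrate the overhead I mean, here is a quick Python sketch: the hex representation of a hash takes twice as many bytes as the raw digest.

```python
import hashlib

digest = hashlib.md5(b"password").digest()  # 16 raw bytes (128 bits)
hexed = digest.hex()                        # 32 ASCII characters

print(len(digest))  # 16
print(len(hexed))   # 32
```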


Solution

Firstly, hexadecimal is not really a storage format... it is a display format. You can choose to store hexadecimal strings in a database, if you wish, but the more natural (and more common) technique is to store the actual numbers in a database using the database's native numeric integer format, and then format the retrieved numbers as hexadecimal for display purposes.
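For instance, a sketch in Python of that round trip: the value lives as a native integer, and hexadecimal only appears at display time.

```python
# A value stored as a native integer...
value = 0xDEADBEEF            # stored as the number 3735928559

# ...and formatted as hexadecimal only for display.
print(f"{value:x}")           # "deadbeef"

# Parsing the display form recovers the same number.
print(int("deadbeef", 16))    # 3735928559
```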

Hash functions return a set of bits. Traditionally, these bits are provided as hexadecimal because it is easier to read than zeroes and ones. In addition, many hashes contain a larger number of bits than native numeric types can support (MD5 is 128 bits, for example), and so hex is a more natural choice. Given an arbitrary number of hex digits, it's simpler and easier to just store the hex that the hash function provides.
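If you do want to reclaim that space, most databases let you store the raw digest bytes in a binary column instead of a hex string. A minimal sketch, using Python's sqlite3 as a stand-in for any SQL database (and MD5 purely as an example hash):

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, pw_hash BLOB)")

# Store the raw 16-byte digest, not a 32-character hex string.
digest = hashlib.md5(b"hunter2").digest()
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", digest))

(stored,) = conn.execute("SELECT pw_hash FROM users").fetchone()
print(len(stored))   # 16 bytes, half the size of the hex form
print(stored.hex())  # hex only when a human needs to read it
```

In MySQL terms, the equivalent is a BINARY(16) column, converting with UNHEX()/HEX() at the boundary.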

Further Reading
Why do we use hex output for hash functions?
Why do most hashing functions produce hashes that have characters a-f 0-9?

Licensed under: CC-BY-SA with attribution