Question

Here is my test:

...$ md5sum -b roy.html 
f9283ca2833ff7ebb6781ab8d23a21aa *roy.html
...$ md5sum -t roy.html 
f9283ca2833ff7ebb6781ab8d23a21aa  roy.html

Is there any difference between these two modes?

Solution

‘-b’ ‘--binary’

  • Treat each input file as binary, by reading it in binary mode and
    outputting a ‘*’ flag. This is the inverse of --text. On systems like
    GNU that do not distinguish between binary and text files, this
    option merely flags each input mode as binary: the MD5 checksum is
    unaffected. This option is the default on systems like MS-DOS that
    distinguish between binary and text files, except for reading
    standard input when standard input is a terminal.

‘-t’ ‘--text’

  • Treat each input file as text, by reading it in text mode and outputting a ‘ ’ flag. This is the inverse of --binary. This option is the default on systems like GNU that do not distinguish between binary and text files. On other systems, it is the default for reading standard input when standard input is a terminal. This mode is never defaulted to if --tag is used.
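The behaviour described above can be checked directly. The following is a minimal sketch, assuming GNU coreutils md5sum on a system that does not distinguish binary from text files; sample.txt is a throwaway file name chosen for illustration.

```shell
# Assumes GNU coreutils md5sum; sample.txt is a hypothetical scratch file.
printf 'hello\n' > sample.txt

# Keep only the digest (the first 32 characters of each output line).
bin_hash=$(md5sum -b sample.txt | cut -c1-32)
txt_hash=$(md5sum -t sample.txt | cut -c1-32)

# The 34th character is the mode flag: '*' for binary, ' ' for text.
bin_flag=$(md5sum -b sample.txt | cut -c34)

if [ "$bin_hash" = "$txt_hash" ]; then
  echo "digests match; only the flag differs (binary flag: $bin_flag)"
fi

rm -f sample.txt
```

On such a system the two digests should be identical, which matches the output shown in the question.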

OTHER TIPS

I am finding some interesting differences between binary mode and non-binary mode.

My use case is creating 256-bit AES keys for the AWS S3 object storage service. These keys are used for server-side encryption (SSE). I spent hours (almost days) trying to figure out why my code could not interact with S3, never suspecting my keys as the problem. Generating the key itself was not the problem: I was able to generate the binary key and its base64-encoded version quite easily.

The actual problem surprised me. I am no stranger to md5; I have used it for decades without fail. But it turned out that the md5 sum/hash I was generating from the binary key was wrong. My first clue was that it was a few characters longer than the one in a working example I was looking at. I could not produce an md5 sum as short as the example's, and I had no idea why there would be a difference.

I found that:

OSX (BSD): md5 has no concept of binary input mode. md5sum has a flag for binary input mode, but it does not change the hash itself; it only changes the metadata printed alongside the hash.

Alpine Linux: md5 has a -b flag that does change the printed hash (judging by the output below, it switches the digest from hexadecimal to base64 rather than selecting a binary input mode). md5sum has no concept of binary input mode.

Debian Linux: md5 does not seem to exist. md5sum has a flag for binary input mode, but it does not change the hash itself; it only changes the metadata printed alongside the hash.

For example, I get these outputs when running:

OSX:

openssl rand 32 > key
cat key | md5
936e87c3f08e54d036c7a38dc9dbd540
cat key | md5sum
936e87c3f08e54d036c7a38dc9dbd540  -
cat key | md5sum -b
936e87c3f08e54d036c7a38dc9dbd540 *-

Alpine Linux:

openssl rand 32 > key
cat key | md5
915b2c6c3368c19f96e9a79089389c15
cat key | md5 -b
kVssbDNowZ+W6aeQiTicFQ==
cat key | md5sum
915b2c6c3368c19f96e9a79089389c15  -

Debian Linux:

openssl rand 32 > key
cat key | md5sum
a44f9c1d1f7a35f2374ad2987296b54b  -
cat key | md5sum -b
a44f9c1d1f7a35f2374ad2987296b54b *-

As far as I can tell, AWS S3 expects the MD5 of the binary key in the base64 form that Alpine Linux prints here:

cat key | md5 -b
kVssbDNowZ+W6aeQiTicFQ==
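Comparing the two Alpine outputs above suggests that the difference is one of encoding rather than of hashing: the base64 string appears to carry the same 16 digest bytes as the hexadecimal one. A minimal sketch of that conversion follows, assuming a POSIX shell and the base64 utility; hex_to_base64 is a hypothetical helper name, not part of any tool discussed here.

```shell
# hex_to_base64 is a hypothetical helper: it turns a hex digest back into
# raw bytes, two hex characters at a time, and base64-encodes the result.
hex_to_base64() {
  hex=$1
  while [ -n "$hex" ]; do
    pair=${hex%"${hex#??}"}   # first two hex characters
    hex=${hex#??}             # remainder of the string
    # Emit the byte via an octal escape, e.g. 0x91 becomes \221.
    printf "\\$(printf '%03o' "0x$pair")"
  done | base64
}

# The hex digest printed by "cat key | md5" on Alpine in the example above.
hex_to_base64 915b2c6c3368c19f96e9a79089389c15
# prints kVssbDNowZ+W6aeQiTicFQ==
```

If openssl is installed, something like `openssl dgst -md5 -binary key | base64` should produce the base64 form directly on systems whose md5 or md5sum only prints hexadecimal.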

I will reach out to Sören Tempel of Alpine Linux to find out what is behind these differences.

Licensed under: CC-BY-SA with attribution