There are several things going on:
- `msg.getBytes()` returns the bytes of the string encoded with the "platform's default charset" (which could be UTF-8, UTF-16, or something else): specify the encoding explicitly to avoid confusion! In any case, check `msgBytes.length` to get the true plain text length.
- DES, being a block cipher, pads its output to a block size boundary (8 bytes). With PKCS#5 padding the output is always larger than the plain text (`msgBytes.length`), because the plain text is always padded with 1 to 8 bytes. To see the true encrypted size, check `textEncrypted.length`.
- The encrypted bytes are then encoded using base-64. This step, which is independent of the encryption, inflates the number of bytes required by about 33% (each output character carries only 6 bits). The Java base-64 implementation also adds padding, which is where the trailing "=" character comes from.
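To make the charset point concrete, here is a minimal sketch that measures the same string under two explicitly named encodings. The class name `CharsetLengths` and the 24-character message are made up for illustration and are not taken from the original code.

```java
import java.nio.charset.StandardCharsets;

public class CharsetLengths {
    // Byte length of the string when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Byte length when encoded as UTF-16 (big-endian, no byte order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Hypothetical 24-character ASCII message.
        String msg = "abcdefghijklmnopqrstuvwx";
        System.out.println(utf8Length(msg));  // 24: one byte per ASCII character
        System.out.println(utf16Length(msg)); // 48: two bytes per character
    }
}
```

Passing the charset explicitly means the plain text length no longer depends on the machine the code runs on.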
As long as you (or someone else with the correct algorithm and cipher key) can retrieve the initial string by performing the inverse of each step in reverse order, it works. If a particular step has no inverse operation or cannot be "undone", then something is wrong; but this also means that every step can be tested individually.
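The round trip can be sketched roughly as below. This is an illustration under assumptions, not the original code: the `DES/ECB/PKCS5Padding` transformation, the UTF-8 charset, the class name `RoundTrip`, and the helper names are all chosen here for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class RoundTrip {
    // Assumed transformation; the original code may use a different mode.
    static final String TRANSFORMATION = "DES/ECB/PKCS5Padding";

    // Generates a fresh DES key for the demonstration.
    static SecretKey newKey() {
        try {
            return KeyGenerator.getInstance("DES").generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Forward: string -> bytes -> DES ciphertext -> base-64 text.
    static String encode(String msg, SecretKey key) {
        try {
            byte[] msgBytes = msg.getBytes(StandardCharsets.UTF_8);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key);
            byte[] textEncrypted = cipher.doFinal(msgBytes);
            return Base64.getEncoder().encodeToString(textEncrypted);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Inverse: undo each step in reverse order.
    static String decode(String encoded, SecretKey key) {
        try {
            byte[] textEncrypted = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key);
            byte[] msgBytes = cipher.doFinal(textEncrypted);
            return new String(msgBytes, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String msg = "abcdefghijklmnopqrstuvwx"; // hypothetical message
        String encoded = encode(msg, key);
        System.out.println(decode(encoded, key).equals(msg)); // true
    }
}
```

Because each stage is a separate call, an intermediate value such as the base-64 text can be decoded and compared on its own when the final result looks wrong.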
To the numbers!
- `msg.getBytes()` returns an ASCII/UTF-8 encoded sequence (if it used UTF-16 or another "wide" encoding, the numbers below would be larger)
- Therefore, `msgBytes.length` is 24
- And since `msgBytes.length` mod 8 is 0, the plain text is padded with 8 bytes that each have the value 0x08 (per PKCS#5)
- Thus, `textEncrypted.length` is 32 (24 data + 8 padding)
- Due to base-64 encoding, 32 bytes * 4/3 ≈ 42.67, which rounds up to 43 characters
- And with base-64 padding (`=`), the final result is 44 characters!
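The same arithmetic can be written down as a small sketch. The class `SizeMath` and its method names are hypothetical helpers introduced for this example; the only library call is `java.util.Base64`, used to cross-check the formulas.

```java
import java.util.Base64;

public class SizeMath {
    static final int BLOCK_SIZE = 8; // DES block size in bytes

    // PKCS#5 always adds 1 to 8 bytes, so an exact multiple gains a full block.
    static int pkcs5PaddedLength(int plainLength) {
        return (plainLength / BLOCK_SIZE + 1) * BLOCK_SIZE;
    }

    // Characters needed at 6 bits per character, without "=" padding.
    static int base64UnpaddedLength(int byteCount) {
        return (byteCount * 8 + 5) / 6; // ceil(bits / 6)
    }

    // Padded base-64 output is always a multiple of 4 characters.
    static int base64PaddedLength(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void main(String[] args) {
        int encrypted = pkcs5PaddedLength(24);
        System.out.println(encrypted);                       // 32
        System.out.println(base64UnpaddedLength(encrypted)); // 43
        System.out.println(base64PaddedLength(encrypted));   // 44

        // Cross-check against the JDK encoder using 32 dummy bytes.
        byte[] dummy = new byte[encrypted];
        System.out.println(Base64.getEncoder().encodeToString(dummy).length()); // 44
    }
}
```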