I strongly suspect you're not using readUTF the way it's expected to be used. Did you read exactly what it does?
The first two bytes are read, starting from the current file pointer, as if by readUnsignedShort. This value gives the number of following bytes that are in the encoded string, not the length of the resulting string. The following bytes are then interpreted as bytes encoding characters in the modified UTF-8 format and are converted into characters.
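To see that layout concretely, here is a minimal sketch that writes strings with writeUTF and inspects the raw bytes. The class name ReadUtfLayout and the helpers encode and prefix are made up for illustration. It uses DataOutputStream rather than your file, on the assumption that it writes the same format readUTF expects.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ReadUtfLayout {
    // Hypothetical helper: returns the exact bytes writeUTF emits for the text.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        }
        return buffer.toByteArray();
    }

    // Hypothetical helper: decodes the two-byte big-endian length prefix,
    // the same value readUnsignedShort would return.
    static int prefix(byte[] encoded) {
        return ((encoded[0] & 0xFF) << 8) | (encoded[1] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        byte[] ascii = encode("abc");
        byte[] accented = encode("\u00e9"); // one char, two encoded bytes
        System.out.println(ascii.length + " bytes, prefix " + prefix(ascii));       // 5 bytes, prefix 3
        System.out.println(accented.length + " bytes, prefix " + prefix(accented)); // 4 bytes, prefix 2
    }
}
```

Note that the prefix counts encoded bytes, not characters, so the one-character string has a prefix of 2. If your file doesn't start each entry with such a prefix, readUTF will misinterpret whatever two bytes it finds there.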
Does that match what's stored in your file? (You haven't specified anything about the format of the file.)
Given that UTF-8 is not fixed width, it sounds inappropriate for your scenario.
I'd suggest using 32 bytes per entry, which will always give 16 char values as UTF-16 code units. You can convert this very simply using new String(data, "UTF-16BE") and text.getBytes("UTF-16BE") (or use LE instead of BE if you want). That way you'll have a genuinely fixed-length string, in terms of bytes, not just characters.
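As a rough sketch of that approach, assuming shorter strings are padded with U+0000 (the padding scheme is my assumption; pick whatever suits your data). The class FixedWidthEntry and its method names are hypothetical. It uses StandardCharsets.UTF_16BE rather than the "UTF-16BE" string to avoid the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FixedWidthEntry {
    static final int CHARS_PER_ENTRY = 16;
    static final int BYTES_PER_ENTRY = CHARS_PER_ENTRY * 2; // 2 bytes per UTF-16 code unit

    // Encodes text as exactly 32 bytes, zero-padded on the right.
    static byte[] toBytes(String text) {
        if (text.length() > CHARS_PER_ENTRY) { // length() counts UTF-16 code units
            throw new IllegalArgumentException("more than 16 UTF-16 code units");
        }
        byte[] encoded = text.getBytes(StandardCharsets.UTF_16BE);
        return Arrays.copyOf(encoded, BYTES_PER_ENTRY); // pads with zero bytes
    }

    // Decodes a 32-byte entry and strips the assumed U+0000 padding.
    static String fromBytes(byte[] data) {
        if (data.length != BYTES_PER_ENTRY) {
            throw new IllegalArgumentException("entry must be exactly 32 bytes");
        }
        String padded = new String(data, StandardCharsets.UTF_16BE);
        int end = padded.indexOf('\0');
        return end < 0 ? padded : padded.substring(0, end);
    }

    public static void main(String[] args) {
        byte[] entry = toBytes("hello");
        System.out.println(entry.length);     // 32
        System.out.println(fromBytes(entry)); // hello
    }
}
```

With every entry the same size, entry n starts at byte offset n * 32, so you can seek straight to it and read 32 bytes with readFully instead of readUTF.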