I agree with the commenters that compressed data is better kept as a byte[].
However, with a single-byte encoding like ISO-8859-1, which maps every byte value to exactly one char, one can (ab)use a String as a container and convert between byte[] and String without loss.
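To illustrate why ISO-8859-1 works for this trick and UTF-8 does not, here is a minimal sketch that pushes all 256 byte values through a String and back. The class and method names are made up for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    /** Returns all 256 possible byte values, 0x00 through 0xFF. */
    public static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    /** Decodes the bytes with the given charset, then encodes the String again. */
    public static byte[] roundTrip(byte[] data, Charset cs) {
        return new String(data, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        // ISO-8859-1 maps each byte to exactly one char, so nothing is lost.
        System.out.println(Arrays.equals(all, roundTrip(all, StandardCharsets.ISO_8859_1))); // true
        // UTF-8 replaces invalid byte sequences with U+FFFD, so the bytes change.
        System.out.println(Arrays.equals(all, roundTrip(all, StandardCharsets.UTF_8)));      // false
    }
}
```

Compressed output is arbitrary binary data, so it will usually contain byte sequences that are not valid UTF-8; that is why the compressed bytes must not be decoded with UTF-8.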
The following differs from your version in that it explicitly specifies the encoding. For the text itself, UTF-8 is a good choice: it covers the full Unicode range, so no character is lost.
Note the use of the deflate return value, which is the number of compressed bytes actually written to the buffer.
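    import java.io.ByteArrayOutputStream;
    import java.nio.charset.StandardCharsets;
    import java.util.zip.DataFormatException;
    import java.util.zip.Deflater;
    import java.util.zip.Inflater;

    public static String compress(String s) {
        Deflater def = new Deflater(9); // 9 = best compression
        byte[] sbytes = s.getBytes(StandardCharsets.UTF_8);
        def.setInput(sbytes);
        def.finish();
        // For short or incompressible input the deflated data can be larger
        // than the input, so collect the output in a loop instead of relying
        // on a single buffer of sbytes.length bytes.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!def.finished()) {
            int n = def.deflate(buffer); // number of bytes written to buffer
            out.write(buffer, 0, n);
        }
        def.end(); // release native resources
        // ISO-8859-1 maps every byte to one char; append the original length.
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1)
                + "*" + sbytes.length;
    }

    public static String decompress(String s) {
        // The digits after the last '*' are the uncompressed length in bytes.
        int pos = s.lastIndexOf('*');
        int len = Integer.parseInt(s.substring(pos + 1));
        s = s.substring(0, pos);
        Inflater inf = new Inflater();
        byte[] buffer = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] decomp = new byte[len];
        inf.setInput(buffer);
        try {
            inf.inflate(decomp, 0, len);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inf.end(); // release native resources even on failure
        }
        return new String(decomp, StandardCharsets.UTF_8);
    }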
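If the compressed data can stay a byte[], as recommended at the top, neither the charset trick nor the length suffix is needed. Here is a minimal sketch of that variant, assuming the stream classes DeflaterOutputStream and InflaterInputStream from java.util.zip are acceptable; the class and method names are hypothetical, not part of your code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class ByteCompression {
    /** Compresses the UTF-8 bytes of s and returns the raw compressed bytes. */
    public static byte[] compressToBytes(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the stream finishes the deflate stream and flushes it to out.
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(s.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Inflates the bytes and decodes the result as UTF-8. */
    public static String decompressFromBytes(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream iis =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            // read returns the number of bytes inflated, or -1 at end of stream.
            while ((n = iis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // Corrupt input surfaces as a ZipException, which is an IOException.
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because the inflater reads until the end of the compressed stream, the uncompressed length does not have to be stored alongside the data.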