Java Inflate inconsistent with large Strings

Question 1

I agree with the commenters that compressed data is better kept as a byte[]. However, with a single-byte encoding such as ISO-8859-1, which maps every byte value to exactly one character, one can (ab)use a String to carry arbitrary bytes and convert back without loss.

The following differs from your version in that it states the encoding explicitly. For the text itself, UTF-8 is a good choice because it covers the full Unicode range.

Note the use of deflate's return value, which is the number of compressed bytes actually written.

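// Requires java.io.ByteArrayOutputStream, java.nio.charset.StandardCharsets, java.util.zip.Deflater
public static String compress(String s) {
    byte[] sbytes = s.getBytes(StandardCharsets.UTF_8);
    Deflater def = new Deflater(Deflater.BEST_COMPRESSION);
    def.setInput(sbytes);
    def.finish();
    // Short or incompressible input can deflate to MORE than sbytes.length bytes,
    // so drain the deflater in a loop rather than trusting a single call.
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    while (!def.finished()) {
        int n = def.deflate(buffer);   // n = bytes actually written
        out.write(buffer, 0, n);
    }
    def.end();   // release native resources
    return new String(out.toByteArray(), StandardCharsets.ISO_8859_1)
            + "*" + sbytes.length;
}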

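public static String decompress(String s) {
    // The uncompressed byte count follows the last '*'.
    int pos = s.lastIndexOf('*');
    int len = Integer.parseInt(s.substring(pos + 1));
    byte[] buffer = s.substring(0, pos).getBytes(StandardCharsets.ISO_8859_1);

    Inflater inf = new Inflater();
    inf.setInput(buffer);
    byte[] decomp = new byte[len];
    try {
        int off = 0;
        // inflate may deliver the output in several pieces, so loop until done.
        while (off < len && !inf.finished()) {
            int n = inf.inflate(decomp, off, len - off);
            if (n == 0) {
                break;   // no progress possible: input is truncated or invalid
            }
            off += n;
        }
    } catch (DataFormatException e) {
        throw new IllegalArgumentException(e);
    } finally {
        inf.end();   // release native resources even on failure
    }
    return new String(decomp, StandardCharsets.UTF_8);
}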

Question 2

The problem is not with Deflater.

The primary problem is this line:

    rta = new String(buffer);

What you are doing is taking an array of bytes (the compressed form of the input string) and decoding it into a String using your platform's default character encoding. This is wrong. For the majority of character encodings, there are byte values or sequences of byte values that cannot be mapped to characters. When you "decode" bytes that are not properly encoded text, the decoder substitutes a replacement character (typically a question mark or U+FFFD) for each unmappable sequence. The original bytes are lost, and there is no way to recover them.

(There are one or two character sets where the decoding / encoding is fully reversible ... and you could use one of them as the encoding scheme when converting the compressed bytes to "text". But that's not the end of it!)
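To make the information loss concrete, here is a minimal sketch that decodes a few arbitrary bytes and encodes them again. The class and method names (CharsetRoundTrip, roundTrip) and the sample bytes are made up for illustration; UTF-8 stands in for "a typical platform default", which is an assumption about your environment.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetRoundTrip {
    // Decode the bytes as text in the given charset, then encode the text back.
    static byte[] roundTrip(byte[] data, Charset cs) {
        return new String(data, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        // Arbitrary binary data; 0x9C and 0xFF are not valid UTF-8 on their own.
        byte[] data = {(byte) 0x78, (byte) 0x9C, (byte) 0xFF, 0x00, 0x41};

        boolean utf8Intact = Arrays.equals(data, roundTrip(data, StandardCharsets.UTF_8));
        boolean latin1Intact = Arrays.equals(data, roundTrip(data, StandardCharsets.ISO_8859_1));

        System.out.println("UTF-8 round trip intact:      " + utf8Intact);
        System.out.println("ISO-8859-1 round trip intact: " + latin1Intact);
    }
}
```

With UTF-8 the invalid bytes are replaced during decoding, so the round trip does not return the original array; with ISO-8859-1 every byte maps to exactly one character and comes back unchanged.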

The second problem is with how you are dealing with the compressed bytes. The deflate(byte[] buffer) method compresses the input data and writes the compressed output into buffer. However, there is no guarantee that N bytes of input will produce N bytes of output. Instead, deflate returns an int giving the number of bytes it actually wrote into buffer.

But your code then takes the entire buffer, including the bytes that weren't written, and turns it into a String (by the unsound procedure described above). You then trim the String to (I presume) get rid of the trailing NUL characters. But trim() removes all whitespace and control characters (anything up to U+0020) from both the start and the end, and some of those characters could be a significant part of the compressed data.
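Both points can be checked with a small sketch. The names here (DeflateLength, deflateInto, trimmedLength) and the sample data are hypothetical; the only library calls used are Deflater.deflate and String.trim.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class DeflateLength {
    // Deflate input into buf and return how many bytes were actually written.
    static int deflateInto(byte[] input, byte[] buf) {
        Deflater def = new Deflater();
        def.setInput(input);
        def.finish();
        int n = def.deflate(buf);
        def.end();
        return n;
    }

    // Length of the string left after viewing raw bytes as chars and trimming.
    static int trimmedLength(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1).trim().length();
    }

    public static void main(String[] args) {
        byte[] input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".getBytes(StandardCharsets.UTF_8);
        byte[] buf = new byte[100];
        int n = deflateInto(input, buf);
        // Only buf[0..n) holds compressed data; the rest is untouched zeros.
        System.out.println("input=" + input.length + " written=" + n + " buffer=" + buf.length);

        // A leading space and a trailing newline are data here, but trim() drops them
        // together with the NUL padding.
        byte[] raw = {0x20, 0x41, 0x0A, 0x00, 0x00};
        System.out.println("before trim=" + raw.length + " after trim=" + trimmedLength(raw));
    }
}
```

The count returned by deflate is the only reliable way to know where the compressed data ends; trimming cannot distinguish padding from payload.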

Basically, what you are doing is unsound. You should not be trying to convert arbitrary bytes into a String. Compressed data is NOT text.

My recommendation is to do one of the following:

- Don't convert the (compressed) byte[] to a String. Keep it as a byte[] ... and deal with the length issue properly.
- Alternatively, use a non-lossy bytes-as-characters encoding scheme; e.g. hex encoding or base64 encoding.
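As a rough sketch of the second option, the following uses java.util.Base64 (available since Java 8) to turn the compressed bytes into text. The class and method names (Base64Compression, compressToBase64, decompressFromBase64) are invented for this example, and the 1024-byte chunk size is an arbitrary choice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class Base64Compression {
    static String compressToBase64(String text) {
        Deflater def = new Deflater(Deflater.BEST_COMPRESSION);
        def.setInput(text.getBytes(StandardCharsets.UTF_8));
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!def.finished()) {
            int n = def.deflate(chunk);      // bytes written in this pass
            out.write(chunk, 0, n);
        }
        def.end();
        // Base64 uses only printable ASCII, so no byte value is lost.
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    static String decompressFromBase64(String encoded) {
        Inflater inf = new Inflater();
        inf.setInput(Base64.getDecoder().decode(encoded));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(chunk);
                // No output and not finished means the input cannot be completed.
                if (n == 0 && !inf.finished()
                        && (inf.needsInput() || inf.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inf.end();                       // release native resources
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Compressed data is NOT text, but Base64 is.";
        String encoded = compressToBase64(original);
        System.out.println(encoded);
        System.out.println(decompressFromBase64(encoded).equals(original));
    }
}
```

Base64 grows the compressed data by about a third, which is the price of carrying binary data in a String. Because the output is collected in a growing stream, no length suffix is needed.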