StringBuilders ending with mass nul characters

https://stackoverflow.com/questions/5394353

28-10-2019
|

Question

I'm having a very difficult time debugging a problem with an application I've been building. The problem itself I cannot seem to reproduce with a representitive test program with the same issue which makes it difficult to demonstrate. Unfortunately I cannot share my actual source because of security, however, the following test represents fairly well what I am doing, the fact that the files and data are unix style EOL, writing to a zip file with a PrintWriter, and the use of StringBuilders:

public class Tester {

    public static void main(String[] args) {
        // variables
        File target = new File("TESTSAVE.zip");
        PrintWriter printout1;
        ZipOutputStream zipStream;
        ZipEntry ent1;
        StringBuilder testtext1 = new StringBuilder();
        StringBuilder replacetext = new StringBuilder();
        // ensure file replace
        if (target.exists()) {
            target.delete();
        }
        try {
            // open the streams
            zipStream = new ZipOutputStream(new FileOutputStream(target, true));
            printout1 = new PrintWriter(zipStream);
            ent1 = new ZipEntry("testfile.txt");
            zipStream.putNextEntry(ent1);

            // construct the data
            for (int i = 0; i < 30; i++) {
            testtext1.append("Testing 1 2 3 Many! \n");
            }
            replacetext.append("Testing 4 5 6 LOTS! \n");
            replacetext.append("Testing 4 5 6 LOTS! \n");

            // the replace operation
            testtext1.replace(21, 42, replacetext.toString());

            // write it
            printout1 = new PrintWriter(zipStream);
            printout1.println(testtext1);
            // save it
            printout1.flush();
            zipStream.closeEntry();
            printout1.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

The heart of the problem is that the file I see at my side is producing a file of 16.3k characters. My friend, whether he uses the app on his pc or whether he looks at exactly the same file as me sees a file of 19.999k characters, the extra characters being a CRLF followed by a massive number of null characters. No matter what application, encoding or views I use, I cannot at all see these nul characters, I only see a single LF at the last line, but I do see a file of 20k. In all cases there is a difference between what is seen with the exact same files on the two machines even though both are windows machines and both are using the same editing softwares to view.

I've not yet been able to reproduce this behaviour with any amount of dummy programs. I have been able to trace the final line's stray CRLF to my use of println on the PrintWriter, however. When I replaced the println(s) with print(s + '\n') the problem appeared to go away (the file size was 16.3k). However, when I returned the program to println(s), the problem does not appear to return. I'm currently having the files verified by a friend in france to see if the problem really did go away (since I cannot see the nuls but he can), but this behaviour has be thoroughly confused.

I've also noticed that the StringBuilder's replace function states "This sequence will be lengthened to accommodate the specified String if necessary". Given that the stringbuilders setLength function pads with nul characters and that the ensureCapacity function sets capacity to the greater of the input or (currentCapacity*2)+2, I suspected a relation somewhere. However, I have only once when testing with this idea been able to get a result that represented what I've seen, and have not been able to reproduce it since.

Does anyone have any idea what could be causing this error or at least have a suggestion on what direction to take the testing?

Edit since the comments section is broken for me: Just to clarify, the output is required to be in unix format regardless of the OS, hence the use of '\n' directly rather than through a formatter. The original StringBuilder that is inserted into is not in fact generated to me but is the contents of a file read in by the program. I'm happy the reading process works, as the information in it is used heavily throughout the application. I've done a little probing too and found that directly prior to saving, the buffer IS the correct capacity and that the output when toString() is invoked is the correct length (i.e. it contains no null characters and is 16,363 long, not 19,999). This would put the cause of the error somewhere between generating the string and saving the zip file.

Solution

Finally found the cause. Managed to reproduce the problem a few times and traced the cause down not to the output side of the code but the input side. My file reading function was essentially this:

char[] buf;
int charcount = 0;
StringBuilder line = new StringBuilder(2048);
InputStreamReader reader = new InputStreamReader(stream);// provides a line-wise read
BufferedReader file = new BufferedReader(reader);
do { // capture loop
    try {
    buf = new char[2048];
    charcount = file.read(buf, 0, 2048);
    } catch (IOException e) {
    return null; // unknown IO error
    }
    line.append(buf);
} while (charcount != -1);
// close and output

problem was appending a buffer that wasnt full, so the later values were still at their initial values of null. Reason I couldnt reproduce it was because some data filled in the buffers nicely, some didn't.

Why I couldn't seem to view the problem on my text editors I still have no idea of, but I should be able to resolve this now. Any suggestions on the best way to do so are welcome, as this is part of one of my long term utility libraries I want to keep it as generic and optimised as possible.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow