Question

I have the following simple function that appends text to a text file. The text in the code is 1024 characters long:

void AppendToFile(String filename)
{
    String text = "0,1,0,0,1,0,1,1,1,0,1,0,1,0,0,1,0,1,1,1,0,1,1,0,1,0,1,1,1,1,0,1,0,1,0,0,1,1,1,1,0,1,0,0,0,0,0,1,1,1,0,1,1,0,1,0,1,0,1,1,0,1,1,1,0,0,0,0,1,0,0,1,1,0,0,0,1,0,1,0,0,0,1,0,1,1,0,0,1,0,0,1,0,1,1,0,1,0,1,0,0,0,0,1,0,0,1,1,0,1,0,0,0,1,0,0,0,0,1,1,0,0,1,0,1,1,1,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,1,0,1,1,0,0,1,1,0,0,0,0,1,1,1,1,0,1,0,1,1,1,1,0,0,0,1,1,0,1,1,0,0,0,1,0,1,1,1,0,1,1,1,1,0,1,1,0,0,0,0,1,0,0,0,0,1,1,1,1,1,1,0,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,0,1,0,1,1,1,1,1,0,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,1,1,1,0,0,0,0,1,0,0,1,0,1,1,0,1,0,1,0,1,0,1,1,0,1,1,1,1,0,0,1,1,1,1,1,0,1,0,0,0,0,1,0,0,0,1,1,0,0,0,1,1,1,1,0,1,1,1,0,1,1,1,1,0,0,1,0,0,0,0,1,0,1,0,1,1,0,1,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,1,1,0,1,1,1,1,1,1,1,0,1,0,0,1,1,0,0,1,0,1,1,1,0,1,1,1,1,0,0,1,0,1,1,0,0,0,1,1,1,1,1,0,1,0,0,1,0,1,0,0,1,1,0,0,0,0,1,0,1,1,1,1,1,0,1,0,1,1,0,1,1,0,1,1,1,0,0,0,1,1,0,1,0,1,1,0,1,0,0,1,1,0,1,1,1,1,0,0,1,1,1,0,1,1,1,1,1,0,0,1,0,1,1,0,1,0,0,1,0,1,0,1,0,0,1,0,0,1,1,1,1,0,0,1,1,1,1,0,0,1,1,0,1,0,1,1,1,0,0,1,1,";
    System.out.println(text);

    PrintWriter out = null;
    try {
        out = new PrintWriter(new BufferedWriter(new FileWriter(filename, true)));
        out.println(text);
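    } catch (IOException e) {
        e.printStackTrace(); // report the failure instead of silently swallowing it
    } finally {
        if (out != null) {   // out is still null if opening the file failed
            out.close();
        }
    }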
}

Printing to the console works fine. However, when I open the file, it looks like this:

ⰰⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰰⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰰⰱⰰⰱⰱⰱⰱⰰⰱⰰⰱⰰⰰⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰰⰱⰱⰱⰰⰱⰱⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰰⰰⰰⰰⰱⰰⰰⰱⰱⰰⰰⰰⰱⰰⰱⰰⰰⰰⰱⰰⰱⰱⰰⰰⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰱⰰⰰⰰⰰⰱⰰⰰⰱⰱⰰⰱⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰰⰱⰱⰱⰰⰰⰰⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰰⰰⰰⰱⰰⰱⰰⰱⰱⰰⰰⰱⰰⰱⰰⰱⰰⰱⰱⰰⰰⰱⰱⰰⰰⰰⰰⰱⰱⰱⰱⰰⰱⰰⰱⰱⰱⰱⰰⰰⰰⰱⰱⰰⰱⰱⰰⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰱⰱⰰⰰⰰⰰⰱⰰⰰⰰⰰⰱⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰱⰱⰰⰱⰰⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰰⰰⰰⰱⰱⰱⰰⰰⰰⰰⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰱⰱⰰⰱⰰⰰⰰⰰⰱⰰⰰⰰⰱⰱⰰⰰⰰⰱⰱⰱⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰰⰰⰰⰰⰱⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰱⰰⰱⰰⰰⰱⰰⰰⰱⰰⰱⰰⰱⰱⰰⰱⰱⰱⰱⰱⰱⰱⰰⰱⰰⰰⰱⰱⰰⰰⰱⰰⰱⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰰⰰⰰⰱⰱⰱⰱⰱⰰⰱⰰⰰⰱⰰⰱⰰⰰⰱⰱⰰⰰⰰⰰⰱⰰⰱⰱⰱⰱⰱⰰⰱⰰⰱⰱⰰⰱⰱⰰⰱⰱⰱⰰⰰⰰⰱⰱⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰱⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰰⰱⰱⰱⰱⰱⰰⰰⰱⰰⰱⰱⰰⰱⰰⰰⰱⰰⰱⰰⰱⰰⰰⰱⰰⰰⰱⰱⰱⰱⰰⰰⰱⰱⰱⰱⰰⰰⰱⰱⰰⰱⰰⰱⰱⰱⰰⰰⰱⰱ਍

For a shorter string, like:

String text = "0,1,0,0,1,0,1,1,1";

or for another very long string, e.g., 1024 times 'a' it works fine (so the reason is not the length of the string).

I can't understand this. Do you have any explanation?


Solution

The problem is with Notepad. I believe it is still incorrectly detecting the encoding, although Wikipedia claims this is fixed in Windows 7.

In all my tests I compiled and ran with Java 1.6.0_45 on Windows 7 64-bit. The system property file.encoding was Cp1252.

With your original code, the file produced is detected by Sublime Text as UTF-8 but (importantly) the Byte Order Mark (BOM) is missing. Opening the same file in Notepad shows the character placeholder square. Re-saving the file in Sublime Text with the BOM then opening in Notepad gives the expected characters.

Replacing the 0s and the commas with the letter a and opening the file in Notepad, I see Chinese (I think) characters, which fits the Wikipedia information, as I guess I have the correct font. So the encoding is detected incorrectly. In Notepad's Save As dialog, the encoding listed is Unicode, which is really UTF-16 Little Endian (UTF-16LE) - see Setting the default Java character encoding?

Replacing only the 0s with the letter a and opening the file in Notepad, I see squares again, since the incorrectly detected encoding does not map the bytes to valid characters.

Replacing all characters with the letter a works because the detected encoding is ANSI. You can see this by trying Save As in Notepad and looking at the Encoding drop-down.
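The UTF-16LE misdetection can be illustrated with a small sketch. It assumes the file holds one byte per character (true for this text under both Cp1252 and UTF-8) and reads each pair of bytes as one UTF-16LE code unit, as Notepad appears to do. The class and method names are made up for the illustration.

```java
import java.nio.charset.Charset;

class MisdetectionDemo {
    // Encode the text one byte per character, then decode the same bytes
    // as UTF-16LE, which pairs them up: (low byte, high byte) -> one char.
    static String misread(String text) {
        byte[] bytes = text.getBytes(Charset.forName("US-ASCII"));
        return new String(bytes, Charset.forName("UTF-16LE"));
    }

    public static void main(String[] args) {
        // "0," is 0x30 0x2C, "1," is 0x31 0x2C, "\r\n" is 0x0D 0x0A
        String shown = misread("0,1,\r\n");
        for (int i = 0; i < shown.length(); i++) {
            System.out.printf("U+%04X%n", (int) shown.charAt(i));
        }
        // prints U+2C30, U+2C31 and U+0A0D, one per line
    }
}
```

U+2C30 and U+2C31 are the two Glagolitic letters that fill the file in the question, and U+0A0D is the stray character at the end of the line, which comes from the Windows line break written by println.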

Following How to add a UTF-8 BOM in java, I added out.write('\ufeff'); before out.println(text); to write the BOM. With my default encoding, however, the result in Notepad started with a ?, since Notepad was again failing to detect the encoding correctly. It was again detected as ANSI, although at least the rest of the characters displayed as expected.

Adding -Dfile.encoding=UTF-8 and out.write('\ufeff'); finally produced a file that Notepad could decode and display as expected.
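As an alternative to the -Dfile.encoding switch, the charset can be named in the code itself. The following is a minimal sketch, not the method tested above: it swaps FileWriter for an OutputStreamWriter with an explicit "UTF-8" charset, and it assumes the BOM should be written only when the file is new or empty, since the function appends. Utf8Append and appendLine are hypothetical names.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

class Utf8Append {
    // Hypothetical helper: appends one line as UTF-8, independent of file.encoding.
    static void appendLine(String filename, String text) throws IOException {
        File file = new File(filename);
        // Assumption: a BOM belongs only at the very start of the file.
        boolean needsBom = !file.exists() || file.length() == 0;
        PrintWriter out = new PrintWriter(new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file, true), "UTF-8")));
        try {
            if (needsBom) {
                out.write('\ufeff'); // encoded in UTF-8 as the bytes EF BB BF
            }
            out.println(text);
        } finally {
            out.close();
        }
    }
}
```

Calling Utf8Append.appendLine("out.txt", text) repeatedly should then leave a single BOM at the start of the file. The sketch uses try/finally rather than try-with-resources so that it also compiles on Java 6.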

OTHER TIPS

FileWriter uses the system default encoding, which in your case is probably NOT set to UTF-8. One way of fixing this problem is to set the system property

-Dfile.encoding=UTF-8

which will make FileWriter use UTF-8.
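To check which encoding FileWriter will pick up, the default charset can be printed. This is a small sketch; DefaultCharsetCheck and describe are hypothetical names, and the values reported depend on the JVM and the operating system.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {
    // Reports the file.encoding property and the charset the JVM actually uses by default.
    static String describe() {
        return "file.encoding = " + System.getProperty("file.encoding")
                + ", default charset = " + Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once as java DefaultCharsetCheck and once as java -Dfile.encoding=UTF-8 DefaultCharsetCheck should show whether the switch took effect.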

This is more than likely an issue with Notepad.

Notepad (at least on Windows 7, where I've replicated your issue) has a maximum line length of 1024 characters. If you add another 0 to the end of your string, the file displays fine, although Notepad wraps the last character onto a new line.

It is also unlikely to be an encoding issue since by replacing all 0s with As and all 1s with Bs you actually get a similar error:

 ⱡⱢⱡⱡⱢⱡⱢⱢⱢⱡⱢⱡⱢⱡⱡⱢⱡⱢⱢⱢⱡⱢⱢⱡⱢⱡⱢⱢⱢⱢⱡⱢⱡⱢⱡⱡⱢⱢⱢⱢⱡⱢⱡⱡⱡⱡⱡⱢⱢⱢⱡⱢⱢⱡⱢⱡⱢⱡⱢⱢⱡ

Again, adding or removing a character makes it display fine.

Licensed under: CC-BY-SA with attribution