Question

I want to read a Unicode (UTF-8) file and write it back to another file.

The code I used for reading is below (as in "Textscreen in Codename One, how to read text file?"):

final String textFile = "/readme.txt";
String text = "";

InputStream in = Display.getInstance().getResourceAsStream(null, textFile);

if (in != null){
    try {
        text = com.codename1.io.Util.readToString(in);
        in.close();
    } catch (IOException ex) {
        System.out.println(ex);
        text = "Read Error";
    }
}

I even tried

text = com.codename1.io.Util.readToString(in,"UTF-8");

and

DataInputStream dis = new DataInputStream(in);
text = com.codename1.io.Util.readUTF(dis);

But the Unicode text is still not read correctly.

For writing, I am doing:

String content = "Some Unicode String";
OutputStream stream = fs.openOutputStream(path + "/" + fileName);
stream.write(content.getBytes());
stream.close();

and tried,

DataOutputStream dos = new DataOutputStream(stream);
dos.writeUTF(content);

I observed that the generated file is ANSI encoded.

Update: Solution

As per @Shai's reply,

Read:

// Option 1: text file in the package structure
InputStream in = Display.getInstance().getResourceAsStream(null, "/" + textFile);

// Option 2: file in the file system (use this instead of option 1, not together with it)
InputStream in = fs.openInputStream(textFile);


if (in != null) {
  try {
      text = com.codename1.io.Util.readToString(in, "UTF-8"); // Encoding
      in.close();
  } catch (IOException ex) {
      text = "Read Error";
  }
}

Write:

OutputStream stream = fs.openOutputStream(textFile);
stream.write(content.getBytes("UTF-8"));
stream.close();

Solution

The readToString() method reads with UTF-8 encoding. If you encoded the file in one of the ASCII/ANSI encodings, you need to either convert the file to UTF-8 or pass the file's actual encoding to that method.
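To illustrate why the file's encoding and the decoding charset have to match, here is a minimal sketch in plain Java SE (not Codename One specific). The class name and the decode helper are hypothetical, and ISO-8859-1 merely stands in for whichever "ANSI" code page the file really uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeMismatchSketch {
    // Hypothetical helper: decode raw bytes with the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // 0xE9 is e-acute (U+00E9) in ISO-8859-1, but a lone 0xE9 is not valid UTF-8.
        byte[] latin1Bytes = { (byte) 0xE9 };

        String asLatin1 = decode(latin1Bytes, StandardCharsets.ISO_8859_1);
        String asUtf8 = decode(latin1Bytes, StandardCharsets.UTF_8);

        // Decoding with the matching charset recovers the character.
        System.out.println(asLatin1.equals("\u00e9"));
        // Decoding as UTF-8 yields the replacement character U+FFFD instead.
        System.out.println(asUtf8.equals("\uFFFD"));
    }
}
```

This is the kind of mismatch that makes non-ASCII characters appear broken when an ANSI-encoded file is read as UTF-8.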

readUTF from DataInputStream is something completely different: it is designed for encoded data streams, not for text files. DataInputStream in general is not designed for text files in Java; you should use Reader/InputStreamReader for that sort of thing.
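The difference can be seen with a small sketch in plain Java SE. The class and helper names are hypothetical. It reads text through an InputStreamReader with an explicit charset, and it shows that writeUTF emits a 2-byte length prefix before the character data, so its output is not a plain text file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReaderVsWriteUtfSketch {
    // Hypothetical helper: decode an entire stream as UTF-8 text via a Reader.
    static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return sb.toString();
    }

    // Hypothetical helper: the bytes that DataOutputStream.writeUTF produces for s.
    static byte[] writeUtfBytes(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(out)) {
            dos.writeUTF(s);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String content = "h\u00e9llo";
        byte[] plain = content.getBytes(StandardCharsets.UTF_8);
        byte[] framed = writeUtfBytes(content);

        // The writeUTF output carries a 2-byte length prefix that plain UTF-8 text lacks.
        System.out.println(framed.length - plain.length);
        // Plain UTF-8 bytes round-trip cleanly through the Reader.
        System.out.println(readUtf8(new ByteArrayInputStream(plain)).equals(content));
    }
}
```

A text editor opening a file written with writeUTF would see those two extra leading bytes, which is why that API is a poor fit for text files.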

getBytes() uses the platform-specific encoding, which is rarely what you want. You should use getBytes(String) with an explicit encoding name instead.
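As a rough illustration of why the explicit encoding matters on the write side, the sketch below encodes the same string with two charsets and compares the results. It is plain Java SE, the class and helper names are hypothetical, and it uses the StandardCharsets constants for brevity. The getBytes("UTF-8") form from the question is equivalent, and whether StandardCharsets is available on Codename One is not confirmed here.

```java
import java.nio.charset.StandardCharsets;

public class EncodeSketch {
    // Hypothetical helper: encode text as UTF-8 regardless of the platform default.
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String content = "caf\u00e9";

        byte[] utf8 = encodeUtf8(content);
        byte[] latin1 = content.getBytes(StandardCharsets.ISO_8859_1);

        // The accented character takes two bytes in UTF-8 but one in ISO-8859-1.
        System.out.println(utf8.length);
        System.out.println(latin1.length);

        // Bytes written as UTF-8 decode back to the original string when read as UTF-8.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(content));
    }
}
```

With the no-argument getBytes(), which of these byte sequences ends up in the file depends on the platform default, so the same code can produce different files on different devices.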

Licensed under: CC-BY-SA with attribution