Question

I have a .txt file and I want to convert it to UCS-2.
What is the correct way to do the conversion?
The file is about 700 MB, so I cannot open it in Notepad++ and convert it there.

Please suggest.

Solution

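First of all: Notepad++ reports the file as ANSI, but "ANSI" is not the name of a single character encoding; it refers to whatever the system's default Windows code page is. According to this SO answer and various others, in this case that appears to be Windows-1252.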
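To illustrate why the exact source encoding matters, here is a minimal sketch that decodes the same two bytes as Windows-1252 and as ISO-8859-1. The class name, the helper and the sample bytes are made up for the illustration; only the charset names come from the Java platform.

```java
import java.nio.charset.Charset;

final class AnsiDecodeDemo {
    // Decodes raw bytes with the charset of the given name.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Formats each UTF-16 code unit as U+XXXX so the console encoding does not matter.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("U+%04X ", (int) s.charAt(i)));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x80, (byte) 0xE9}; // hypothetical sample bytes
        System.out.println(codeUnits(decode(raw, "windows-1252"))); // U+20AC U+00E9
        System.out.println(codeUnits(decode(raw, "ISO-8859-1")));   // U+0080 U+00E9
    }
}
```

Byte 0x80 is the euro sign in Windows-1252 but a control character in ISO-8859-1, so choosing the wrong source charset silently changes the text.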

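As to UCS-2, it has been superseded by UTF-16, which can encode more code points: UCS-2 is limited to the Basic Multilingual Plane (BMP), while UTF-16 reaches the remaining code points through surrogate pairs. Every character in Windows-1252 lies in the BMP, so for this file UTF-16 produces the same bytes UCS-2 would, and using UTF-16 is OK here.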
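The claim that Windows-1252 text never needs surrogate pairs can be checked with a small sketch like the following. The class and method names are hypothetical; the idea is simply to decode all 256 byte values and confirm that each becomes exactly one 16-bit code unit.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BmpCheck {
    // Returns true if every byte value of a single-byte charset decodes to a
    // non-surrogate char, i.e. to exactly two bytes of UTF-16.
    static boolean fitsInUcs2(Charset singleByteCharset) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        // Undefined byte values are replaced by the decoder, not dropped.
        String decoded = new String(all, singleByteCharset);
        for (int i = 0; i < decoded.length(); i++) {
            if (Character.isSurrogate(decoded.charAt(i))) {
                return false;
            }
        }
        return decoded.getBytes(StandardCharsets.UTF_16LE).length == 2 * all.length;
    }

    public static void main(String[] args) {
        System.out.println(fitsInUcs2(Charset.forName("windows-1252")));
    }
}
```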

However, UTF-16, like UCS-2, depends on endianness. We will assume little endian here.
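As a quick illustration of what endianness changes, the sketch below prints the bytes Java produces for the letter "A" under the three standard UTF-16 charsets. The class and the hex helper are made up for the example.

```java
import java.nio.charset.StandardCharsets;

final class EndiannessDemo {
    // Renders bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16LE))); // 41 00
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16BE))); // 00 41
        // Plain UTF_16 encodes big endian and prepends a byte order mark.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16)));   // FE FF 00 41
    }
}
```

Note that UTF_16LE and UTF_16BE write no byte order mark, so a reader has to be told the byte order some other way.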

Therefore:

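import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// The statements below belong in a method that declares "throws IOException".
final Path src = Paths.get("/path/to/original/file.txt");
final Path dst = Paths.get("/path/to/destination/file.txt");

final char[] buf = new char[1 << 20]; // 1 Mi chars, i.e. 2 MiB of memory
int nrChars;

try (
    // Decode the source as Windows-1252...
    final BufferedReader reader = Files.newBufferedReader(src,
        Charset.forName("windows-1252"));
    // ...and encode as UTF-16 little endian (no BOM is written).
    // TRUNCATE_EXISTING avoids stale bytes if dst already exists.
    final BufferedWriter writer = Files.newBufferedWriter(dst,
        StandardCharsets.UTF_16LE, StandardOpenOption.CREATE,
        StandardOpenOption.TRUNCATE_EXISTING);
) {
    while ((nrChars = reader.read(buf, 0, buf.length)) != -1)
        writer.write(buf, 0, nrChars);
    writer.flush();
}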

This should work.
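For a file this large, one cheap sanity check is to compare sizes: each Windows-1252 byte becomes one char, and each such char becomes two bytes in UTF-16LE, so a BOM-less destination should be exactly twice the size of the source. The helper below is a hypothetical sketch of that check, not part of the conversion itself, and it does not hold if a byte order mark is written or line endings are rewritten.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SizeCheck {
    // True if dst is exactly twice as large as src, which is what a
    // Windows-1252 to UTF-16LE conversion without a BOM should produce.
    static boolean sizesMatch(Path src, Path dst) throws IOException {
        return Files.size(dst) == 2 * Files.size(src);
    }

    public static void main(String[] args) throws IOException {
        // Usage (assumed argument order): java SizeCheck <source> <destination>
        Path src = Paths.get(args[0]);
        Path dst = Paths.get(args[1]);
        System.out.println(sizesMatch(src, dst) ? "sizes match" : "size mismatch");
    }
}
```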

Other answers

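/* This is how I did it in Java; it is almost the same as the answer above. */

import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// try-with-resources closes both files even if an exception is thrown.
try (BufferedReader br = Files.newBufferedReader(Paths.get("sourceFile.txt"),
         Charset.forName("windows-1252")); // assumed source encoding, as in the answer above
     PrintWriter writer = new PrintWriter("destinationfile.txt", "UTF-16LE")) {
    writer.write("\uFEFF"); // byte order mark; encoded as the bytes FF FE
    String line;
    while ((line = br.readLine()) != null) {
        writer.write(line);
        writer.write("\r\n"); // note: every line ending is rewritten as CRLF
    }
} catch (IOException e) {
    e.printStackTrace();
}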
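The "\uFEFF" written at the start of the output is a byte order mark, which in UTF-16LE is stored as the two bytes FF FE. A minimal way to confirm that a converted file starts with it might look like this; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class BomCheck {
    // True if the file begins with FF FE, the UTF-16 little-endian byte order mark.
    static boolean startsWithUtf16LeBom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int first = in.read();  // -1 if the file is empty
            int second = in.read();
            return first == 0xFF && second == 0xFE;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage (assumed): java BomCheck <file>
        System.out.println(startsWithUtf16LeBom(Paths.get(args[0])));
    }
}
```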
Licensed under: CC-BY-SA with attribution