Question

mates!

I have trouble reading Cyrillic text from a file using RandomAccessFile.

Here's a simple program that writes Cyrillic words to a file in the following format:

keyLength, valueLength, key, value

Then the program tries to read this information back, but the output is incorrect:

writing success
keyLength = 10, valueLength = 4
read: килло, гр

UPD: Expected output:

writing success
keyLength = 10, valueLength = 4
read: киллограмм, сала

What is the problem? (Apart from the problem that I have a small brain.)

import java.io.FileNotFoundException;
import java.io.RandomAccessFile;
import java.io.IOException;

public class Main {

    public static void main(String[] args) {
        String fileName = "file.db";
        RandomAccessFile outputFile = null;

        try {
            outputFile = new RandomAccessFile(fileName, "rw");
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        String key = "киллограмм";
        String value = "сала";

        try {
            outputFile.writeInt(key.length());
            outputFile.writeInt(value.length());

            outputFile.write(key.getBytes("UTF-8"));
            outputFile.write(value.getBytes("UTF-8"));
        } catch (IOException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        System.out.println("writing success");

        RandomAccessFile inputFile = null;

        try {
            inputFile = new RandomAccessFile(fileName, "r");
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        int keyLength = 0, valueLength = 0;

        try {
            keyLength = inputFile.readInt();
            valueLength = inputFile.readInt();
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }

        System.out.println("keyLength = " + keyLength + ", valueLength = " + valueLength);
        if (keyLength <= 0 || valueLength <= 0) {
            System.err.println("key or value length is negative");
            System.exit(1);
        }

        byte[] keyBytes = null, valueBytes = null;

        try {
            keyBytes = new byte[keyLength];
            valueBytes = new byte[valueLength];
        } catch (OutOfMemoryError e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        try {
            inputFile.read(keyBytes);
            inputFile.read(valueBytes);
        } catch (IOException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

        try {
            System.out.println("read: " + new String(keyBytes, "UTF-8") + ", " + new String(valueBytes, "UTF-8"));
        } catch (IOException e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }

    }
}

Solution

The issue is this line:

outputFile.writeInt(key.length());

String#length()

Returns the length of this string. The length is equal to the number of Unicode code units in the string.

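In this case it returns 10, which is not the number of bytes required to represent this String in UTF-8. Each of these Cyrillic letters takes 2 bytes in UTF-8, so the key occupies 20 bytes and the value 8. The reader allocates only 10 and 4 bytes, so it decodes the first five letters of the key (килло) and then the next two letters (гр) as the value.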
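To make the difference concrete, here is a minimal sketch comparing the two counts for the key from the question. The class name LengthDemo and its two helper methods are made up for illustration, and the key is written with Unicode escapes only so that the example does not depend on the source file's encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original program.
public class LengthDemo {

    // Number of UTF-16 code units, which is what String#length() reports.
    static int charCount(String s) {
        return s.length();
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteCount(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The key from the question, spelled with Unicode escapes.
        String key = "\u043a\u0438\u043b\u043b\u043e\u0433\u0440\u0430\u043c\u043c";
        System.out.println("chars = " + charCount(key));      // chars = 10
        System.out.println("bytes = " + utf8ByteCount(key));  // bytes = 20
    }
}
```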

What you want is

key.getBytes("UTF-8").length

used as

byte[] keyBytes = key.getBytes("UTF-8");
outputFile.writeInt(keyBytes.length);

Do the same for the value.
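Putting it together, a round trip could look roughly like the sketch below. It keeps the file format from the question but stores byte lengths instead of character counts. The class name RecordRoundTrip, the helpers writeRecord and readRecord, and the temporary file are assumptions made for this example. It also swaps read(byte[]) for readFully(byte[]), since read is allowed to return fewer bytes than the array holds; that change goes beyond the fix described above.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the corrected write/read cycle.
public class RecordRoundTrip {

    // Writes one record as: keyByteLength, valueByteLength, keyBytes, valueBytes.
    static void writeRecord(File file, String key, String value) throws IOException {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        byte[] valueBytes = value.getBytes(StandardCharsets.UTF_8);
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            out.setLength(0); // drop stale content left over from a previous run
            out.writeInt(keyBytes.length);   // byte count, not key.length()
            out.writeInt(valueBytes.length); // byte count, not value.length()
            out.write(keyBytes);
            out.write(valueBytes);
        }
    }

    // Reads the record back and returns {key, value}.
    static String[] readRecord(File file) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            byte[] keyBytes = new byte[in.readInt()];
            byte[] valueBytes = new byte[in.readInt()];
            in.readFully(keyBytes);   // blocks until the whole array is filled
            in.readFully(valueBytes);
            return new String[] {
                new String(keyBytes, StandardCharsets.UTF_8),
                new String(valueBytes, StandardCharsets.UTF_8)
            };
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("record", ".db");
        file.deleteOnExit();
        // Key and value from the question, spelled with Unicode escapes.
        String key = "\u043a\u0438\u043b\u043b\u043e\u0433\u0440\u0430\u043c\u043c";
        String value = "\u0441\u0430\u043b\u0430";
        writeRecord(file, key, value);
        String[] record = readRecord(file);
        System.out.println("read: " + record[0] + ", " + record[1]);
    }
}
```

With this layout the stored lengths for the question's data are 20 and 8 rather than 10 and 4, so the reader allocates enough room for every encoded byte.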

Licensed under: CC-BY-SA with attribution