Question

I have a database that I extract data from in my Android project. Some of the text strings contain the Swedish letters å, ä, ö, but they show up as √• (å), √§ (ä), and √∂ (ö). What would be the best way to convert these symbols to the actual letters before I display them in a TextView in the app? Is a substitution, such as replacing √• with å, the way to go? How would that be entered in the query that currently fetches the data:

public Cursor getAlternative1(long categoryid, int questionid) {
    final String MY_QUERY = "SELECT question, image, alternative, questionid, correct FROM tbl_question a INNER JOIN tbl_alternative b ON a._id=b.questionid AND b.categoryid=a.categoryid WHERE a.categoryid=? AND a._id=?";

    Cursor cursor = mDb.rawQuery(MY_QUERY, new String[]{String.valueOf(categoryid), String.valueOf(questionid)});
    if (cursor != null) {
        cursor.moveToFirst();
    }
    return cursor;
}

Thanks for any help!


Solution

It appears that your string data was originally encoded in UTF-8 but is being misinterpreted as MacRoman.

The first thing to do is make sure your data is stored in the database correctly. You can use SELECT HEX(SomeColumn) to see the raw bytes stored for the string. The default encoding in SQLite is UTF-8, so a correctly encoded string will have C3A5 for å, C3A4 for ä, and C3B6 for ö. If you see E2889AE280A2, E2889AC2A7, or E2889AE28882 instead, then the misinterpretation of the characters (å→√•, ä→√§, ö→√∂) is happening before the data gets into the DB. If you just see 8C, 8A, and 9A, then the reverse misinterpretation is being made: the text was stored as raw MacRoman bytes rather than as UTF-8.
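To make the byte patterns above easier to check, here is a minimal Java sketch that prints the UTF-8 bytes of a string as hex, in the same form that HEX() returns for a UTF-8 database. The class name Utf8Hex and the method utf8Hex are made up for this illustration. The strings are written as Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
final class Utf8Hex {

    // Returns the UTF-8 bytes of s as uppercase hex digits.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Correctly stored letters.
        System.out.println(utf8Hex("\u00e5")); // å -> C3A5
        System.out.println(utf8Hex("\u00e4")); // ä -> C3A4
        System.out.println(utf8Hex("\u00f6")); // ö -> C3B6

        // Letters that were already mangled before they were stored.
        System.out.println(utf8Hex("\u221a\u2022")); // √• -> E2889AE280A2
        System.out.println(utf8Hex("\u221a\u00a7")); // √§ -> E2889AC2A7
        System.out.println(utf8Hex("\u221a\u2202")); // √∂ -> E2889AE28882
    }
}
```

If the hex you get from the database matches the second group, the stored text itself is already wrong and has to be re-imported or repaired.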

If the data in your database is correct, then it's probably an I/O routine that thinks the system encoding is UTF-8 when it's really MacRoman. Try something like System.setProperty("file.encoding", "macintosh").
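As for the substitution the question asks about: it can serve as a stopgap while the real cause is being fixed, and it is easier to do in Java after reading the cursor than inside the SQL query. The following is only a sketch under assumptions. The class MojibakeRepair and the method repair are hypothetical names, and the table covers only the three sequences from the question, with ∂ (U+2202) as the second character of the third one because that is what the bytes E28882 decode to.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stopgap helper; fixing the stored data is still the better cure.
final class MojibakeRepair {

    // Mis-decoded sequence -> intended letter (lowercase å, ä, ö only).
    private static final Map<String, String> FIXES = new LinkedHashMap<>();
    static {
        FIXES.put("\u221a\u2022", "\u00e5"); // √• -> å
        FIXES.put("\u221a\u00a7", "\u00e4"); // √§ -> ä
        FIXES.put("\u221a\u2202", "\u00f6"); // √∂ -> ö
    }

    // Replaces every known mis-decoded sequence in text; null is passed through.
    static String repair(String text) {
        if (text == null) {
            return null;
        }
        String result = text;
        for (Map.Entry<String, String> fix : FIXES.entrySet()) {
            result = result.replace(fix.getKey(), fix.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // "fr√•ga" is the mangled form of the Swedish word "fråga".
        System.out.println(repair("fr\u221a\u2022ga"));
    }
}
```

In the app, the value read from the cursor would be passed through repair before it is handed to the TextView. Uppercase letters and any other non-ASCII characters would need their own entries, which is one reason to treat this as temporary.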

OTHER TIPS

This is a fairly old post, but if you are importing the data into SQLite from the Windows cmd shell, try running this in the shell first:

c:> chcp 65001

This changes the code page of the cmd shell to UTF-8. Then run the import:

c:> sqlite3 database.db < inserts.sql

where inserts.sql is a sequence of INSERT statements saved as UTF-8 without a BOM (byte order mark). You can create that kind of file using Notepad++.
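If you are not sure whether the file was saved without a BOM, a quick check is to look at its first three bytes: a UTF-8 BOM is the byte sequence EF BB BF. A minimal Java sketch of that check follows. The class BomCheck and its method names are invented for this example, and the default file name is simply the inserts.sql used above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for checking a file before importing it.
final class BomCheck {

    // True if data begins with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // Reads at most the first three bytes of the file and checks them.
    static boolean fileHasUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[3];
            int filled = 0;
            while (filled < head.length) {
                int read = in.read(head, filled, head.length - filled);
                if (read < 0) {
                    break; // file is shorter than three bytes
                }
                filled += read;
            }
            return filled == head.length && hasUtf8Bom(head);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "inserts.sql");
        System.out.println(fileHasUtf8Bom(file) ? "BOM found" : "no BOM");
    }
}
```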

Hope it helps

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow