Question

I'm converting our iOS app to Android (first time with Android, but long time Java programmer). There's a web service that provides 2 JSON feeds to the application. This web service is written in Python, and the first JSON string is outputted as 'ascii'. This is fine, and the Android app downloads it fine and displays fine. The problem comes with the second one.

Since the JSON is prone to containing non-English characters (accents, punctuation etc), I've outputted it in Python as 'utf-16'. I'm downloading the content as follows in the Android app:

new DownloadTask(new Downloader.Callback() {
        @Override
        public void finishedDownloading(String content) {

            final City[] cities = new Gson().fromJson(content, City[].class);
            Downloader.cities = cities;
            System.out.println("Found " + cities.length + " cities");
            getActivity().runOnUiThread(new Runnable() {
                @Override
                public void run() {
                    setListAdapter(new CityArrayAdapter(getActivity(),
                            R.layout.listview_item_row,
                            cities));
                    pb.dismiss();
                }
            });
        }
    }).execute(Constants.CITIES_URL);

Download Task:

protected String doInBackground(String... sUrl) {
    BufferedReader br = null;
    try {
        URL url = new URL(sUrl[0]);
        br = new BufferedReader(new InputStreamReader(url.openStream()));
        String line = br.readLine();
        String doc = "";
        while (line != null) {
            doc += line + "\r\n";
            line = br.readLine();
        }
        br.close();
        callback.finishedDownloading(doc);

        return doc;
    } catch (MalformedURLException e) {
        System.out.println("Exception: " + e.getMessage());
    } catch (IOException e) {
        System.out.println("Exception: " + e.getMessage());
    }
    return null;
}

I've been reading up about how Java handles Strings, and apparently a String is stored as UTF-16, so I'm not sure why this isn't working properly?

Just to mention about errors, Gson throws an error, but only due to the String being incorrectly displayed. When I've printed the url response to the console, it comes out with '?'s every other character (indicating an encoding error).


Solution

Your problem is the InputStreamReader. Constructed without a charset, it decodes the incoming bytes with the platform default (UTF-8 on Android), which is not what your server is sending. You should tell it explicitly which charset to use. Ideally, read the Content-Type header of the response and use that to pick the charset instead of hardcoding UTF-16 (and is it LE or BE?).
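As a rough sketch of that idea, the code below reads the charset parameter from the Content-Type header and passes it to the InputStreamReader. The class name CharsetAwareDownloader and the helpers charsetFrom, readAll and download are made up for illustration, and the UTF-8 fallback is an assumption for the case where the server sends no charset parameter; adjust it to whatever your service actually emits.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class, not part of the original app.
final class CharsetAwareDownloader {

    // Extracts the charset parameter from a Content-Type value such as
    // "application/json; charset=utf-16". Returns the fallback when the
    // header is missing, has no charset parameter, or names an unknown charset.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Reads the whole stream into a String using an explicit charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder doc = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = br.read(buf)) != -1) {
                doc.append(buf, 0, n);
            }
        }
        return doc.toString();
    }

    // Downloads the body of the given URL, decoding it with the charset the
    // server declares (assumed fallback: UTF-8).
    static String download(String address) throws IOException {
        URLConnection conn = new URL(address).openConnection();
        try (InputStream in = conn.getInputStream()) {
            Charset charset = charsetFrom(conn.getContentType(), StandardCharsets.UTF_8);
            return readAll(in, charset);
        }
    }
}
```

In doInBackground you could then call something like CharsetAwareDownloader.download(sUrl[0]) in place of the manual readLine loop. This only helps if the Python service actually sets a charset in its Content-Type header; if it does not, you are back to agreeing on the encoding out of band.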

To clarify your thoughts about Java using UTF-16 internally: you are correct, but that is beside the point. The issue is that the bytes arriving over the network have to be converted to characters, and that conversion has nothing to do with how Java stores a String internally.
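To make the bytes-versus-characters point concrete, here is a small illustration. It assumes the server sends little-endian UTF-16 without a byte order mark, which is only a guess at what your Python service does; DecodingDemo and its decode method are names invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing that decoding depends on the charset
// you pass in, not on how Java stores Strings internally.
final class DecodingDemo {

    // Converts raw bytes to a String using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the e.
        String original = "caf\u00e9";

        // Simulate what arrives on the wire (assumed: UTF-16LE, no BOM).
        byte[] wire = original.getBytes(StandardCharsets.UTF_16LE);

        // Decoding with the matching charset recovers the text.
        String right = decode(wire, StandardCharsets.UTF_16LE);

        // Decoding the same bytes as UTF-8 keeps each ASCII byte as a
        // character but turns every zero byte into a NUL character, so the
        // result has junk at every other position.
        String wrong = decode(wire, StandardCharsets.UTF_8);

        System.out.println(right.equals(original));
        System.out.println(wrong.equals(original));
        System.out.println((int) wrong.charAt(1));
    }
}
```

The byte array is identical in both calls; only the charset passed to the decoder changes the resulting String. That is consistent with seeing a bad character in every other position when UTF-16 data is read with a single-byte-oriented default.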

Also, you might want to think about using UTF-8, as that tends to be the default Unicode encoding on the web.

Licensed under: CC-BY-SA with attribution