I worked around the error by detecting the encoding manually. I peeked at the XML declaration and looked for the encoding
attribute (if present), extracted its value as a String, created a Java Charset
object from it with Charset.forName(), then created a Reader with that charset and an InputSource over that Reader, like this:
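String encoding;   // value of the encoding attribute from the XML declaration
Charset charset;   // Charset.forName(encoding)
[...]
// Decode the bytes ourselves so the parser only ever sees characters
Reader reader = new BufferedReader(new InputStreamReader(inputStream, charset));
InputSource inputSource = new InputSource(reader);
// Has no effect when a character stream is supplied; kept for information only
inputSource.setEncoding(encoding);
SAXParserFactory.newInstance().newSAXParser().parse(inputSource, myHandler);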
Unfortunately, I still don't know why the parser could not recognize the encoding automatically.
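
For anyone wanting to reproduce the step elided as [...] above, here is a minimal sketch of how the declaration could be peeked at. It is not my original code: the class name XmlEncodingSniffer, the method detectCharset, the 256-byte peek limit and the fallback parameter are all made up for illustration. It also assumes the stream supports mark/reset (wrap it in a BufferedInputStream if it does not) and that the document uses an ASCII-compatible encoding, so it will not handle UTF-16 input.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class XmlEncodingSniffer {

    // Matches encoding="..." or encoding='...' inside the XML declaration.
    private static final Pattern ENCODING =
            Pattern.compile("encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // Assumption: the XML declaration fits into the first 256 bytes.
    private static final int PEEK_LIMIT = 256;

    // Returns the declared charset, or the fallback if none is declared or
    // the declared name is unknown. The stream is reset to its start afterwards.
    public static Charset detectCharset(InputStream in, Charset fallback) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(PEEK_LIMIT);
        byte[] head = new byte[PEEK_LIMIT];
        int n = 0;
        int r;
        while (n < PEEK_LIMIT && (r = in.read(head, n, PEEK_LIMIT - n)) != -1) {
            n += r;
        }
        in.reset();

        // ISO-8859-1 maps every byte to one char, so this decoding never fails
        // and is good enough for reading the ASCII-only declaration.
        String text = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        int start = text.indexOf("<?xml");
        int end = start < 0 ? -1 : text.indexOf("?>", start);
        if (end < 0) {
            return fallback;
        }
        Matcher m = ENCODING.matcher(text.substring(start, end));
        if (!m.find()) {
            return fallback;
        }
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return fallback;
        }
    }
}
```

With something like this, charset in the snippet above would be XmlEncodingSniffer.detectCharset(inputStream, StandardCharsets.UTF_8) and encoding would be charset.name(). UTF-8 as the fallback is my choice here because it is the default the XML specification prescribes when there is no declaration and no byte order mark.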