Your code uses the platform default character encoding when reading the URL content. That default can differ between the IDE and a standalone JVM, which would explain why the output changes. Instead, pass an explicit character encoding to the InputStreamReader. This should be the encoding specified by the response itself (it should be included in the "Content-Type" header). If the character encoding is not included in that header, you need to pick an appropriate default.
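As a rough sketch of the advice above, the charset can be pulled out of the "Content-Type" value with a fallback when it is missing or unknown. The class name CharsetSniffer and the helper charsetFromContentType are hypothetical names introduced for illustration, and UTF-8 as the fallback is an assumption, not something the header guarantees.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class CharsetSniffer {

    // Hypothetical helper: returns the charset named in a Content-Type value
    // such as "text/html; charset=ISO-8859-1", or the fallback if none is usable.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Drop the "charset=" prefix and any surrounding quotes.
                String name = p.substring("charset=".length()).replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Malformed or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.UTF_8; // assumed default
        System.out.println(charsetFromContentType("text/html; charset=ISO-8859-1", fallback));
        System.out.println(charsetFromContentType("text/html", fallback));
    }
}
```

With the code from the question, this could be used as new InputStreamReader(con.getInputStream(), CharsetSniffer.charsetFromContentType(con.getContentType(), StandardCharsets.UTF_8)). Note that this only looks at the HTTP header; a charset declared inside the HTML itself is not considered here.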
Accented characters: difference before and after compilation
04-06-2022
Question
I made a small app that reads the HTML code of a page and displays it to the user.
During development with NetBeans there were no problems at all, but when I use the .jar produced by the IDE after a "Clean Build", I have trouble with accents.
For example, the French word "renégocier" was displayed correctly when run from NetBeans.
But with the jar from the clean build, the word is displayed with the "é" garbled.
Any idea?
EDIT: this is how I read the HTML code:
URL urlObject = null;
URLConnection con = null;
String inputLine;
String codeHTML = "";
urlObject = new URL(UrlToVerification);
con = urlObject.openConnection();
BufferedReader webData = new BufferedReader(new InputStreamReader(con.getInputStream()));
while ((inputLine = webData.readLine()) != null)
{
    codeHTML += inputLine; // Read the HTML code
}
SOLUTION:
Replace:
BufferedReader webData = new BufferedReader(new InputStreamReader(con.getInputStream()));
with:
BufferedReader webData = new BufferedReader(new InputStreamReader(urlObject.openStream(), "UTF-8"));
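To see why the explicit charset matters, here is a small self-contained sketch that decodes the same UTF-8 bytes twice, once as UTF-8 and once as ISO-8859-1 (standing in for a single-byte platform default). The class DecodeDemo and the method readAll are hypothetical names used only for this illustration, and an in-memory stream replaces the network connection. Unlike the loop in the question, this version keeps the line breaks.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DecodeDemo {

    // Reads the whole stream as text using the given charset.
    // Each line is followed by '\n' in the result.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "renegocier" with an e-acute, encoded as UTF-8 like a typical web page.
        byte[] utf8 = "ren\u00e9gocier".getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset the bytes were written in: accent preserved.
        System.out.print(readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8));

        // Decoded with a single-byte charset: the two UTF-8 bytes of the
        // accented letter become two separate characters.
        System.out.print(readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1));
    }
}
```

The same readAll could be given con.getInputStream() from the question's code, which would reuse the connection already opened instead of opening a second one with urlObject.openStream().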
Solution
Licensed under: CC-BY-SA with attribution