MediaWiki API and encoding
-
16-09-2019 - |
Question
I'm using the MediaWiki API to update some pages with an experimental robot. This robot uses the Java Apache HTTP-client library to update the pages.
(...)
PostMethod postMethod = new PostMethod("http://mymediawikiinstallation/w/api.php");
postMethod.addParameter("action","edit");
postMethod.addParameter("title",page.replace(' ', '_'));
postMethod.addParameter("summary","trying to fix this accent problem");
postMethod.addParameter("text",content);
postMethod.addParameter("basetimestamp",basetimestamp);
postMethod.addParameter("starttimestamp",starttimestamp);
postMethod.addParameter("token",token);
postMethod.addParameter("notminor","");
postMethod.addParameter("format","xml");
int status = httpClient.executeMethod(postMethod);
(...)
However the 'content' string contains some accents. System.out.prinln(content)
looks OK, but the accentuated characters in the wiki look bad. E.g. 'Val�rie' instead of 'Valérie'.
How can I fix this?
Solution
OK, changing the request header fixed the problem.
postMethod.setRequestHeader( "Content-Type", "application/x-www-form-urlencoded; charset=utf-8");
OTHER TIPS
In my PHP code to talk to the Mediawiki API I used urlencode to encode the title parameter, and this seems to work fine.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow