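By the sounds of it (you haven't posted the relevant code, so I can't be certain), you aren't handling character set conversion properly. Java doesn't guess the encoding of your data for you: a String is always Unicode internally, and whenever bytes are turned into a String (or back) you have to name the charset yourself, otherwise the platform default is used.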
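You can do the following to convert it to UTF-8:
String text = new String(originalBytes, "windows-1252");
byte[] utf8Bytes = text.getBytes("UTF-8");
This assumes that originalBytes is a byte[] containing the Windows-1252 encoded text. The decoding step has to name the source encoding (Windows-1252); encoding a String to UTF-8 and immediately decoding it as UTF-8 again just returns the same String.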
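Since I can't see your code, here is a minimal, self-contained sketch of the conversion, assuming the Windows-1252 data reaches you as raw bytes (from a file, socket, and so on). The class name Cp1252ToUtf8 and the method toUtf8 are made up for illustration. It also assumes your JVM ships the windows-1252 charset, which is usual but not strictly guaranteed by the spec.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, named for illustration only.
class Cp1252ToUtf8 {
    // Assumption: the JVM provides windows-1252 (Charset.forName throws if it does not).
    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    // Decode the bytes as Windows-1252, then re-encode the resulting String as UTF-8.
    static byte[] toUtf8(byte[] cp1252Bytes) {
        String text = new String(cp1252Bytes, WINDOWS_1252);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "café" in Windows-1252: the single byte 0xE9 is 'é'.
        byte[] input = {0x63, 0x61, 0x66, (byte) 0xE9};
        byte[] utf8 = toUtf8(input);
        // In UTF-8, 'é' becomes the two bytes 0xC3 0xA9 (-61, -87 as signed bytes).
        System.out.println(Arrays.toString(utf8)); // prints [99, 97, 102, -61, -87]
    }
}
```

The important part is that the source encoding is named when decoding and the target encoding when encoding. If you only ever have a String that was already decoded with the wrong charset, the damage may not be reversible, so fix it at the point where the bytes are first read.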