Converting MS Word document's special characters to HTML

StackOverflow https://stackoverflow.com/questions/10811758
  •   encoding

Question

I have a Word document and the following code, which converts the document to HTML using the Apache POI API:

   serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
   serializer.setOutputProperty(OutputKeys.METHOD, "html");
   serializer.transform(domSource, streamResult);

However, the list numbering, such as 1), a), and i), and the bullet characters are not converted correctly. I get garbage characters like "1?", and when I open the HTML file in an editor, the numbers appear with unwanted boxes. I have tried many things but have not found a proper solution.

Please help me get rid of this encoding issue.

Thanks
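
For anyone trying to reproduce the symptoms described above, here is a minimal sketch of two things worth checking. It is not a confirmed fix. It assumes (a) that the "?" characters appear because the output goes through a Writer or file whose charset is not UTF-8, and (b) that the boxes come from Word's Symbol-font bullet, which is commonly stored as the private-use code point U+F0B7. The class name HtmlEncodingSketch and the helpers normalizeWordBullets, toUtf8Html, and sampleDocument are hypothetical; the small DOM built here only stands in for the document that Apache POI would produce.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class HtmlEncodingSketch {

    // Assumption: the Symbol-font bullet arrives as private-use U+F0B7,
    // which most fonts cannot draw. Map it to the standard bullet U+2022.
    static String normalizeWordBullets(String text) {
        return text.replace('\uF0B7', '\u2022');
    }

    // Serializes the DOM to HTML bytes. Passing an OutputStream (not a Writer)
    // lets the transformer apply the ENCODING property to the bytes itself.
    static byte[] toUtf8Html(Document doc) throws Exception {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Hypothetical stand-in for the converted Word document: one paragraph.
    static Document sampleDocument(String paragraphText) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.setTextContent(paragraphText);
        body.appendChild(p);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String text = normalizeWordBullets("\uF0B7 first item");
        byte[] bytes = toUtf8Html(sampleDocument(text));
        // Decode with the same charset that was used for writing.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

If the bytes are later written to a file, writing them unchanged (for example with Files.write) and opening the file as UTF-8 in the editor keeps both ends of the pipeline on the same encoding.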

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow