Question

I'm using jodconverter and OpenOffice 3.3 to convert a docx file to pdf using the following command:

java -jar jodconverter-cli-2.2.2.jar test.docx test.pdf

It works fine for most languages, but I can't convert documents written in Chinese or Japanese. When I try, I get gibberish instead of text:

(screenshot of the garbled PDF output)

I tried installing the Japanese and Chinese OpenOffice language packs, and I also tried jodconverter 3.0 beta 4, but I got the same output.

Test file

How can I add support for those languages?

Is there another tool (preferably open source) I could use?

Solution

It appears that there is a bug in the OpenOffice version I was using. I got it to work by doing the following:

  1. Update to the latest OpenOffice.
  2. Download the missing TTF font files.
  3. Follow this guide (or this) to install Windows TrueType fonts on Linux.
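
As a rough sketch of step 3, a per-user install usually comes down to copying the font file into a directory that fontconfig scans and then rebuilding the font cache. The function name install_font and the default directory ~/.fonts are assumptions for illustration; newer fontconfig setups may use ~/.local/share/fonts instead, and the guides linked above may describe a system-wide install.

```shell
#!/bin/sh
# Sketch: install one TrueType font for the current user.
# install_font is a hypothetical helper name; $HOME/.fonts is an assumed
# default and may need to be ~/.local/share/fonts on newer systems.
install_font() {
    font_file="$1"
    font_dir="${2:-$HOME/.fonts}"

    # Refuse to continue if the font file does not exist
    if [ ! -f "$font_file" ]; then
        echo "font file not found: $font_file" >&2
        return 1
    fi

    mkdir -p "$font_dir" || return 1
    cp "$font_file" "$font_dir/" || return 1

    # Rebuild the fontconfig cache for that directory when fc-cache is available
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache -f "$font_dir" >/dev/null 2>&1 ||
            echo "warning: fc-cache failed for $font_dir" >&2
    fi
    return 0
}

# Example: install_font MingLiU.ttf
```

If OpenOffice is already running as a service, it will probably need a restart before it sees the new font.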

To make sure that the fonts are installed correctly, run:

fc-list

To list only the fonts that support a specific language, use :lang= followed by the language code. For example, for Hindi:

fc-list :lang=hi
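
The same check can be looped over several languages at once, for example the ones that fail to convert. This is only a sketch: check_lang_fonts is a hypothetical helper, and the FC_LIST variable is an assumed override (defaulting to fc-list) so the command can be swapped out when testing.

```shell
#!/bin/sh
# Sketch: report which language codes have no installed font.
# check_lang_fonts and FC_LIST are illustrative names, not part of fontconfig.
check_lang_fonts() {
    fc_list="${FC_LIST:-fc-list}"
    missing=0
    for lang in "$@"; do
        # fc-list prints one line per matching font; empty output means
        # no installed font claims support for this language
        if [ -n "$("$fc_list" ":lang=$lang" 2>/dev/null)" ]; then
            echo "$lang: ok"
        else
            echo "$lang: no font found"
            missing=1
        fi
    done
    # Non-zero exit status if at least one language has no font
    return "$missing"
}

# Example: check_lang_fonts zh ja ko
```

A non-zero exit status means at least one of the listed languages has no matching font, which would explain garbled output for documents in that language.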

I got the conversion to work after adding the MingLiU.ttf font and the baekmuk-ttf-fonts rpm package.

Licensed under: CC-BY-SA with attribution