Question

I'm trying to convert individual pieces of HTML code to the XML format used by *.odt files (OpenOffice). For example, <p>This is some text</p> should be translated to <text:p>This is some text</text:p>. Of course, this should also work with lists etc.

I'm not sure whether the best way to go would be to use an XSLT processor (and if so, which one for Java?) and create the stylesheet myself. Isn't there a Java library out there that can already do this? I'm using jodconverter to go from ODT->PDF, but even though OpenOffice Writer can handle copy&pasting the content and display it in the desired way, jodconverter doesn't seem to be able to "translate" single pieces of HTML (or am I wrong about that?).

Any ideas and suggestions would be very welcome. I should add that I'm absolutely new to Java. Thanks in advance, Ingo


Solution

XSLT is the best way to do this. The OpenDocument group is working on an HTML-to-ODT XSL stylesheet. Sadly, it is not ready yet.

You can check their website to keep up to date (and maybe get access to beta work).

Otherwise, there are unofficial projects, also based on XSLT, like this one. It would be easy to apply a small transformation to your HTML to get valid XHTML before converting it to ODT.

Or just check this other example.
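
To give an idea of what the XSLT route looks like from Java, here is a minimal sketch using the XSLT processor that ships with the JDK (javax.xml.transform). The class name HtmlToOdtSketch, the convert method, and the tiny inline stylesheet are made up for illustration and are not taken from any of the projects mentioned above. It assumes the input is already well-formed XML without an XHTML namespace, and it only maps <p>, <ul>, <ol> and <li>; a real stylesheet would need many more rules (nested paragraphs inside list items, inline formatting, styles).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: turns a well-formed HTML fragment into ODT text markup.
final class HtmlToOdtSketch {

    // Illustrative stylesheet (an assumption, not an official one).
    // Only p, ul, ol and li are mapped; other elements are dropped and
    // their text content is passed through by XSLT's built-in rules.
    private static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:text='urn:oasis:names:tc:opendocument:xmlns:text:1.0'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='p'>"
      + "<text:p><xsl:apply-templates/></text:p>"
      + "</xsl:template>"
      + "<xsl:template match='ul|ol'>"
      + "<text:list><xsl:apply-templates select='li'/></text:list>"
      + "</xsl:template>"
      + "<xsl:template match='li'>"
      + "<text:list-item><text:p><xsl:apply-templates/></text:p></text:list-item>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to one fragment and returns the result as a string.
    static String convert(String xhtmlFragment) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xhtmlFragment)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Typically means the fragment was not well-formed XML.
            throw new IllegalArgumentException("Could not transform fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("<p>This is some text</p>"));
        System.out.println(convert("<ul><li>one</li><li>two</li></ul>"));
    }
}
```

Note that the outermost element of each result carries an xmlns:text declaration, so the output is a namespaced text:p or text:list element rather than a bare <text:p>. Real-world HTML is often not well-formed, which is why a clean-up step to XHTML has to come first.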

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow