Question

I want to write a converter that gives me full control over converting PDF to ePub. What I am looking for is a converter that:

  1. should not cut words at the end of a line. If the text is one paragraph in the PDF, it must be one paragraph in the ePub, without any additional space characters or split words.
  2. should recognize the bookmarks in the PDF document and create a table of contents (TOC) in the ePub version.
  3. should resize the images.
  4. should output UTF-8; this is a must for compatibility with Turkish characters.

Can you suggest a library for this?


Solution

You might find Calibre (http://calibre-ebook.com/) useful. It supports PDF input and ePub output, and it has command-line utilities. I suspect, though, that to achieve what you want you will need to pre-process the PDF with one of the standard PDF libraries to extract the required metadata, resize the images, etc.
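As a rough illustration of the kind of pre-processing meant here, the sketch below re-joins hard-wrapped lines into paragraphs and glues back words that were split at a line end. It assumes the text has already been pulled out of the PDF line by line with whatever PDF library you choose, and that blank lines separate paragraphs; both are assumptions, and the function name `unwrap_lines` is made up for this example.

```python
import re


def unwrap_lines(lines):
    """Join hard-wrapped lines into paragraphs.

    Assumption: a blank line marks a paragraph break. Real PDF text
    extraction may need a different heuristic (indentation, line length).
    """
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: close the paragraph collected so far.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-") and line[0].islower():
            # Word split across lines: drop the hyphen, join without a space.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    # Collapse any leftover runs of whitespace inside each paragraph.
    return [re.sub(r"\s+", " ", p) for p in paragraphs]


sample = [
    "Türkçe karakterler doğru gö-",
    "rünmeli ve kelimeler",
    "bölünmemeli.",
    "",
    "İkinci paragraf.",
]
paragraphs = unwrap_lines(sample)
```

Python strings are Unicode, so Turkish characters pass through untouched. One known limitation of this heuristic: a genuine hyphenated compound that happens to break at a line end (such as "well-" / "known") loses its hyphen.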

If you do go to the trouble of doing extensive pre-processing on the PDF, you may as well write the ePub directly, since it will be a relatively small additional step and removes the dependency on Calibre or similar.
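To give an idea of how small that step can be, here is a minimal sketch that writes an EPUB 3 container using only the Python standard library. It assumes you already have chapter headings (for example taken from the PDF bookmarks) and cleaned paragraphs; `write_epub`, the file layout under `OEBPS/`, and the placeholder identifier and date are choices made for this example, not anything prescribed by the question. Images and styling are left out.

```python
import zipfile
from html import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def xhtml_page(title, body):
    """Wrap a body fragment in a minimal UTF-8 XHTML document."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml" '
        'xmlns:epub="http://www.idpf.org/2007/ops">\n'
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>{body}</body>\n</html>\n"
    )


def write_epub(path, title, language, chapters,
               book_id="urn:uuid:00000000-0000-0000-0000-000000000000",
               modified="2000-01-01T00:00:00Z"):
    """Write a bare-bones EPUB 3 file.

    chapters: list of (heading, [paragraph, ...]) tuples.
    book_id and modified are placeholders; supply real values in practice.
    """
    files, manifest, spine, toc = {}, [], [], []
    for i, (heading, paragraphs) in enumerate(chapters, start=1):
        name = f"chap{i}.xhtml"
        body = f"<h1>{escape(heading)}</h1>"
        body += "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
        files[name] = xhtml_page(heading, body)
        manifest.append(
            f'<item id="c{i}" href="{name}" media-type="application/xhtml+xml"/>')
        spine.append(f'<itemref idref="c{i}"/>')
        toc.append(f'<li><a href="{name}">{escape(heading)}</a></li>')

    # The navigation document is what reading systems show as the TOC.
    toc_items = "".join(toc)
    files["nav.xhtml"] = xhtml_page(
        "Contents",
        f'<nav epub:type="toc"><h1>Contents</h1><ol>{toc_items}</ol></nav>')
    manifest.append('<item id="nav" href="nav.xhtml" '
                    'media-type="application/xhtml+xml" properties="nav"/>')

    manifest_xml = "\n    ".join(manifest)
    spine_xml = "\n    ".join(spine)
    files["content.opf"] = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="bookid">{escape(book_id)}</dc:identifier>
    <dc:title>{escape(title)}</dc:title>
    <dc:language>{escape(language)}</dc:language>
    <meta property="dcterms:modified">{escape(modified)}</meta>
  </metadata>
  <manifest>
    {manifest_xml}
  </manifest>
  <spine>
    {spine_xml}
  </spine>
</package>
"""

    with zipfile.ZipFile(path, "w") as z:
        # The mimetype entry must come first and must be stored uncompressed.
        z.writestr("mimetype", "application/epub+zip",
                   compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", CONTAINER_XML,
                   compress_type=zipfile.ZIP_DEFLATED)
        for name, text in files.items():
            # writestr encodes str data as UTF-8.
            z.writestr("OEBPS/" + name, text,
                       compress_type=zipfile.ZIP_DEFLATED)


# Example call (hypothetical data):
# write_epub("book.epub", "Örnek Kitap", "tr", [("Giriş", ["İlk paragraf."])])
```

Every text file is written as UTF-8 and declares that encoding, which covers the Turkish-character requirement. It is worth running the result through an EPUB validator before relying on it, since this sketch only covers the bare minimum of the format.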

Let us know if you do find any good PDF-to-ePub libraries.

OTHER TIPS

I have not had a go at it myself, but you may want to have a look at this.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow