Question

I am trying to convert HTML to PDF using pisa, with the following line of code:

pisa.CreatePDF(htmlCode, pdfFile, xhtml=True )

I get the following error: pdf creation failed with error 'module' object has no attribute 'XHTMLParser'

I have html5lib 1.0b3 installed. It used to work, but something changed (maybe I updated some of the modules). Does anyone know why I keep getting the above error?

When I do not pass xhtml=True, the call succeeds but the generated PDF is all wrong. Can I get around this somehow? Is it possible to convert a web page from XHTML to HTML?

How do I know whether a particular page is XHTML or not?

The last two questions might not make sense, because I do not write HTML code and can only read it.

Thanks for any help.


Solution

There is no XHTMLParser in html5lib, and the source code of pisa indicates that the xhtml=True flag is permanently broken:

if xhtml:
    #TODO: XHTMLParser doesn't see to exist...
    parser = html5lib.XHTMLParser(tree=treebuilders.getTreeBuilder("dom"))
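
To confirm this on your own installation, a small check along the following lines may help. It is only a sketch: module_has_attr is a hypothetical helper name, not part of pisa or html5lib, and it simply reports whether an importable module exposes a given attribute.

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Return True/False if the module imports, or None if it is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed in this environment.
        return None
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # XHTMLParser is the name pisa looks up when xhtml=True is passed.
    print("html5lib.XHTMLParser:", module_has_attr("html5lib", "XHTMLParser"))
    # HTMLParser is the regular parser that html5lib exports.
    print("html5lib.HTMLParser:", module_has_attr("html5lib", "HTMLParser"))
```

If the first line reports False while the second reports True, the installed html5lib has only the regular HTML parser, which matches the error message above.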

Fortunately, XHTML is often valid HTML as well, so you don't need any conversion. Omit the flag and investigate why the generated PDF is all wrong; XHTML is not the problem here.
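
As for telling whether a page is XHTML, a rough heuristic is to look for an XHTML doctype or the XHTML namespace on the html element. The sketch below does that with the standard library and then calls pisa without the xhtml flag. The names looks_like_xhtml and html_to_pdf are hypothetical, and the import assumes the current xhtml2pdf package; older installations may need a different import for pisa.

```python
import re

# An XHTML 1.x doctype uses a public identifier starting with "-//W3C//DTD XHTML".
XHTML_DOCTYPE = re.compile(
    r"<!DOCTYPE\s+html\s+PUBLIC\s+[\"']-//W3C//DTD XHTML", re.IGNORECASE
)
# XHTML documents declare this namespace on the root <html> element.
XHTML_NAMESPACE = re.compile(
    r"<html[^>]*\bxmlns\s*=\s*[\"']http://www\.w3\.org/1999/xhtml[\"']",
    re.IGNORECASE,
)


def looks_like_xhtml(markup):
    """Heuristic only: inspect the start of the document for XHTML markers."""
    head = markup[:2048]
    return bool(XHTML_DOCTYPE.search(head) or XHTML_NAMESPACE.search(head))


def html_to_pdf(markup, pdf_path):
    """Convert markup to a PDF file without the xhtml flag; True on success."""
    # Assumption: pisa is provided by the xhtml2pdf package.
    from xhtml2pdf import pisa

    with open(pdf_path, "wb") as pdf_file:
        status = pisa.CreatePDF(markup, dest=pdf_file)
    # status.err is non-zero when pisa reported errors during conversion.
    return not status.err
```

Either way the page goes through the ordinary HTML parser, so the heuristic is informational only; layout problems in the PDF are more likely caused by CSS that pisa does not support.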

Licensed under: CC-BY-SA with attribution