Question

If I open an HTML file, base_result.html, with PyQuery using the filename argument, it returns [None] and raises errors when I search it. If I read the same file into a string and pass that instead, everything works.

>>> from pyquery import PyQuery
>>> d = PyQuery(filename='base_result.html')
>>> d
[None]
>>> f = open('base_result.html')
>>> d = PyQuery(f.read())
>>> d
[<html>]

Solution

It's an open issue in PyQuery: https://github.com/gawel/pyquery/issues/22

Some workarounds are mentioned in the linked issue, such as:

>>> from lxml.html import parse
>>> from pyquery import PyQuery as pq
>>> parse("index.html")
<lxml.etree._ElementTree object at 0x108a72f38>
>>> d = pq(parse("index.html").getroot())

or

>>> from pyquery import PyQuery
>>> with open('index.html') as f:
...     d = PyQuery(f.read())
...
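
The read-the-file-yourself workaround can be wrapped in a small helper so the file is always closed and the encoding is explicit. This is only a sketch: read_html and load_pyquery are hypothetical names (not part of PyQuery), and UTF-8 is an assumption about how the file was saved.

```python
from pathlib import Path


def read_html(path, encoding="utf-8"):
    """Return the file's contents as text; the file is closed on return."""
    # The utf-8 default is an assumption; pass the file's real encoding if it differs.
    return Path(path).read_text(encoding=encoding)


def load_pyquery(path, encoding="utf-8"):
    """Build a PyQuery document from a file without using filename=."""
    # Imported here so read_html stays usable even if pyquery is not installed.
    from pyquery import PyQuery

    # Passing the markup as a string sidesteps the filename= code path
    # described in the linked issue.
    return PyQuery(read_html(path, encoding))
```

With this in place, d = load_pyquery('base_result.html') should behave like the string-based call in the question, and selectors such as d('title') can be used as usual.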
Licensed under: CC-BY-SA with attribution