Question

import re
import urllib2

from bs4 import BeautifulSoup


def main():
    openurl = urllib2.urlopen("http://www.pythonforbeginners.com")
    content = openurl.read()
    code = openurl.code

    soup = BeautifulSoup(content) #I think I need to change something here!!
    print soup
    if soup.body.find(text=re.compile('python', re.IGNORECASE)):
        print "i think it's working"
    openurl.close()

How can I modify this code to use the lxml parser with Beautiful Soup to find a keyword within the body of a web page? The code above works, but it does not use the parser I want.

Solution

To use lxml as your parser, supply 'lxml' as the second argument to the BeautifulSoup constructor. The lxml package must be installed for this to work.

soup = BeautifulSoup(content, 'lxml')
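
For reference, here is a minimal sketch of the whole keyword check with the parser named explicitly. It is written for Python 3 and bs4 with lxml installed, which is an assumption (the question uses Python 2's urllib2; in Python 3, urllib.request.urlopen takes its place). The helper name body_contains and the sample HTML are hypothetical, and the page is an inline string so the example runs without a network connection.

```python
import re

from bs4 import BeautifulSoup


def body_contains(html, keyword, parser="lxml"):
    """Return True if keyword occurs (case-insensitively) in a text node of <body>.

    body_contains is a hypothetical helper for illustration, not part of bs4.
    """
    # The second argument selects the parser; "lxml" requires the lxml package.
    soup = BeautifulSoup(html, parser)
    if soup.body is None:
        # No <body> element was found, so there is nothing to search.
        return False
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    # string= is the newer name for the text= argument used in the question.
    return soup.body.find(string=pattern) is not None


# Hypothetical sample page standing in for the downloaded content.
sample = (
    "<html><head><title>Tutorials</title></head>"
    "<body><p>Learn PYTHON here.</p></body></html>"
)
print(body_contains(sample, "python"))
print(body_contains(sample, "ruby"))
```

Passing the parser name explicitly also keeps the result consistent across machines, since Beautiful Soup otherwise picks whichever parser it finds installed.
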
Licensed under: CC-BY-SA with attribution