Question

I'm getting this error:

NameError: name 'htmltext' is not defined

It comes from the code below:

from bs4 import BeautifulSoup
import urllib
import urllib.parse

url = "http://nytimes.com"

urls = [url]
visited = [url]

while len(urls) > 0:
        try:
           htmltext = urllib.urlopen(urls[0]).read()
        except:
           print(urls[0])      

        soup = BeautifulSoup(htmltext)    
        urls.pop(0)

        print(soup.findAll('a',href = true))

Answer

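In Python 3.x, urlopen lives in urllib.request, so import urllib.request instead of urllib. As written, urllib.urlopen raises an AttributeError, the bare except swallows it, and htmltext is never assigned; that is why the NameError appears on the BeautifulSoup line. Then, change the line: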

htmltext = urllib.urlopen(urls[0]).read()

to:

htmltext = urllib.request.urlopen(urls[0]).read()

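Finally, change true to True in the findAll call. Python's boolean literal is capitalized, so lowercase true would raise a NameError of its own.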
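
The NameError can also come back whenever a request fails, because the except branch only prints the URL and the loop then goes on to use htmltext anyway. Below is a minimal sketch of one way to make the failure explicit, using only the standard library. The helper names fetch_html and fetch_all are hypothetical (they are not in the original code), and the choice of exceptions to catch is an assumption about which failures you want to skip.

```python
import urllib.error
import urllib.request


def fetch_html(url):
    """Return the response body as bytes, or None if the request fails.

    fetch_html is a hypothetical helper name, not part of the original code.
    """
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except (urllib.error.URLError, ValueError) as exc:
        # URLError (including HTTPError) covers failed requests;
        # ValueError is raised for URLs without a recognizable scheme.
        print(f"could not fetch {url}: {exc}")
        return None


def fetch_all(urls):
    """Fetch each URL in turn and return {url: body} for the ones that worked."""
    pages = {}
    queue = list(urls)
    while queue:
        current = queue.pop(0)
        htmltext = fetch_html(current)
        if htmltext is None:
            continue  # skip this URL so htmltext is never used while unset
        pages[current] = htmltext
    return pages
```

Each value in the returned dict can then be passed to BeautifulSoup(htmltext) as in the original loop, for example after calling fetch_all(["http://nytimes.com"]).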

License: CC-BY-SA with attribution
Not affiliated with StackOverflow