Question

I use the Scrapy framework to crawl data. My crawler is interrupted when it encounters a 500 error, so I need to check whether a link is available before I parse the page content.
Is there an approach that would solve my problem?
Thank you so much.


Solution

You can check whether the URL is reachable by calling the getcode() method on the response object that urllib.urlopen() returns:

import urllib
import sys

# Python 2: urllib.urlopen() does not raise on HTTP error statuses,
# so the status code can be read directly with getcode().
webFile = urllib.urlopen('http://www.some.url/some/file')
returnCode = webFile.getcode()

if returnCode == 500:
    sys.exit()

# otherwise, continue and parse the content
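Note that the snippet above targets Python 2, where urllib.urlopen() returns a response object even for an error status. In Python 3 the function moved to urllib.request.urlopen(), which instead raises urllib.error.HTTPError for 4xx/5xx responses, so the check looks slightly different; a minimal sketch (the URL is the same placeholder as above):

import sys
import urllib.error
import urllib.request

try:
    response = urllib.request.urlopen('http://www.some.url/some/file')
    return_code = response.getcode()
except urllib.error.HTTPError as err:
    # urlopen() raises HTTPError for 4xx/5xx statuses in Python 3.
    return_code = err.code

if return_code == 500:
    sys.exit()

# otherwise, continue and parse the content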
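Since the question crawls with Scrapy, an alternative is to handle the 500 inside the framework rather than pre-checking every URL: the spider attribute handle_httpstatus_list lets those responses reach your callback, where you can inspect response.status and skip the page. A minimal sketch, assuming a hypothetical spider name and URL:

import scrapy

class ExampleSpider(scrapy.Spider):
    name = 'example'  # hypothetical spider name
    start_urls = ['http://www.some.url/some/file']

    # Let 500 responses through to parse() instead of having
    # Scrapy's HttpError middleware drop them.
    handle_httpstatus_list = [500]

    def parse(self, response):
        if response.status == 500:
            return  # skip the broken page; the crawl keeps running
        # ... parse the page content here ...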