Question

I use the Scrapy framework to crawl data. My crawler gets interrupted when it encounters a 500 error, so I need to check whether a link is available before parsing the page content.
Is there any approach to resolve this problem?
Thank you so much.


Solution

You can check the HTTP status code of a URL with the getcode() method of the response object that urllib returns:

import urllib
import sys

# note: urllib.urlopen() is Python 2; it returns the response
# object even when the server answers with an HTTP error code
webFile = urllib.urlopen('http://www.some.url/some/file')
returnCode = webFile.getcode()  # getcode(), not getCode()

if returnCode == 500:
    sys.exit()

# otherwise, go on and parse the content.
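
Note that urllib.urlopen() only exists in Python 2. In Python 3 the equivalent call is urllib.request.urlopen(), and it raises an HTTPError for status codes such as 500 instead of returning the response. A minimal sketch for Python 3 (the URL is a placeholder):

import sys
import urllib.request
from urllib.error import HTTPError, URLError

try:
    response = urllib.request.urlopen('http://www.some.url/some/file')
except HTTPError as e:
    # 4xx/5xx responses raise HTTPError; e.code holds the status
    if e.code == 500:
        sys.exit()
except URLError:
    # the server could not be reached at all
    sys.exit()
else:
    # 2xx response: safe to read and parse the body
    content = response.read()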
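
Since the question mentions Scrapy specifically, an alternative is to let Scrapy deliver the 500 response to your callback and skip it there, rather than pre-checking every URL with a separate request. A minimal sketch, assuming a hypothetical spider (the class name, spider name, and start URL are placeholders); handle_httpstatus_list is Scrapy's attribute for letting the listed non-2xx statuses reach the callback:

import scrapy

class MySpider(scrapy.Spider):
    name = 'my_spider'  # hypothetical spider name
    start_urls = ['http://www.some.url/some/file']  # placeholder URL

    # let 500 responses reach parse() instead of being filtered out
    handle_httpstatus_list = [500]

    def parse(self, response):
        if response.status == 500:
            # skip pages that returned a server error
            return
        # otherwise parse the page content here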