I have a Python 2.7 script that consists of opening/scraping multiple URLs from a database and extracting some information out of the webpages. The code can take up to 8 hours to run and I am dealing with multiple websites.

Every now and then (every 1-2 hours), I randomly get the error IOError: [Errno socket error] [Errno 10060] while trying to open a URL:

IOError: [Errno socket error] [Errno 10060] A connection attempt
failed because the connected party did not properly respond after a
period of time, or established connection failed because connected
host has failed to respond

I tried to avoid the error by adding a 2 sec pause with time.sleep(2) between the URL opening operations but I still get the error. The error seems to be independent of the website from which I am trying to open the URL.

I was looking for a way to prevent my script from crashing with a try/except statement: in the event of a socket error, the script would pause for something like 20 seconds and then retry opening the URL. If the URL opens correctly, the script moves on. I'm using urlopen() to open the URLs.


Solution

As your code raises an IOError, wrap the failing call in a retry loop like the one below, substituting your urlopen() call for the raise:

while True:
    try:
        raise IOError  # replace with your urlopen() call
    except IOError:
        time.sleep(20)
    else:
        break

Other tips

Since this is so rare and only happens every hour or so, your internet connection is the likely culprit. The code you are looking for is:

import time
import urllib2

for url in urls:
    while True:
        try:
            response = urllib2.urlopen(url)
            # Do stuff with response
        except IOError:
            time.sleep(20)
        else:
            # Stop the inner loop once the URL opens without error
            break
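Note that the loop above retries forever, so a permanently dead URL would hang the script. A bounded-retry variant can give up after a fixed number of attempts; this is only a sketch, and the helper name fetch_with_retry and its max_retries/delay parameters are my assumptions, not from the original answer:

```python
import time

def fetch_with_retry(open_url, url, max_retries=5, delay=20):
    """Call open_url(url), retrying on IOError up to max_retries times.

    open_url is any callable that raises IOError on failure
    (e.g. urllib2.urlopen). Returns its result, or re-raises the
    last IOError once the retry budget is exhausted.
    """
    for attempt in range(max_retries):
        try:
            return open_url(url)
        except IOError:
            if attempt == max_retries - 1:
                raise          # give up after the final attempt
            time.sleep(delay)  # wait before retrying
```

In the main loop you could then wrap each call in its own try/except to log and skip URLs that still fail after all retries, instead of crashing the whole 8-hour run.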