Question

I'm aware that urllib2 is available on Google App Engine as a wrapper around urlfetch and, as you know, Universal Feedparser uses urllib2.

Do you know of any way to set a timeout on urllib2?
Has the timeout parameter of urllib2 been ported to the Google App Engine version?

I'm not interested in methods like:

rssurldata = urlfetch(rssurl, deadline=..)
feedparser.parse(rssurldata)

Solution

There's no simple way to do this, as far as I know: the wrapper doesn't provide a way to pass the timeout value through. One hackish option would be to monkeypatch the urlfetch API:

from google.appengine.api import urlfetch

old_fetch = urlfetch.fetch
def new_fetch(url, payload=None, method=urlfetch.GET, headers={},
              allow_truncated=False, follow_redirects=True,
              deadline=10.0, *args, **kwargs):
    # Same signature as urlfetch.fetch, but with a 10-second default deadline.
    return old_fetch(url, payload, method, headers, allow_truncated,
                     follow_redirects, deadline, *args, **kwargs)
urlfetch.fetch = new_fetch
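
With the patch applied, anything that goes through the urllib2-to-urlfetch wrapper, including feedparser, picks up the 10-second deadline. A minimal sketch of what that looks like, assuming feedparser is bundled with your app and rssurl is the feed URL from the question:

import feedparser

# The monkeypatch above must run before this call; feedparser's urllib2
# requests are served by urlfetch, which now defaults to deadline=10.0.
rssurldata = feedparser.parse(rssurl)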

OTHER TIPS

I prefer this approach; it's more robust against GAE API updates.

# -*- coding: utf-8 -*-
from google.appengine.api import urlfetch

import settings


def fetch(*args, **kwargs):
    """
    Base fetch func with default deadline settings
    """
    fetch_kwargs = {
        'deadline': settings.URL_FETCH_DEADLINE
    }
    fetch_kwargs.update(kwargs)
    return urlfetch.fetch(
        *args, **fetch_kwargs
    )
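
Callers then import this wrapper instead of calling urlfetch.fetch directly. A usage sketch, assuming the module above lives at myapp/http.py and that settings.URL_FETCH_DEADLINE is defined (both names are illustrative):

from myapp import http

# Uses settings.URL_FETCH_DEADLINE unless a deadline is passed explicitly.
result = http.fetch('http://example.com/feed.xml')
result = http.fetch('http://example.com/feed.xml', deadline=5)
print result.status_code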

You can set the default deadline, which is the preferred way:

from google.appengine.api import urlfetch
import urllib, urllib2


class MyClass():

    def __init__(self):
        urlfetch.set_default_fetch_deadline(10)

I use a urllib2 opener to enable the CookieJar, but you can also just make simple requests:

response = self.opener.open(self.url_login, data_encoded)

You can easily see the effect if you set the deadline to 0.1 seconds.
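
For completeness, here is a minimal sketch of how the cookie-enabled opener above might be wired up; the login URL is illustrative and the attribute names simply mirror the snippet:

import cookielib
import urllib2

from google.appengine.api import urlfetch


class MyClass():

    def __init__(self):
        # Every urlfetch-backed request now times out after 10 seconds;
        # set this to 0.1 to watch the deadline kick in.
        urlfetch.set_default_fetch_deadline(10)
        self.url_login = 'http://example.com/login'  # illustrative
        self.cookiejar = cookielib.CookieJar()
        self.opener = urllib2.build_opener(
            urllib2.HTTPCookieProcessor(self.cookiejar))

    def login(self, data_encoded):
        return self.opener.open(self.url_login, data_encoded)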

Have you tried setting the socket timeout value? Taken from here:

As of Python 2.3 you can specify how long a socket should wait for a response before timing out. This can be useful in applications which have to fetch web pages. By default the socket module has no timeout and can hang. Currently, the socket timeout is not exposed at the httplib or urllib2 levels. However, you can set the default timeout globally for all sockets using:

import socket
import urllib2

# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)

# this call to urllib2.urlopen now uses the default timeout
# we have set in the socket module
req = urllib2.Request('http://www.voidspace.org.uk')
response = urllib2.urlopen(req)

I'm not sure if GAE reads this value, but it's worth a shot!

Edit:

urllib2 has the ability to pass a timeout parameter:

The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This actually only works for HTTP, HTTPS, FTP and FTPS connections.
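
For reference, this is how the parameter looks on stock Python 2.6+; whether GAE's urlfetch-backed urllib2 honours it is exactly what is in question, so treat it as a sketch rather than a guarantee:

import urllib2

# timeout is given in seconds (available since Python 2.6)
response = urllib2.urlopen('http://www.voidspace.org.uk', timeout=10)
print response.getcode()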

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow