Question

I have a URL to fetch that returns an HTTP 303 redirect:

import urllib2 as web
import sys

url = 'http://sample.com'

try:
    handle = web.urlopen(url)
except web.HTTPError as e:
    # urlopen raises HTTPError for responses it cannot handle
    print(e.code)
    sys.exit(1)
data = handle.read()
print('Result :')
print(data)

The code above prints 303, so the URL is returning a 303 redirect.

I want it to follow the redirect and fetch the HTML of the destination.

Please help..

Edit :

curl -I http://my303redirecturl.com/

HTTP/1.1 303 See Other
Date: Tue, 23 Aug 2011 04:53:53 IST
Server: Mule Core/3.1.2
Expires: Tue, 23 Aug 2011 04:53:53 IST
http.status: 303
Content-Type: application/json
MULE_ENCODING: UTF-8
Content-Length: 0
Connection: close

Does this help?


Solution

urllib2 should follow 303 redirects by default. Use the following example to test:

import urllib2
url = 'http://phihag.de/2011/so/303/'
print(urllib2.urlopen(url).read())

If the above code prints the content of example.net but your own URL still fails, then the URL in question is not returning a valid 303 redirect. For instance, the response headers posted in the question contain no Location header, so there is no destination to follow. If that is the case, you can use urllib2.build_opener to get an opener that uses your own implementation of BaseHandler instead of the default HTTPRedirectHandler.
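As a rough illustration of the build_opener approach, here is a minimal sketch of a handler that records what each 303 response says before handing over to the stock redirect logic. The names LoggingRedirectHandler and fetch are made up for this example, and the import fallback is only there so the same code loads under urllib2 or under Python 3's urllib.request. It does not invent a destination: if the server sends no Location header, the request still fails, but you can see why.

```python
try:
    import urllib.request as urllib2  # Python 3 home of the urllib2 API
except ImportError:
    import urllib2  # Python 2


class LoggingRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical handler: notes each 303 response, then defers to the default."""

    def __init__(self):
        self.seen = []  # list of (status code, Location header or None)

    def http_error_303(self, req, fp, code, msg, headers):
        # Record the Location header; this is None when the server omitted it.
        self.seen.append((code, headers.get('Location')))
        # Let the stock handler follow the redirect. Without a Location
        # header it returns None, and the opener then raises HTTPError.
        return urllib2.HTTPRedirectHandler.http_error_303(
            self, req, fp, code, msg, headers)


def fetch(url):
    """Return (body or None, recorded 303 responses) for the given URL."""
    handler = LoggingRedirectHandler()
    # Passing an HTTPRedirectHandler subclass replaces the default one.
    opener = urllib2.build_opener(handler)
    try:
        body = opener.open(url).read()
    except urllib2.HTTPError:
        body = None  # e.g. a 303 that could not be followed
    return body, handler.seen
```

Calling fetch(url) on the URL from the question would presumably return None for the body together with a recorded (303, None) entry, which confirms that the missing Location header is the problem.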

OTHER TIPS

This page provides a pretty good summary on how to handle HTTP redirects with urllib.

HTH

EDIT: The article shows how to retrieve the redirection URL, which can then be requested with another urlopen.
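Since the linked article is not reproduced here, the following is only a sketch of that two-step idea, not the article's code: stop urllib2 from following the redirect, read the Location header from the resulting HTTPError, and request that URL with a second urlopen. NoRedirect, resolve_location, redirect_target and fetch_following_once are illustrative names, and the import fallback just lets the code load under urllib2 or Python 3's urllib.request.

```python
try:
    import urllib.request as urllib2  # Python 3 home of the urllib2 API
    from urllib.parse import urljoin
except ImportError:
    import urllib2  # Python 2
    from urlparse import urljoin


class NoRedirect(urllib2.HTTPRedirectHandler):
    """Hypothetical handler that declines to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes the opener raise HTTPError instead.
        return None


def resolve_location(base_url, location):
    """Turn a possibly relative Location value into an absolute URL."""
    if not location:
        return None
    return urljoin(base_url, location)


def redirect_target(url):
    """Return the URL the server redirects to, or None if there is none."""
    opener = urllib2.build_opener(NoRedirect())
    try:
        opener.open(url)
    except urllib2.HTTPError as e:
        if e.code in (301, 302, 303, 307):
            return resolve_location(url, e.headers.get('Location'))
        raise  # some other HTTP error; let the caller see it
    return None  # the server answered without redirecting


def fetch_following_once(url):
    """Fetch url, following at most one redirect by hand."""
    target = redirect_target(url)
    return urllib2.urlopen(target or url).read()
```

This makes one extra request when there is no redirect, which seems an acceptable cost for a diagnostic. Note that redirect_target returns None for a 303 that carries no Location header, as in the curl output in the question.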

Licensed under: CC-BY-SA with attribution