Question

I am trying to capture http status code 3XX/302 for a redirection url. But I cannot get it because it gives 200 status code.

Here is the code:

import requests
r = requests.get('http://goo.gl/NZek5')
print r.status_code

I suppose this should issue either 301 or 302 because it redirects to another page. I had tried few redirecting urls (for e.g. http://fb.com ) but again it is issuing the 200. What should be done to capture the redirection code properly?

Was it helpful?

Solution

requests handles redirects for you, see redirection and history.

Set allow_redirects=False if you don't want requests to handle redirections, or you can inspect the redirection responses contained in the r.history list.

Demo:

>>> import requests
>>> url = 'https://httpbin.org/redirect-to'
>>> params = {"status_code": 301, "url": "https://stackoverflow.com/q/22150023"}
>>> r = requests.get(url, params=params)
>>> r.history
[<Response [301]>, <Response [302]>]
>>> r.history[0].status_code
301
>>> r.history[0].headers['Location']
'https://stackoverflow.com/q/22150023'
>>> r.url
'https://stackoverflow.com/questions/22150023/http-redirection-code-3xx-in-python-requests'
>>> r = requests.get(url, params=params, allow_redirects=False)
>>> r.status_code
301
>>> r.url
'https://httpbin.org/redirect-to?status_code=301&url=https%3A%2F%2Fstackoverflow.com%2Fq%2F22150023'

So if allow_redirects is True, the redirects have been followed and the final response returned is the final page after following redirects. If allow_redirects is False, the first response is returned, even if it is a redirect.

OTHER TIPS

requests.get allows for an optional keyword argument allow_redirects which defaults to True. Setting allow_redirects to False will disable automatically following redirects, as follows:

In [1]: import requests
In [2]: r = requests.get('http://goo.gl/NZek5', allow_redirects=False)
In [3]: print r.status_code
301

This solution will identify the redirect and display the history of redirects, and it will handle common errors. This will ask you for your URL in the console.

import requests

def init():
    console = input("Type the URL: ")
    get_status_code_from_request_url(console)


def get_status_code_from_request_url(url, do_restart=True):
    try:
        r = requests.get(url)
        if len(r.history) < 1:
            print("Status Code: " + str(r.status_code))
        else:
            print("Status Code: 301. Below are the redirects")
            h = r.history
            i = 0
            for resp in h:
                print("  " + str(i) + " - URL " + resp.url + " \n")
                i += 1
        if do_restart:
            init()
    except requests.exceptions.MissingSchema:
        print("You forgot the protocol. http://, https://, ftp://")
    except requests.exceptions.ConnectionError:
        print("Sorry, but I couldn't connect. There was a connection problem.")
    except requests.exceptions.Timeout:
        print("Sorry, but I couldn't connect. I timed out.")
    except requests.exceptions.TooManyRedirects:
        print("There were too many redirects.  I can't count that high.")


init()

Anyone have the php version of this code?

    r = requests.get(url)
    if len(r.history) < 1:
        print("Status Code: " + str(r.status_code))
    else:
        print("Status Code: 301. Below are the redirects")
        h = r.history
        i = 0
        for resp in h:
            print("  " + str(i) + " - URL " + resp.url + " \n")
            i += 1
    if do_restart:
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top