I'm trying to use the requests module in Python to submit a form to a CGI script and can't work out what I've done wrong.

I've used Chrome DevTools to copy the headers and form data the browser sends, but the request still doesn't work.

The site I'm trying to get data from is: http://staffordshirebmd.org.uk/cgi/birthind.cgi

Here's my code:

import requests 

headers = {"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Encoding":"gzip,deflate,sdch",
        "Accept-Language":"en-US,en;q=0.8",
        "Cache-Control":"no-cache",
        "Connection":"keep-alive",
        "Content-Length":"124",
        "Content-Type":"application/x-www-form-urlencoded",
        "DNT":"1",
        "Host":"staffordshirebmd.org.uk",
        "Origin":"http://staffordshirebmd.org.uk",
        "Pragma":"no-cache",
        "Referer":"http://staffordshirebmd.org.uk/cgi/birthind.cgi?county=staffordshire",
        "User-Agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25"}

payload = {"county":"staffordshire",
          "lang": "",
          "year_date":"1837",
          "search_region":"All",
          "sort_by":"alpha",
          "csv_or_list":"screen",
          "letter":"A",
          "submit":"Display Indexes"}

path = "http://staffordshirebmd.org.uk/cgi/birthind.cgi"  # the site named above

f = requests.put(path, data=payload, headers=headers)

f.text

This provides the response:

u'<html>\n<body>\n<div>\n<p>\nThe Bookmark you have used to reach this page is not valid.\n</p>\n<p>\nPlease click <a href="http://staffordshirebmd.org.uk/">here</a> to return to the main page and reset your\nbookmark to that page.\n</p>\n</div>\n</body>\n</html>\n\n'

What am I doing wrong?


Solution

The page at the URL you used in your Referer header contains a form that submits with POST, not PUT.

You should rarely hardcode the Content-Length header; leave it to requests to calculate and set for you. Different browsers can easily send subtly different content lengths, so a script that only works with a fixed Content-Length would not last long.

Removing the Content-Length header and changing .put() to .post() gives me results, in any case.
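As a minimal sketch of those two changes, the snippet below reuses the URL, headers and payload from the question, drops Content-Length, and switches to POST. The names URL, prepared and fetch_index are only illustrative, and the timeout value is an arbitrary choice. Preparing the request without sending it shows that requests works out Content-Length from the encoded body on its own.

```python
import requests

URL = "http://staffordshirebmd.org.uk/cgi/birthind.cgi"

# Same headers as in the question, minus the hardcoded Content-Length.
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "en-US,en;q=0.8",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "Content-Type": "application/x-www-form-urlencoded",
    "DNT": "1",
    "Host": "staffordshirebmd.org.uk",
    "Origin": "http://staffordshirebmd.org.uk",
    "Pragma": "no-cache",
    "Referer": "http://staffordshirebmd.org.uk/cgi/birthind.cgi?county=staffordshire",
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25",
}

payload = {
    "county": "staffordshire",
    "lang": "",
    "year_date": "1837",
    "search_region": "All",
    "sort_by": "alpha",
    "csv_or_list": "screen",
    "letter": "A",
    "submit": "Display Indexes",
}

# Build the request without sending it, to inspect what requests fills in.
prepared = requests.Request("POST", URL, data=payload, headers=headers).prepare()
print(prepared.method)                     # the HTTP verb that will be sent
print(prepared.headers["Content-Length"])  # computed by requests from the body


def fetch_index():
    """Send the form as a POST and return the response HTML (needs network access)."""
    response = requests.post(URL, data=payload, headers=headers, timeout=30)
    response.raise_for_status()
    return response.text
```

Calling fetch_index() performs the actual request. Most of the copied browser headers may not be needed by the server, but that is untested here, so they are left as in the question.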

Licensed under: CC-BY-SA with attribution