Question

In one of the answers that I have received here, I encountered a problem of not knowing how to pass automatically through "Google App Engines" my ID and a password to a website, on which I am a registered user and have an account. A suggestion was given to me to "check for an HTTP status code of 401, "authorization required", and provide the kind of HTTP authorization (basic, digest, whatever) that the site is asking for". I don't know how to check for status code. Can anyone, please, tell me how to do it?

+++++++++++++++++++++++++++++++++

Additional Information:

If I use this way in Google App Engine (fetching the url of my eBay summary page):

from google.appengine.api import urlfetch
url = "http://my.ebay.com/ws/eBayISAPI.dll?MyEbay&gbh=1&CurrentPage=MyeBaySummary&ssPageName=STRK:ME:LNLK"
result = urlfetch.fetch(url)
if result.status_code == 200:
   print "content-type: text/plain"
   print
   print result.status_code

I always get "200" instead of "401"

Was it helpful?

Solution

In ordinary Python code, I'd probably use the lower-level httplib, e.g.:

import httplib

domains = 'google.com gmail.com appspot.com'.split()

for domain in domains:
  conn = httplib.HTTPConnection(domain)
  conn.request('GET', '/')
  resp = conn.getresponse()
  print 'Code %r from %r' % (resp.status, domain)

this will show you such codes as 301 (moved permanently) and 302 (moved temporarily); higher level libraries such as urllib2 would handle such things "behind the scenes" for you, which is handy but makes it harder for you to take control with simplicity (you'd have to install your own "url opener" objects, etc).

In App Engine, you're probably better off using urlfetch, which returns a response object with a status_code attribute. If that attribute is 401, it means that you need to repeat the fetch with the appropriate kind of authorization information in the headers.

However, App Engine now also supports urllib2, so if you're comfortable with using this higher level of abstraction you could delegate the work to it. See here for a tutorial on how to delegate basic authentication to urllib2, and here for a more general tutorial on how basic authentication works (I believe that understanding what's going on at the lower layer of abstraction helps you even if you're using the higher layer!-).

OTHER TIPS

Unless I don't understand fully your question, you can grab the return code from the Response Object using the status_code property.

First, you'll have to issue a fetch() to the URL you want to test.

Most user-oriented sites don't use HTTP authentication, preferring instead to use cookie-based authentication, with HTML forms for signin. If you want to duplicate this in your own code, you need to make an HTTP POST request to the login URL for the application in question, and capture the cookie that's sent back, including that in all your future requests to authenticate yourself. Without more details about the specific site you're trying to authenticate against, it's difficult to be more specific.

You are not getting 401 because that site is not returning 401 but 200 always. Usually type of coding we do for websites is return 200 with a page saying "Please login..blah blah", if site returned anything other then 200 browser will not display the funky error msg.

So in short as i mentioned in other question, you need to look into login page, see what params it uses e.g login=xxx, password=yyy, post it to that page and you will have to manage the cookies too, that is where library like twill etc come into picture.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top