Question

I've been reading about Python's urllib2's ability to open and read directories that are password protected, but even after looking at examples in the docs, and here on StackOverflow, I can't get my script to work.

import urllib2
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm=None,
                    uri='https://webfiles.duke.edu/',
                    user='someUserName',
                    passwd='thisIsntMyRealPassword')
opener = urllib2.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib2.install_opener(opener)
socks = urllib2.urlopen('https://webfiles.duke.edu/?path=/afs/acpub/users/a')
print socks.read()
socks.close()

When I print the contents, it prints the contents of the login screen that the url I'm trying to open will redirect you to. Anyone know why this is?

Was it helpful?

Solution

auth_handler is only for basic HTTP authentication. The site here contains a HTML form, so you'll need to submit your username/password as POST data.

I recommend you using the mechanize module that will simplify the login for you.

Quick example:

import mechanize

browser = mechanize.Browser()

browser.open('https://webfiles.duke.edu/?path=/afs/acpub/users/a')

browser.select_form(nr=0)

browser.form['user'] = 'username'
browser.form['pass'] = 'password'
req = browser.submit()

print req.read()
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top