After a lot of research I managed to find the reason behind this.
First of all, the site uses so-called NTLM authentication, which is not supported by mechanize. The following command helps to find out the authentication mechanism of a site (look at the WWW-Authenticate header in the server response it prints):
wget -O /dev/null -S http://www.the-site.com/
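The headers printed by wget can also be reduced to the bare scheme names in Python. This is only a minimal sketch: `auth_schemes` is a hypothetical helper (not part of mechanize or python-ntlm), and it assumes each WWW-Authenticate header line carries a single challenge.

```python
def auth_schemes(header_values):
    """Return the scheme names found in WWW-Authenticate header values.

    Assumption: every value holds exactly one challenge, so the scheme
    is simply the first whitespace-separated token of the value.
    """
    schemes = []
    for value in header_values:
        parts = value.strip().split(None, 1)
        if parts and parts[0] not in schemes:
            schemes.append(parts[0])
    return schemes


# Example header values as a server offering NTLM might send them
example = ["Negotiate", "NTLM"]
print(auth_schemes(example))
```

If "NTLM" shows up in the result, the site needs an NTLM-capable handler like the one used in the code that follows.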
So the code was modified a little bit:
import sys
import urllib2
import mechanize
from ntlm import HTTPNtlmAuthHandler
print("LOGIN...")
user = sys.argv[1]
password = sys.argv[2]
url = sys.argv[3]
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
browser = mechanize.Browser()
handlersToKeep = []
for handler in browser.handlers:
    if not isinstance(handler,
                      (mechanize._http.HTTPRobotRulesProcessor)):
        handlersToKeep.append(handler)
browser.handlers = handlersToKeep
browser.add_handler(auth_NTLM)
response = browser.open(url)
response = browser.open("http://www.the-site.com")
print(response.read())
Finally, mechanize itself needs to be patched, as mentioned here:
--- _response.py.old 2013-02-06 11:14:33.208385467 +0100
+++ _response.py 2013-02-06 11:21:41.884081708 +0100
@@ -350,8 +350,13 @@
             self.fileno = self.fp.fileno
         else:
             self.fileno = lambda: None
-        self.__iter__ = self.fp.__iter__
-        self.next = self.fp.next
+
+        if hasattr(self.fp, "__iter__"):
+            self.__iter__ = self.fp.__iter__
+            self.next = self.fp.next
+        else:
+            self.__iter__ = lambda self: self
+            self.next = lambda self: self.fp.readline()
 
     def __repr__(self):
         return '<%s at %s whose fp = %r>' % (
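The idea behind the patch is that the underlying file object may offer readline() without being iterable, so iteration has to fall back to readline(). The sketch below shows that fallback in isolation. It is not mechanize code: `ReadlineOnly` and `LineIterator` are hypothetical names made up for illustration, and unlike the patch the iterator here stops explicitly when readline() returns an empty string.

```python
class ReadlineOnly(object):
    """Hypothetical stand-in for a body object with readline() but no __iter__."""

    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        # Like a file object: return "" once the data is exhausted
        return self._lines.pop(0) if self._lines else ""


class LineIterator(object):
    """Iterate over any object that has readline(), line by line."""

    def __init__(self, fp):
        self.fp = fp

    def __iter__(self):
        return self

    def __next__(self):
        line = self.fp.readline()
        if not line:
            raise StopIteration
        return line

    next = __next__  # Python 2 spelling of the same method


lines = list(LineIterator(ReadlineOnly(["first\n", "second\n"])))
print(lines)
```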