Question

I want to get information from Bugzilla (bugzilla.mozilla.org).

When I wrote the code below,

import httplib
host = 'bugzilla.mozilla.org'

h = httplib.HTTPSConnection(host)
h.putrequest('GET', 'https://bugzilla.mozilla.org/index.cgi')
h.putheader('Accept', 'application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, */*')
h.putheader('User-Agent', "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)")
h.putheader('Host', host)
h.putheader('Connection', 'Keep-Alive')
h.endheaders()

response = h.getresponse()
print response.read()

the server always returns

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://bugzilla.mozilla.org/index.cgi">here</a>.</p>
</body></html>

But this code works fine with other HTTPS servers. Does anybody know what I am doing wrong?

Solution

httplib does not follow redirects (HTTP status 301) on its own; it hands you the 301 response as-is. You could use urllib2 instead, which follows redirects automatically:

from urllib2 import Request, urlopen

req = Request('https://bugzilla.mozilla.org/index.cgi')
req.add_header('Accept', 'application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, */*')
req.add_header('User-Agent', "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.3)")
response = urlopen(req)  # NOTE: this does not check the server's SSL certificate (Python 2 before 2.7.9)
print(response.headers)
content = response.read()
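
To see the difference without depending on the real Bugzilla server, here is a minimal sketch in Python 3, where httplib is named http.client and urllib2 lives in urllib.request. The local test server, the `/old` and `/new` paths, and the names `RedirectHandler` and `demo` are all made up for illustration; it only shows that the low-level connection returns the 301 while `urlopen`-style openers follow it.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old answers 301, anything else answers 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def demo():
    # Port 0 lets the OS pick a free port for the throwaway server.
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Low-level client: the 301 is returned to the caller untouched.
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/old")
        raw = conn.getresponse()
        raw_status, location = raw.status, raw.getheader("Location")
        raw.read()
        conn.close()

        # Higher-level opener: the redirect is followed automatically.
        # The empty ProxyHandler bypasses any proxy set in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open("http://127.0.0.1:%d/old" % port) as resp:
            followed_status, body = resp.status, resp.read()
    finally:
        server.shutdown()
        server.server_close()
    return raw_status, location, followed_status, body


if __name__ == "__main__":
    print(demo())
```

With the low-level client you would have to read the `Location` header yourself and issue a second request; the opener does that step for you.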
Licensed under: CC-BY-SA with attribution