I looked here and here for information on my issue, but with no luck.
I wrote some Python code that is intended to fetch a webpage's source, as shown in Safari's Web Inspector. However, my application returns different source from what Safari's Web Inspector shows. Here is my code so far:
#!/usr/bin/python
import urllib2

# Request headers, chosen to match what Safari sends
hdr = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
}

# Build the request
req = urllib2.Request("https://www.google.com/#q=rainbow&safe=active", headers=hdr)

# Try to fetch the page; on an HTTP error, print the error body instead
try:
    page = urllib2.urlopen(req)
    print page.info()
    # Read the body only when the request succeeded, so `page` is always defined here
    content = page.read()
    print content
except urllib2.HTTPError, e:
    print e.fp.read()
The headers match what is shown in Web Inspector.
The source returned is different, though, for a Google search for "rainbow".
My Python:
http://paste.ubuntu.com/6270549/
Web Inspector:
http://paste.ubuntu.com/6270606/
As far as I can tell, my output is missing a large number of the ubiquitous }catch(e){gbar_._DumpException(e)} lines that are present in the Web Inspector source. Also, my output has only 78 lines, while the Web Inspector source has 235 lines. Does this mean that my code is not getting all of the JavaScript, or some other portion of the webpage? How can I get my code to retrieve the same data as the Web Inspector?
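To make the comparison above less hand-wavy, the two sources could be compared with a rough sketch like the one below. It only counts lines and occurrences of the `}catch(e){gbar_._DumpException(e)}` marker. The helper names `summarize_source` and `compare_sources` are hypothetical, and the two sample strings are made-up stand-ins for the real pastes, not the actual data. It is written so that it should run under both Python 2 and Python 3.

```python
# Hypothetical helpers for comparing two saved page sources.
# MARKER is the snippet mentioned in the question; everything else is illustrative.
MARKER = "}catch(e){gbar_._DumpException(e)}"


def summarize_source(text, marker=MARKER):
    """Return (line_count, marker_count) for a page-source string."""
    return len(text.splitlines()), text.count(marker)


def compare_sources(mine, inspector, marker=MARKER):
    """Return how many more lines and marker hits `inspector` has than `mine`."""
    my_lines, my_hits = summarize_source(mine, marker)
    wi_lines, wi_hits = summarize_source(inspector, marker)
    return {"line_diff": wi_lines - my_lines, "marker_diff": wi_hits - my_hits}


if __name__ == "__main__":
    # Made-up stand-ins for the two pastes (assumption: not the real page sources).
    mine = "<html>\n<script>try{a()}catch(e){gbar_._DumpException(e)}</script>\n</html>"
    inspector = mine + "\n<script>try{b()}catch(e){gbar_._DumpException(e)}</script>"
    print(compare_sources(mine, inspector))
```

With the real data, the two strings would instead be read from the saved pastes, which would confirm whether the 78 versus 235 line gap comes mostly from the missing script blocks.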