Question

I am trying to fetch data from a webpage using urllib2. The page loads fine in the browser, but from the script I keep getting HTTPError: HTTP Error 403: Forbidden

I also tried mimicking a browser request by changing the User-Agent string, but without success.

Any ideas on this?

Solution

I tried sending only the User-Agent header using Tamper Data in Firefox, and I still got a 403. Try adding the other headers a browser sends:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive

I tried it with these headers, so this should work.
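As a rough sketch of how those headers could be sent from a script: the example below uses Python 3's urllib.request and falls back to urllib2 on Python 2, where the same Request class lives. The helper name build_request, the "Mozilla/5.0" User-Agent value, and the example.com URL are placeholders for illustration, not something from the original answer, and whether this header set is enough depends on the site.

```python
try:
    from urllib.request import Request  # Python 3
except ImportError:
    from urllib2 import Request  # Python 2

# Header values copied from the list above.
BROWSER_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-us,en;q=0.5",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7",
    "Keep-Alive": "115",
    "Connection": "keep-alive",
}


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request carrying a User-Agent plus the headers above.

    The default user_agent is an assumed placeholder value.
    """
    request = Request(url)
    request.add_header("User-Agent", user_agent)
    for name, value in BROWSER_HEADERS.items():
        request.add_header(name, value)
    return request


# Usage (performs a real network call, so it is left commented out):
#   from urllib.request import urlopen
#   html = urlopen(build_request("http://example.com/")).read()
```

One caveat: urllib's HTTP handler sets Connection: close on outgoing requests itself, so the Keep-Alive and Connection values are unlikely to reach the server as written. The Accept headers are the ones that pass through unchanged.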

OTHER TIPS

The site is checking your User-Agent. Just set it to Internet Explorer:

request.add_header('User-Agent', 'Internet Explorer')

I confirmed that this works with wget, and you get 403 unless you set your user agent to Internet Explorer.

:) I am trying to get quotes from NSE too! As pythonFoo says, you need additional headers. However, the Accept header alone is sufficient. The User-Agent can still say Python (stay true!).
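The answers above disagree about which header the site actually checks, so it may help to test each header set separately. The sketch below is one way to do that. The function name status_for and its opener parameter are invented for this example, and the opener is injectable only so the logic can be tried without touching the network.

```python
try:
    from urllib.request import Request, urlopen  # Python 3
    from urllib.error import HTTPError
except ImportError:
    from urllib2 import Request, urlopen, HTTPError  # Python 2

# Accept value taken from the accepted answer.
ACCEPT = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"


def status_for(url, headers, opener=urlopen):
    """Return the HTTP status code the server gives for these headers."""
    request = Request(url, headers=headers)
    try:
        response = opener(request)
    except HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx; the code is e.g. 403.
        return error.code
    try:
        return response.getcode()
    finally:
        response.close()


# Usage against a real site (network call, left commented out):
#   print(status_for(url, {}))                  # no extra headers
#   print(status_for(url, {"Accept": ACCEPT}))  # Accept only
```

When no User-Agent is supplied, urllib sends its own default of the form Python-urllib/x.y, so the second call tests the "Accept only, honest User-Agent" case described above.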

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow