Question

I need to get the HTML source of pinnaclesports.com. The problem is it detects whether cookies and JS are enabled and if not, it just returns some page saying

This site requires JavaScript and Cookies to be enabled. Please change your browser settings or upgrade your browser.

Is there any way how to spoof JS support when using cURL?

EDIT: I can use a headless browser that runs either as a Perl/Ruby module or is written in PHP

Was it helpful?

Solution

I figured out that, if you make cookie-less REQUEST a page will be returned , which uses javascript to set cookies, the one which you are getting using the curl.

make another curl call like this

curl https://www.pinnaclesports.com/ --cookie "YPF8827340282Jdskjhfiw_928937459182JAX666=122.167.231.139"

i.e. You have to make 2 calls 1) make cookie less call, read and regex to find cookiename. 2) make 2nd request after setting the cokie name. that will solve your problem.

OR
Just use YQL

select * from html where url="https://www.pinnaclesports.com/" 

point your curl to here

OTHER TIPS

Other sugestion is set the user agent, this solution works for me on parser of the Google Groups:

curl -L -v "https://groups.google.com/d/forum/<GROUP-NAME>" -A "Mozilla/5.0 (compatible;  MSIE 7.01; Windows NT 5.0)"
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top