cURL request on a page requiring JavaScript support

https://stackoverflow.com/questions/12303134

30-06-2021
Question

I need to get the HTML source of pinnaclesports.com. The problem is that it detects whether cookies and JS are enabled, and if not, it just returns a page saying:

This site requires JavaScript and Cookies to be enabled. Please change your browser settings or upgrade your browser.

Is there any way to spoof JS support when using cURL?

EDIT: I can use a headless browser, as long as it either runs as a Perl/Ruby module or is written in PHP.

Solution

I figured out that if you make a cookie-less request, the server returns a page that uses JavaScript to set a cookie. That is the page you are getting with curl.

So make another curl call with that cookie set, like this:

curl https://www.pinnaclesports.com/ --cookie "YPF8827340282Jdskjhfiw_928937459182JAX666=122.167.231.139"

In other words, you have to make two calls: 1) make a cookie-less call, read the response, and use a regex to find the cookie name; 2) make a second request with that cookie set. That will solve your problem.
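
The two calls above can be sketched as a small shell script. This is only a sketch under assumptions: the function names extract_cookie and fetch_with_js_cookie are made up for illustration, and it assumes the first response sets the cookie with a double-quoted literal such as document.cookie="NAME=VALUE; path=/". The real page may build the cookie differently, in which case the pattern needs adjusting.

```shell
#!/bin/sh

# Read HTML on stdin and print the first NAME=VALUE pair assigned to
# document.cookie. Assumption: the assignment is a double-quoted literal.
extract_cookie() {
    grep -o 'document\.cookie *= *"[^";]*' | head -n 1 | sed 's/^[^"]*"//'
}

# Fetch a URL in two steps, replaying the cookie that the JavaScript
# on the first response would have set in a real browser.
fetch_with_js_cookie() {
    url=$1

    # Step 1: cookie-less request; the response is the JavaScript stub page.
    cookie=$(curl -s "$url" | extract_cookie)
    if [ -z "$cookie" ]; then
        echo "no document.cookie assignment found in first response" >&2
        return 1
    fi

    # Step 2: repeat the request, sending the extracted cookie.
    curl -s --cookie "$cookie" "$url"
}
```

After sourcing the script, something like fetch_with_js_cookie "https://www.pinnaclesports.com/" > page.html would run both steps. If the site sets more than one cookie or computes the value in script, a regex like this will not be enough and a headless browser is the safer route.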

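Or just use YQL:

select * from html where url="https://www.pinnaclesports.com/"

and point curl at that query. Note that Yahoo retired the public YQL service in 2019, so this option no longer works against Yahoo's own endpoint.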

Other tips

Another suggestion is to set the user agent. This worked for me in a parser for Google Groups:

curl -L -v "https://groups.google.com/d/forum/<GROUP-NAME>" -A "Mozilla/5.0 (compatible;  MSIE 7.01; Windows NT 5.0)"
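
The same idea can be wrapped in a reusable function. This is a rough sketch, not a tested recipe for any particular site: fetch_as_browser is a hypothetical name, the default user agent is simply the one from the command above, and the cookie jar (-b/-c) is an addition of mine on the assumption that the site also wants cookies to persist across redirects.

```shell
#!/bin/sh

# Fetch a URL while presenting a browser-like User-Agent.
#   $1  URL to fetch
#   $2  User-Agent string (optional; defaults to the one used above)
#   $3  cookie jar file (optional; defaults to cookies.txt)
fetch_as_browser() {
    url=$1
    ua=${2:-"Mozilla/5.0 (compatible; MSIE 7.01; Windows NT 5.0)"}
    jar=${3:-cookies.txt}

    # -L follows redirects, -A sets the User-Agent header,
    # -b reads cookies from the jar and -c writes new ones back to it.
    curl -s -L -A "$ua" -b "$jar" -c "$jar" "$url"
}
```

For example, fetch_as_browser "https://groups.google.com/d/forum/<GROUP-NAME>" would repeat the request above with cookies kept in cookies.txt between calls.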

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow