Question

I want to scrape some pages of this site: Marketbook.ca. I used Mechanize for that, but it does not load the pages properly: it returns a page with an empty body, as in the following code:

require 'mechanize'
agent = Mechanize.new
agent.user_agent_alias = 'Linux Firefox'
agent.get('http://www.marketbook.ca/list/list.aspx?ETID=1&catid=1001&LP=MAT&units=imperial')

What could be the issue here?


Solution

This page requires a JavaScript engine to display its content. Its HTML includes:

<noscript>Please enable JavaScript to view the page content.</noscript>

Mechanize doesn't execute JavaScript, so you'd better choose another option such as Selenium or Watir. Both drive a real web browser.
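As a rough sketch of the browser-driven route, the snippet below loads a page through Selenium and returns the HTML after the scripts have run. It assumes the selenium-webdriver gem and a local Chrome install. The helper names rendered_html and with_headless_chrome are made up for this example, and "the body has some visible text" is only a guessed readiness condition that may need adjusting for this site.

```ruby
# Hypothetical helper: load +url+ in +driver+ and return the page HTML once
# the <body> has visible text, or once +timeout+ seconds have passed.
def rendered_html(driver, url, timeout: 10, poll: 0.2)
  driver.get(url)
  deadline = Time.now + timeout
  loop do
    body_text = driver.find_element(tag_name: 'body').text.strip
    # Stop as soon as the scripts have produced text, or when time runs out.
    break unless body_text.empty? && Time.now < deadline
    sleep poll
  end
  driver.page_source
end

# Hypothetical helper: start headless Chrome, yield the driver, always quit.
# The gem is required here so the file loads even where it is not installed.
def with_headless_chrome
  require 'selenium-webdriver'
  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument('--headless')
  driver = Selenium::WebDriver.for(:chrome, options: options)
  yield driver
ensure
  driver&.quit
end

# Usage (needs Chrome and network access):
#   url = 'http://www.marketbook.ca/list/list.aspx?ETID=1&catid=1001&LP=MAT&units=imperial'
#   html = with_headless_chrome { |driver| rendered_html(driver, url) }
```

The returned HTML can then be parsed with Nokogiri, which Mechanize already depends on.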

Another option is to look through the JS scripts the page includes, figure out where the data comes from, and query that web resource directly, if that's possible.
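To get started with that, it helps to list the external scripts the page references. A minimal sketch using only the standard library follows. The name script_urls is made up for this example, and the regex is a quick inspection aid, not a real HTML parser. If the page parses as HTML, page.search('script[src]') on the Mechanize page should give similar results through Nokogiri.

```ruby
require 'uri'

# Hypothetical helper: return the absolute URLs of all external scripts
# referenced in +html+, resolved against +base_url+, without duplicates.
def script_urls(html, base_url)
  html.scan(/<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/i)
      .flatten
      .map { |src| URI.join(base_url, src).to_s }
      .uniq
end

# Usage with the agent from the question:
#   page = agent.get('http://www.marketbook.ca/list/list.aspx?ETID=1&catid=1001&LP=MAT&units=imperial')
#   puts script_urls(page.body, page.uri.to_s)
```

Reading those scripts, or watching the network tab of the browser's developer tools, may reveal the request that actually returns the listing data. Whether such an endpoint exists and can be queried directly is not confirmed for this site.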

License: CC-BY-SA with attribution
Not affiliated with StackOverflow