Question

I want to scrape some pages from this site: Marketbook.ca. I used Mechanize for that, but it does not load the pages properly: it returns a page with an empty body, as in the following code:

require 'mechanize'
agent = Mechanize.new
agent.user_agent_alias = 'Linux Firefox'
agent.get('http://www.marketbook.ca/list/list.aspx?ETID=1&catid=1001&LP=MAT&units=imperial')

What could be the issue here?

Solution

This page requires a JavaScript engine to display its content, as its own markup shows:

<noscript>Please enable JavaScript to view the page content.</noscript>

Mechanize doesn't execute JavaScript, so you'd be better off with another tool such as Selenium or Watir. Both drive a real web browser.
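
As a rough illustration of the Selenium route, here is a minimal sketch. It assumes the selenium-webdriver gem, Firefox and geckodriver are installed. The helper name fetch_rendered_html and the 15-second timeout are made up for this example, and waiting for document.readyState may not be enough if the site loads its listings later via AJAX.

```ruby
# Sketch only: assumes the selenium-webdriver gem, Firefox and geckodriver
# are installed. fetch_rendered_html is a hypothetical helper name.
def fetch_rendered_html(url, timeout: 15)
  require 'selenium-webdriver' # loaded lazily, only when the helper is called

  options = Selenium::WebDriver::Firefox::Options.new(args: ['-headless'])
  driver = Selenium::WebDriver.for(:firefox, options: options)
  begin
    driver.get(url)
    # Wait until the browser reports the document as fully loaded.
    wait = Selenium::WebDriver::Wait.new(timeout: timeout)
    wait.until { driver.execute_script('return document.readyState') == 'complete' }
    driver.page_source # HTML after the browser has run the page's scripts
  ensure
    driver.quit # always close the browser, even if the wait times out
  end
end

# Usage (starts a real headless browser):
# html = fetch_rendered_html('http://www.marketbook.ca/list/list.aspx?ETID=1&catid=1001&LP=MAT&units=imperial')
```

The returned string can then be parsed with whatever HTML parser you already use.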

Another option is to look through the JS scripts the page includes, figure out where the data comes from, and query that web resource directly if possible.
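
To start on that second route, you could list the external scripts the page references and then read through them. The sketch below uses only the standard library. The helper name script_urls and the sample markup (including the /js/list.js and cdn.example.com paths) are hypothetical, and a regex is only a crude stand-in for a real HTML parser. With Mechanize, the HTML string would come from page.body.

```ruby
require 'uri'

# Return the absolute URLs of external scripts referenced by an HTML document.
# script_urls is a hypothetical helper; a regex is a crude substitute for a parser.
def script_urls(html, base_url)
  html.scan(/<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/i)
      .flatten
      .map { |src| URI.join(base_url, src).to_s } # resolve relative paths
      .uniq
end

# Made-up markup for illustration; the real page's scripts will differ.
sample = <<~HTML
  <html><head>
    <script src="/js/list.js"></script>
    <script type="text/javascript" src="http://cdn.example.com/lib.js"></script>
    <script>var inline = true;</script>
  </head><body>
    <noscript>Please enable JavaScript to view the page content.</noscript>
  </body></html>
HTML

puts script_urls(sample, 'http://www.marketbook.ca/list/list.aspx')
```

Inline scripts have no src attribute and are skipped here, so check those in the page source as well when looking for the data endpoint.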

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow