Question

I would like to generate HTML snapshots using Watir, hosted on Heroku. Is this possible?

Google's Full Specification for Making AJAX Applications Crawlable suggests using HtmlUnit; see point #3 of "How do I create an HTML snapshot?".

HtmlUnit is a Java-only headless browser emulator, and unfortunately JRuby is not an option on Heroku, so HtmlUnit is ruled out (to my knowledge).

If you're interested, I have another question open regarding HtmlUnit as a service hosted on Google App Engine: "Making AJAX Applications Crawlable? How to build a simple web service on Google App Engine to produce HTML Snapshots?" I am still waiting on a proven example or answer there.

Solution

No. You need a full desktop environment to run Watir, and Heroku doesn't provide one.

You could use a service such as Amazon EC2 instead.

OTHER TIPS

Yes, you can.

Use Watir with PhantomJS, which is headless:

browser = Watir::Browser.new :phantomjs

To use PhantomJS on Heroku, you'll need to use a Heroku PhantomJS buildpack.
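To tie this back to the original goal of producing an HTML snapshot, here is a minimal sketch of how the PhantomJS-backed browser above might be used. It assumes the watir gem is installed and a PhantomJS binary is on the PATH (on Heroku, presumably supplied by the buildpack). The names html_snapshot and phantomjs_browser are hypothetical helpers made up for this illustration, not part of Watir.

```ruby
# Load a URL in a browser, return the page HTML once it has loaded,
# and always close the browser afterwards, even if navigation fails.
# `browser` can be any object that responds to goto, html and close,
# such as a Watir::Browser.
def html_snapshot(browser, url)
  browser.goto(url)
  browser.html
ensure
  browser.close
end

# Hypothetical helper: builds the headless browser shown in the answer.
# The require is deferred so the gem is only needed when this is called.
def phantomjs_browser
  require 'watir'
  Watir::Browser.new :phantomjs
end
```

A call might look like html_snapshot(phantomjs_browser, 'http://example.com/'), where the URL is a placeholder. Passing the browser in as an argument keeps the snapshot logic independent of PhantomJS, so a different headless driver could be swapped in without changing it.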

Troelskin's answer is incorrect. There are ways to run "headless" browsers with Watir, which do not require a "full desktop environment". Having said that, I do not know which method may be appropriate on Heroku.

Other "headless" automation options (if you are using Ruby) are Mechanize with open-uri, optionally along with Nokogiri.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow