Question

I have a stock trading website that is only accessible after logging into the site. After logging in, there is a stock value that I am trying to extract. That number is not readily available and takes a while to load as it is being updated from the company's database.

I am trying to write a script in Ruby that will allow me to extract the number and then use it in my program.

In Firebug, the tag looks like this, but only after the number has loaded:

<span id="ContentPlaceHolderTodaysStock">10,747</span>

I have explored libraries such as hpricot and nokogiri and have tried code similar to the following:

require "nokogiri"
require "open-uri"
doc = Nokogiri::HTML(open("website.com/stocks"))
puts doc.xpath("//span/text()")

I run into two problems: 1) it only reads the HTML from the login page "website.com" instead of "website.com/stocks"; 2) once I do get past the login, how do I read the HTML after the JavaScript has loaded?

I have also tried Watir, which gets me past problem #1, but doing something like the following doesn't help with problem #2 because it returns the original HTML source:

require 'net/http'
source = Net::HTTP.get("website.com/stocks", '/')

Any help in solving this problem would be greatly appreciated. Thank you!


Solution

Since you are able to log in using Watir, you may as well use it to get the text from the page. Watir has built-in methods for waiting for asynchronous components to load; see http://watirwebdriver.com/waiting/.

To get the text, you will want something like:

puts browser.span(:id => 'element_id').when_present.text
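
The text that comes back is a string such as "10,747" (the value shown in the question), so the thousands separator has to be removed before the value can be used as a number. A minimal sketch in plain Ruby follows; `parse_stock_value` is a hypothetical helper name, and it assumes the site formats the value with commas and no decimal part.

```ruby
# Hypothetical helper: converts text such as "10,747" into an Integer.
# Assumes commas are thousands separators and there is no decimal part.
def parse_stock_value(text)
  cleaned = text.strip.delete(",")
  # Integer() raises ArgumentError on unexpected input instead of
  # silently returning 0 the way String#to_i would.
  Integer(cleaned, 10)
end

value = parse_stock_value("10,747")
puts value
```

Raising on unexpected input is a deliberate choice here: if the page returns a placeholder or an empty string, the script fails loudly rather than continuing with a bogus number.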

Other tips

If it's being loaded after-the-fact, it can't be seen by Nokogiri. You'll need to use something like Watir.


once I do get past the login, how do I use the html code after the javascript has loaded?

You can't get there with Nokogiri. The added HTML doesn't exist in Nokogiri's world, since it's given the base HTML via OpenURI. Nokogiri doesn't execute JavaScript.

Watir, on the other hand, can do all that, so it's your only choice. You'll have to figure out how to navigate through the login page, request the stock page, then loop until the text appears, grab it, and do whatever you want with it.
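
The "loop until the text appears" step can be sketched in plain Ruby as a polling helper with a timeout. This is only an illustration under stated assumptions: `poll_for_text` is a hypothetical name, the timeout and interval values are arbitrary, and a small array stands in for the page so the snippet runs without a browser.

```ruby
# Hypothetical helper: calls the block repeatedly until it returns a
# non-empty string, and raises once `timeout` seconds have passed.
def poll_for_text(timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    text = yield
    return text unless text.nil? || text.strip.empty?
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "value did not appear within #{timeout} seconds"
    end
    sleep interval
  end
end

# Stand-in for the page: empty twice, then the value from the question.
# With Watir, the block would instead read the span, for example
# (not run here):
#   span = browser.span(id: "ContentPlaceHolderTodaysStock")
#   span.exists? ? span.text : ""
responses = ["", "", "10,747"]
text = poll_for_text(timeout: 5, interval: 0.01) { responses.shift }
puts text
```

Bounding the loop with a deadline keeps the script from hanging forever if the login failed or the page layout changed.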

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow