Question

I have an old Java program that used to scrape data from an HTML page. It worked fine a few years ago, but now when I run it, it returns no data. The page link is:

http://www.batstrading.com/book/ibm/

My Java program still retrieves the HTML table, but the cells are empty. If you open the page in a browser, you can see the data changing dynamically. Why?

The HTML my Java program now gets from the page matches what the browser's "view source" shows, and looks like this:

    <tbody>
      <tr>
        <td class="shares">&nbsp;</td>
        <td class="price">&nbsp;</td>
      </tr>

Instead of data, the cells contain only &nbsp;

How can I fix my code to get the data? To be clear, there is nothing wrong with the Java program itself: it retrieves the same text as the browser's "view source", which contains no data because the page is now populated dynamically. So the question is how to use Java to get data from a dynamic page.

Solution

Scrap the current approach, since the site is now updated via JavaScript. Downloading the HTML alone will not give you the data.

However, a much easier approach than using Selenium or a JS engine is to request the source data that the JavaScript uses to update the page:

http://www.batstrading.com/json/bzx/book/IBM

It's perfectly valid JSON. Request that link with your HTTP client and parse the JSON using Jackson. Because you are reading the same data feed the page itself uses, the results do not depend on the page's HTML layout.
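As a rough illustration of the request step, here is a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (Java 11+). The class and method names (`BatsBookFetcher`, `buildRequest`, `fetchBook`) are made up for this example, the URL is the one quoted above and may no longer be live, and the shape of the returned JSON is not assumed here, so the sketch stops at the raw response body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper class; the names are illustrative only.
final class BatsBookFetcher {

    // Endpoint quoted in the answer above; it may have moved or changed since.
    static final String BASE_URL = "http://www.batstrading.com/json/bzx/book/";

    // Builds a GET request for the JSON feed of one ticker symbol.
    static HttpRequest buildRequest(String symbol) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + symbol.toUpperCase(Locale.ROOT)))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the response body as a string.
    static String fetchBook(HttpClient client, String symbol)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(buildRequest(symbol), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        String json = fetchBook(client, "IBM");
        // With Jackson on the classpath, the body could be parsed like this:
        //   JsonNode root = new ObjectMapper().readTree(json);
        System.out.println(json);
    }
}
```

The string returned by `fetchBook` can be handed to Jackson's `ObjectMapper.readTree` to get a `JsonNode` tree. Which fields to read from that tree depends on the actual response, so inspect it before writing the extraction code.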

Disclaimer: Make sure that what you are doing complies with the Terms of Service of the website you are using. Otherwise you may expose yourself to legal issues.

OTHER TIPS

You can't do this by downloading the page directly. You've got two options here. Personally, I would use CasperJS or Selenium to interact with the JavaScript on the page. Otherwise you have to manually simulate what the JavaScript is doing, which is generally neither long-lasting nor scalable (read: it will break once they change anything about their site).

These tools will emulate a browser and let you wait until certain elements load.

There are a number of other tools of this kind, but I would highly recommend Casper since it's fast, easy to use, and can be called even from within your Java program, since it's just JavaScript. See this for instructions on calling JavaScript from Java.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow