Question

I am trying to parse the Octane benchmark page, http://octane-benchmark.googlecode.com/svn/latest/index.html, which contains WebElements such as:

<div class="hero-unit" id="inside-anchor">
    <h1 id="main-banner" align="center">Start Octane 2.0</h1>
    <div id="bar-appendix"></div>
</div>

I've started Selenium WebDriver on my tablet device (using Java, Eclipse, and Selendroid):

SelendroidConfiguration config = new SelendroidConfiguration();
selendroidServer = new SelendroidLauncher(config);
selendroidServer.lauchSelendroid();
DesiredCapabilities caps = SelendroidCapabilities.android();
driver = new SelendroidDriver(caps); 

and I've initialized driver with Octane page:

driver.get("http://octane-benchmark.googlecode.com/svn/latest/index.html");

I am trying to parse it with xpath:

String xpathString = "//div[@class='hero-unit']//h1";   
String line = driver.findElement(By.xpath(xpathString)).getText();
System.out.println(line);

but Java throws a NullPointerException (at the assignment to line): findElement() cannot find anything on this HTML page.

The driver starts fine and returns the correct value for getCurrentUrl(), but getPageSource() returns nothing, and findElement(By...) finds nothing for any locator. It looks like this Octane page has something that blocks every search request during parsing. I have parsed 7 other benchmark pages the same way and they worked well, but this Octane page acts as if it were "empty" to WebDriver.

I don't know whether it is because of the

<script type="text/javascript"> 

part, or something else?

Is there something special about this Octane benchmark page?

Thanks...


Solution

XPath works with pages that conform to the XML standard. HTML is more forgiving: you can have missing end tags and other errors, but in XML this is forbidden. So chances were that the HTML does not conform to the XML standard, and I double-checked by validating your link at this site:

http://www.w3schools.com/xml/xml_validator.asp

And guess what? It had some errors. You can save yourself the trouble next time by validating on this site first. Of course, that doesn't mean that every XML-conforming site is suitable for XPath web scraping (hidden elements, JavaScript, etc.). However, from the nature of the reported error you might be able to tell which ones are not.
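
If you would rather run the same kind of check from Java than paste the link into a website, a rough sketch using only the JDK's built-in XML parser could look like the following. The class name XmlWellFormedCheck and the helper isWellFormedXml are made up for this illustration, and the two sample strings are simplified stand-ins, not the real Octane markup. It only tells you whether a string of markup parses as well-formed XML; it says nothing about what the page looks like after its scripts have run.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Selenium or Selendroid.
final class XmlWellFormedCheck {

    // Returns true only if the given markup parses as well-formed XML.
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler ignores warnings and rethrows fatal errors,
            // which keeps the parser from printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Missing end tags, mismatched tags, etc. end up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not a markup problem: the parser could not be set up or read.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Simplified samples (assumptions for this sketch).
        String closed = "<div class=\"hero-unit\"><h1>Start Octane 2.0</h1></div>";
        String unclosed = "<div class=\"hero-unit\"><h1>Start Octane 2.0</div>";

        System.out.println("closed sample well-formed:   " + isWellFormedXml(closed));
        System.out.println("unclosed sample well-formed: " + isWellFormedXml(unclosed));
    }
}
```

To check a real page you would pass in the HTML you downloaded yourself; a page with a DOCTYPE that points to an external DTD may make the parser try to fetch that DTD, so treat this as a quick sanity check rather than a full validator.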

OTHER TIPS

By.xpath() only works if the HTML page conforms to XML standards. The Octane 2.0 page probably does not comply, and hence the method returns null.

Licensed under: CC-BY-SA with attribution