Question

I'm having trouble using XPath in RapidMiner. Below is a sample of the HTML I'm trying to pull data from. I can't get the number 7001 or California.

I use //h:span[@class='detail-block']//h:/text() and I can get "Number:". Then I try //h:span[@class='detail-block']/span//h:/text() and get nothing. I tried a bunch of variations of this and still come up with nothing. I can get this to work in Google Sheets with =importXML, but not in RapidMiner.

<div class="information">
<h2 class="underline">Information</h2>
<span class="detail-block"><span class="detail-attribute">Number:&nbsp;</span>         
<span>7001</span></span>
<span class="detail-block"><span class="detail-attribute">Location:&nbsp;</span> <span>California</span></span>

Solution

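I don't see how your "working" example (//h:span[@class='detail-block']//h:/text()) could work. h: is a namespace prefix, so it has to be followed by the name of an element or an attribute. Your second expression also uses span without the prefix, which will not match anything if the elements are in the h: namespace.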

//h:span[@class='detail-block']//text() returns every text node below span[@class='detail-block']: Number: 7001 Location: California

For "Number:" use:
//h:span[@class='detail-block'][1]/h:span[1]/text()

For "7001":
//h:span[@class='detail-block'][1]/h:span[2]//text()

And for "California":

//h:span[@class='detail-block'][2]/h:span[2]//text()
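
To check the structure these expressions rely on outside RapidMiner, here is a minimal sketch in Python using only the standard library. It makes several assumptions that are not in the question: that the h: prefix is bound to the XHTML namespace, that the page has been converted to well-formed XHTML (so &nbsp; is written as &#160; and the closing </div> is added), and the helper name extract_details is made up for illustration. ElementTree supports only a subset of XPath (no text()), so the text is read from the matched elements instead.

```python
import xml.etree.ElementTree as ET

# XHTML version of the sample from the question. Assumptions: the default
# namespace is XHTML, &nbsp; is replaced by &#160; so the XML parser
# accepts it, and the missing </div> is closed.
XHTML = """<div xmlns="http://www.w3.org/1999/xhtml" class="information">
<h2 class="underline">Information</h2>
<span class="detail-block"><span class="detail-attribute">Number:&#160;</span>
<span>7001</span></span>
<span class="detail-block"><span class="detail-attribute">Location:&#160;</span> <span>California</span></span>
</div>"""

# Assumed binding of the h: prefix (XHTML namespace).
NS = {"h": "http://www.w3.org/1999/xhtml"}


def extract_details(xhtml):
    """Return {label: value} for every span with class 'detail-block'."""
    root = ET.fromstring(xhtml)
    details = {}
    # Same first step as //h:span[@class='detail-block']
    for block in root.findall(".//h:span[@class='detail-block']", NS):
        # Child spans: h:span[1] is the label, h:span[2] is the value.
        label, value = block.findall("h:span", NS)
        key = "".join(label.itertext()).replace("\xa0", "").strip().rstrip(":")
        details[key] = "".join(value.itertext()).strip()
    return details


details = extract_details(XHTML)
print(details)
```

Note that every step in the path carries the h: prefix; an unprefixed span would find nothing in a namespaced document like this one.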

Licensed under: CC-BY-SA with attribution