Question

I'd like to extract a set of numbers from a list. The list items are identified by headings within them, and appear in a random order. I'd like to assign (in the first list item, for example) the number (1,287,498) to the heading contained in the same <li> tag (Russia).

<ul>

<li> 
<h6>Russia</h6> 
<p>Red</p> 
<p>1,287,498</p>
</li>

<li> 
<h6>USA</h6> 
<p>Blue</p>
<p>782,2378,223</p>
</li>

...etc.

</ul>

I want something like this:

russia = 1,287,498

I tried using XPath, but it didn't work, presumably because the order is random; it kept assigning wrong variables.


Solution

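# doc is assumed to be an already-parsed Nokogiri document
lis = doc.xpath('//li') # Or however you are getting the right elements

# Map each <li> to a [heading, last paragraph text] pair, then build a Hash
data = lis.map { |li| [li.at('h6').text, li.search('p').last.text] }.to_h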
#=> {
#=>   "Russia" => "1,287,498",
#=>   "USA"    => "782,2378,223",
#=> }

p data["Russia"]
#=> "1,287,498"

This answer assumes that the <p> with the number is the last paragraph in the <li>. It's hard to know if this is correct as you have not shared a full document.

OTHER TIPS

You could also use the CSS sibling combinators, ~ (any following sibling) or + (the immediately following sibling):

doc.at('h6[text()=Russia] ~ p:last').text
#=> "1,287,498"
doc.at('h6[text()=Russia] + p + p').text
#=> "1,287,498"
Licensed under: CC-BY-SA with attribution