If I’ve understood you correctly, you have some HTML that looks something like this:
<div>
This is the div we want.
<span class="test1">Span contents</span> Other contents
</div>
<div>
We don't want this div.
<span class="something else">Not this</span> one
</div>
and you want to select the first div, but not the second.
This isn’t possible with CSS (and as far as I can tell isn’t possible with any of the CSS extensions Nokogiri implements), but can be done using XPath.
A simple XPath query that would select the div we want could look like this:
//div[span[@class = 'test1']]
This can be read as “all div elements that have span elements as direct children that have class attributes with the value test1”.
This query only tests the class attribute for an exact match against test1, so it won’t match if the class is something like “test1 otherclass”. To get it to work like CSS, you need to change the test to something like:
[contains(concat(' ', normalize-space(@class), ' '), ' test1 ')]
Additionally, the original query only matches spans that are direct children of the div. If you have spans nested inside other elements that you want to match, you will need to use the descendant axis in your query.
Putting it all together:
//div[descendant::span[contains(concat(' ', normalize-space(@class), ' '), ' test1 ')]]
Which can be read as “all div elements that have a span descendant that is in the test1 class (in the CSS sense)”.
To use this you need to call the xpath method, not the css method:
divs = @page.xpath("//div[descendant::span[contains(concat(' ', normalize-space(@class), ' '), ' test1 ')]]")