Nokogiri: Finding all tags in a direct path, not including arbitrary levels of nesting

StackOverflow https://stackoverflow.com/questions/23622213

  •  21-07-2023

Question

Say I have an HTML document like:

<div id='findMe'>
  <table>
    <tr>
      <td>
        <p>
          <a href="bad">bad</a>
        </p>
      </td>
    </tr>
  </table>
  <p>
    This is some text and this is a <a href="good">link</a>
  </p>
</div>

I want to capture all links inside the div #findMe that sit directly inside its paragraph tags, but not those nested inside the table or any other tags. So, I want the one labeled "good", but not the one labeled "bad". I'm trying:

Nokogiri::HTML(html).css('#findMe p a')

but that captures both links. I also tried a more explicit XPath:

Nokogiri::HTML(html).css('#findMe').xpath('//p/a')

But that's doing the same thing. How can I tell Nokogiri to only search a specific path down the tree?


Solution

Use > (the child combinator) in CSS to select only immediate children rather than descendants at any depth:

Nokogiri::HTML(html).css('#findMe > p > a')

Or use a single / in XPath, which steps only to direct children:

Nokogiri::HTML(html).xpath("//div[@id='findMe']/p/a")

Other tips

I figured out a way to do it, but I'm still not too comfortable with XPath, so if this isn't the best way, feel free to post a more canonical one.

Nokogiri::HTML(html).css('#findMe').xpath('//div/p/a')
Licensed under: CC-BY-SA with attribution