PyQuery: Get only text of element, not text of child elements

Question

I have the following HTML:

<h1 class="price">
 <span class="strike">$325.00</span>$295.00
</h1>

I'd like to extract just the $295.00. However, if I simply use PyQuery as follows:

price = pq('h1').text()

I get both prices.

Extracting only direct child text for an element in jQuery looks reasonably complicated - is there a way to do it at all in PyQuery?

Currently I'm extracting the first price separately, then using replace to remove it from the text, which is a bit fiddly.

Thanks for your help.

Solution

I don't think there is a clean way to do that. The closest I've found is this, which keeps only the first price (here doc is the parsed document, e.g. doc = pq(html)):

>>> print(doc('h1').html(doc('h1')('span').outerHtml()))
<h1 class="price"><span class="strike">$325.00</span></h1>

You can use .text() instead of .outerHtml() if you don't want to keep the span tag.

Removing the first price is much easier, and it leaves just the text you want. Note that .remove() modifies the document in place:

>>> print(doc('h1').remove('span'))
<h1 class="price">
  $295.00
</h1>
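
If you would rather not modify the document at all, one option is to read the text nodes straight off the underlying lxml element, which PyQuery exposes by indexing (doc('h1')[0]). The sketch below is a minimal illustration, not a PyQuery built-in: the helper names direct_text and direct_text_from_html are made up for this example. It relies on the ElementTree convention that an element's own leading text is in .text and the text following each child is in that child's .tail.

```python
def direct_text(element):
    """Return only the text that sits directly inside `element`.

    `element` is an lxml/ElementTree-style element: its own leading text is
    in `.text`, and the text after each child is in that child's `.tail`.
    Text inside the children themselves is ignored.
    """
    parts = [element.text or ""]
    parts.extend(child.tail or "" for child in element)
    return "".join(parts).strip()


def direct_text_from_html(html, selector):
    """Hypothetical convenience wrapper: parse `html` with PyQuery and
    return the direct text of the first element matching `selector`."""
    # Imported here so direct_text() itself has no third-party dependency.
    from pyquery import PyQuery as pq

    matches = pq(html)(selector)
    # A PyQuery result behaves like a list of lxml elements.
    return direct_text(matches[0]) if len(matches) else None
```

With the HTML from the question, direct_text_from_html(html, 'h1') should give just the second price, and the parsed document is left untouched.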

Licensed under: CC-BY-SA with attribution
