XPath: how to get the text of an <a> tag inside a <p> tag

https://stackoverflow.com/questions/23276348

html
python
web-scraping
xpath
scrapy

09-07-2023

Question

I'm using Scrapy to extract information from a website and have run into the following issue.

I'm trying to get all the text inside a <p> tag. In some cases the <p> contains not just text but also an <a> tag, and my code stops collecting the text when it reaches that tag.

This is my XPath expression. It works properly when the <p> contains no nested tags:

description = descriptionpath.xpath("span[@itemprop='description']/p/text()").extract()

Solution

Posting Pawel Miech's comment as an answer, since it has helped many of us so far and contains the right answer:

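Tack //text() on the end of the XPath, in place of /text(), to specify that text should be extracted recursively. /text() selects only the text nodes that are direct children of <p>, while //text() selects every descendant text node, including the text inside a nested <a>.

So your XPath would look like this: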

span[@itemprop='description']/p//text()

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow