Question

I'm trying to use ElementTree's findall() function to get a list of all <planet> elements with a name subelement <name>Kepler</name>. For example, I want only the first two planets returned in the following xml file:

<planet>
    <name>Kepler</name>
</planet>
<planet>
    <name>Kepler</name>
</planet>
<planet>
    <name>Newton</name>
</planet>

What's an elegant way to do this (other than finding all <planet> elements and looping over them)? I was hoping for something like

root.findall(".//planet/name[text()=='Kepler']")

Any hints?

Solution

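Close! Two things are off in your attempt: XPath compares with =, not ==, and your expression would select the <name> elements rather than the <planet> elements. In full XPath the following is valid (tested in lxml to make sure):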

root.xpath('//planet[name[text()="Kepler"]]')

which, for simple text-only elements like these, is equivalent to:

root.xpath('//planet[name="Kepler"]')

Now, xml.etree only supports a subset of XPath: it rejects the former expression (it complains about an invalid predicate) but accepts the latter. It also has no xpath() method, so use findall() with a relative path:

root.findall('.//planet[name="Kepler"]')
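
For completeness, here is a minimal runnable sketch of that call using only the standard library. The <planets> wrapper element is my assumption, since the snippet in the question has no single root element and would not parse as-is; the variable names are illustrative too.

```python
import xml.etree.ElementTree as ET

# The question's planets, wrapped in an assumed <planets> root so the
# document is well-formed.
XML = """<planets>
  <planet><name>Kepler</name></planet>
  <planet><name>Kepler</name></planet>
  <planet><name>Newton</name></planet>
</planets>"""

root = ET.fromstring(XML)

# [name="Kepler"] keeps only <planet> elements that have a <name> child
# whose text is exactly "Kepler".
keplers = root.findall('.//planet[name="Kepler"]')

# Read back the name of each match to confirm what was selected.
names = [planet.findtext("name") for planet in keplers]
print(len(keplers), names)  # prints: 2 ['Kepler', 'Kepler']
```

The result is a list of the <planet> elements themselves, so you can go on to read other subelements or attributes from each match.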
Licensed under: CC-BY-SA with attribution