Question

If I have an XML file like this:

<root>
  <item>
    <prop>something</prop>
  </item>
  <test>
    <prop>something</prop>
  </test>
  <test2>
    <prop>something</prop>
  </test2>
</root>

I can use xmlTree.getroot().findall("item") to get all of the 'item' elements.

How would I get all of the 'item' OR 'test' elements? I want something like:

xmlTree.getroot().findall("item or test")

I didn't see anything like this in the examples in the documentation. Any ideas?

Solution

The standard library's ElementTree supports only a limited subset of XPath, and that subset does not include the union operator |. To use | you need lxml:

from lxml import etree as ET


data = """<?xml version="1.0"?>
<data>
<item>1</item>
<test>2</test>
</data>"""

tree = ET.fromstring(data)

for element in tree.xpath('//item|//test'):
    print(element.text)

prints:

1
2

With xml.etree.ElementTree, you can combine the results of two separate findall() calls:

for element in tree.findall('.//item') + tree.findall('.//test'):
    print(element.text)

Or, check the tag name inside the loop:

for element in tree.iter():
    if element.tag in ('item', 'test'):
        print(element.text)
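
The two ElementTree approaches above differ in ordering: concatenating findall() results groups the matches by tag, while iterating keeps them in document order. Here is a minimal sketch of the second idea applied to the XML from the question. Note that find_any is a hypothetical helper name, not part of ElementTree, and that it only looks at direct children, like findall("item") in the question:

```python
import xml.etree.ElementTree as ET

DATA = """<root>
  <item><prop>something</prop></item>
  <test><prop>something</prop></test>
  <test2><prop>something</prop></test2>
</root>"""


def find_any(parent, tags):
    """Return the direct children of parent whose tag is in tags, in document order.

    Hypothetical helper for illustration; not an ElementTree API.
    """
    wanted = set(tags)
    return [child for child in parent if child.tag in wanted]


root = ET.fromstring(DATA)
matches = find_any(root, ("item", "test"))
print([element.tag for element in matches])  # ['item', 'test']
```

To search the whole tree instead of direct children, iterating over parent.iter() rather than parent should work the same way, though it also visits parent itself.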

Other tips

A "wildcard" solution for large data sets

Here is a solution that does not require spelling out "A | B | ...". Instead, use "*" as a wildcard to select every child element, then drop the unwanted ones by index, as in the code below (in this question, for example, the last tag "test2" can be excluded with lst[:-1]).

import xml.etree.ElementTree as ET
data='''
<root>
  <item>
    <prop>something1</prop>
  </item>
  <test>
    <prop>something2</prop>
  </test>
  <test2>
    <prop>something3</prop>
  </test2>
</root>'''
root = ET.fromstring(data)
lst = root.findall('*')
for x in lst[:-1]:
    print(x.find('prop').text)

OUTPUT:

something1
something2
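
Slicing by index only works when the unwanted elements sit at known positions. If that is not guaranteed, one possible variation is to keep the wildcard but filter by tag name instead. This is only a sketch; the excluded set is an assumption chosen for this example:

```python
import xml.etree.ElementTree as ET

data = """<root>
  <item><prop>something1</prop></item>
  <test><prop>something2</prop></test>
  <test2><prop>something3</prop></test2>
</root>"""

root = ET.fromstring(data)

# Tags to leave out; chosen for this example only.
excluded = {"test2"}

# Select every direct child, then drop the excluded tags regardless of position.
kept = [child for child in root.findall("*") if child.tag not in excluded]
texts = [child.find("prop").text for child in kept]
print(texts)  # ['something1', 'something2']
```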

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow