Question

I'm trying to remove all siblings of a given element.

For example, given this etree object:

<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="C">
                    </letter>
                    <letter name="D">
                    </letter>
                    <letter name="G">
                    </letter>
                    <letter name="H">
                    </letter>
                    <letter name="I">
                    </letter>
            </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>

I want to remove all siblings of the G node and get this:

<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="G">
                    </letter>
            </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>

I'd like to do this iteratively, without using xpath or find.

Can you give some tips on how to do it?

This is the code I have written so far:

import xml.etree.ElementTree as etree
data = """

<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="C">
                    </letter>
                    <letter name="D">
                    </letter>
                    <letter name="G">
                    </letter>
                    <letter name="H">
                    </letter>
                    <letter name="I">
                    </letter>
            </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>

"""
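tree = etree.fromstring(data)

# First pass: find the name of the element that has a child named "G".
parent_name = None
for parent in tree.iter():  # getiterator() is deprecated; iter() replaces it
    for child in parent:
        for subchild in child:
            if subchild.attrib.get('name') == "G":
                parent_name = child.attrib.get('name')

# Second pass: remove every child of that element except "G".
for parent in tree.iter():
    if parent_name is not None and parent.attrib.get('name') == parent_name:
        for child in list(parent):  # loop over a copy; removing while looping skips children
            if child.attrib.get('name') == "G":
                print("not this")
            else:
                parent.remove(child)

print(etree.tostring(tree).decode())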

Cheers!


Solution

You are close. Once you find an element named G, you need to go back over its parent's children and remove every one that is not G. Something along these lines should work (it uses plain iteration rather than xpath or find, per your requirements):

def remove(name, value, root):
    """
    Iterates through the children of the @root element and removes those
    whose @name attribute != @value.
    """
    # Loop over a copy: removing from root while iterating over it skips elements.
    for element in list(root):
        if element.attrib.get(name) != value:
            root.remove(element)


def remove_siblings_of(name, value, root):
    """
    Recursively removes from the @root element all elements which (1) do
    not have @name == @value but (2) do have a sibling where @name == @value.
    """
    for element in list(root):
        if element.attrib.get(name) == value:
            remove(name, value, root)  # go back over root's children to remove the siblings
        if len(element):
            remove_siblings_of(name, value, element)
    return root
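
One detail worth watching with this kind of loop: removing children from an element while iterating over that same element can skip some siblings, because the positions shift under the loop. Here is a small sketch of the difference, using a made-up flat <p> element with attribute n purely for illustration:

```python
import xml.etree.ElementTree as etree

# Hypothetical flat example: five children, only "G" should survive.
SAMPLE = '<p><c n="C"/><c n="D"/><c n="G"/><c n="H"/><c n="I"/></p>'

# Unsafe: the element is modified while it is being iterated.
unsafe_parent = etree.fromstring(SAMPLE)
for child in unsafe_parent:
    if child.get("n") != "G":
        unsafe_parent.remove(child)
unsafe_names = [c.get("n") for c in unsafe_parent]  # some siblings are left behind

# Safe: iterate over a snapshot of the children instead.
safe_parent = etree.fromstring(SAMPLE)
for child in list(safe_parent):
    if child.get("n") != "G":
        safe_parent.remove(child)
safe_names = [c.get("n") for c in safe_parent]  # only "G" remains

print(unsafe_names)
print(safe_names)
```

Wrapping the element in list() is the usual way to get that snapshot.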

When you use the latter function on your XML, you get the result you are looking for:

>>> siblings_removed = remove_siblings_of('name', 'G', tree)
>>> print(etree.tostring(siblings_removed).decode())
<xml>
    <letter name="A">
            <letter name="B">
                    <letter name="G">
                    </letter>
                    </letter>
            <letter name="E">
                <letter name="F">
                </letter>
            </letter>
    </letter>
</xml>
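
If "iterative" is meant strictly (no recursion at all), the same idea can probably be written with an explicit stack. This is only a sketch under that assumption: the function name keep_only_among_siblings is made up, the sample XML is a compacted version of the one in the question, and it targets Python 3.

```python
import xml.etree.ElementTree as etree


def keep_only_among_siblings(root, name, value):
    """Wherever a child has attribute name == value, remove that child's siblings."""
    stack = [root]  # explicit stack instead of recursion
    while stack:
        parent = stack.pop()
        children = list(parent)  # snapshot, so removing is safe
        if any(child.get(name) == value for child in children):
            for child in children:
                if child.get(name) != value:
                    parent.remove(child)
        # Descend only into the children that are still attached.
        stack.extend(parent)
    return root


data = (
    '<xml><letter name="A">'
    '<letter name="B"><letter name="C"/><letter name="D"/>'
    '<letter name="G"/><letter name="H"/><letter name="I"/></letter>'
    '<letter name="E"><letter name="F"/></letter>'
    '</letter></xml>'
)
tree = keep_only_among_siblings(etree.fromstring(data), "name", "G")
print(etree.tostring(tree, encoding="unicode"))
```

Each element is visited once, and siblings are only removed under a parent that actually has a matching child, so the E/F branch is left alone.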
Licensed under: CC-BY-SA with attribution