Parsing Evernote XML with ElementTree in Python: how to find/delete elements with a specific tag name

https://stackoverflow.com/questions/23331309

10-07-2023

Question

I would like to get all the text associated with en-todo elements in the XML of an Evernote note. Is there a function that returns all elements with a specific tag name under the root in one line?

I have tried root.findall("en-todo"), but it returns nothing.

A typical Evernote note can contain en-todo elements at different levels of nesting. The en-todo tags are often inside a span tag, although sometimes the whole stack of en-todo elements and their text is inside that span tag. This makes parsing quite complicated.
Is there a simple way to delete all span tags from the XML while keeping their children? Can you suggest any other solutions?

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE en-note SYSTEM "http://xml.evernote.com/pub/enml2.dtd">
<en-note>
<div>
<en-todo></en-todo>
<span>one task</span>
</div>

<div><span><br clear="none"/></span></div>

<div><span>
<en-todo></en-todo>
<span>second task</span>
<br clear="none"/>
<span><en-todo></en-todo>third task</span>
<br clear="none"/>
<span><en-todo></en-todo>forth task</span>
<br clear="none"/>
<span><en-todo></en-todo>fifth task</span>
</span></div>
...

Solution

Use an XPath expression:

root.findall('.//en-todo')

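This finds all en-todo elements at any level of nesting. The original call, root.findall("en-todo"), returns nothing because a bare tag name matches only direct children of root, and in this note every en-todo sits deeper, inside div or span elements.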
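
To get the text that goes with each en-todo without deleting the span tags first, one option is to walk the tree in document order and attach every piece of text to the most recent en-todo. The following is a minimal sketch, not a definitive implementation: the helper name todo_texts is hypothetical, the sample is a trimmed copy of the note from the question (XML declaration and DOCTYPE omitted), and it assumes that all text following an en-todo belongs to that task until the next en-todo appears.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the note from the question (declaration and DOCTYPE omitted).
ENML = """<en-note>
<div>
<en-todo></en-todo>
<span>one task</span>
</div>
<div><span>
<en-todo></en-todo>
<span>second task</span>
<br clear="none"/>
<span><en-todo></en-todo>third task</span>
</span></div>
</en-note>"""


def todo_texts(root):
    """Hypothetical helper: collect the text that follows each en-todo.

    Assumption: every non-blank text node after an en-todo belongs to
    that task until the next en-todo is reached.
    """
    tasks = []

    def walk(elem):
        if elem.tag == "en-todo":
            tasks.append([])  # a new task starts here
        if tasks and elem.text and elem.text.strip():
            tasks[-1].append(elem.text.strip())
        for child in elem:
            walk(child)
            # Text placed right after a child's closing tag is its tail.
            if tasks and child.tail and child.tail.strip():
                tasks[-1].append(child.tail.strip())

    walk(root)
    return [" ".join(parts) for parts in tasks]


root = ET.fromstring(ENML)

direct = root.findall("en-todo")     # direct children of <en-note> only
nested = root.findall(".//en-todo")  # descendants at any depth
tasks = todo_texts(root)

print(len(direct), len(nested))  # 0 3
print(tasks)                     # ['one task', 'second task', 'third task']
```

Because the walk ignores which element the text lives in, it handles both layouts from the question: text in a sibling span after the en-todo, and text that directly follows the en-todo inside the same span (stored by ElementTree as the en-todo element's tail). A limitation of this sketch is that any unrelated text after the last en-todo would be appended to the last task.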

Licensed under: CC-BY-SA with attribution
