Carving elements from XML (xmlstarlet, awk, perl...)

Question

I'm trying to carve out sections from hundreds of XML files. The structure of the XML docs is similar to:

<document>
<nodes>
<node id=123>pages of txt</node>
<node id-=124>more example pages of txt and sub elements</node>
</nodes></document>

I'm just trying to extract all <node> elements. I have been trying to use xmlstarlet:

xmlstarlet sel -t -c "/document/nodes"

The problem is that it only returns </nodes>.

I just need to extract the following examples:

<node id=123>pages of txt</node>
<node id-=124>more example pages of txt and sub elements</node>

Can anyone recommend a better option, tool or approach? Many thanks.

Solution

Your XPath is wrong: it selects the enclosing <nodes> element rather than the <node> elements themselves. Select them directly:

xmlstarlet sel -t -c '//node'

Also, valid XML requires all attribute values to be quoted:

<document>
<nodes>
<node id="123">pages of txt</node>
<node id="124">more example pages of txt and sub elements</node>
</nodes></document>
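
Since the question also asks about other tools, here is a minimal sketch of the same extraction in Python, using only the standard library's xml.etree.ElementTree. It is not part of the original answer. The function name extract_nodes is made up for illustration, and the sample string is the corrected document from above. Like xmlstarlet, it needs well-formed input, so the attribute values must be quoted.

```python
import copy
import xml.etree.ElementTree as ET


def extract_nodes(xml_text):
    """Return every <node> element in xml_text as a serialized string.

    Hypothetical helper for illustration; it mirrors the XPath //node.
    """
    root = ET.fromstring(xml_text)
    fragments = []
    # iter("node") visits every <node> at any depth, in document order.
    for node in root.iter("node"):
        # Serialize a shallow copy without its tail, so text that follows
        # the closing tag (such as a newline) is not included.
        clone = copy.copy(node)
        clone.tail = None
        fragments.append(ET.tostring(clone, encoding="unicode"))
    return fragments


sample = """<document>
<nodes>
<node id="123">pages of txt</node>
<node id="124">more example pages of txt and sub elements</node>
</nodes></document>"""

for fragment in extract_nodes(sample):
    print(fragment)
```

To process many files, ET.parse(path).getroot() could replace ET.fromstring(xml_text) inside a loop over the file names. Sub elements inside each node are kept in the serialized output.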

Licensed under: CC-BY-SA with attribution
