Question

I want to transform an XML document which I have parsed with XmlSlurper. The (identical) XML tag names should be replaced with the value of the id attribute; all other attributes should be dropped. Starting from this code:

def xml = """<tag id="root">
            |  <tag id="foo" other="blah" more="meh">
            |    <tag id="bar" other="huh"/>
            |  </tag>
            |</tag>""".stripMargin()

def root = new XmlSlurper().parseText(xml)

// Some magic here.

println groovy.xml.XmlUtil.serialize(root)

I want to get the following:

<root>
  <foo>
    <bar/>
  </foo>
</root>

(I write test assertions on the XML, and want to simplify the structure for them.) I've read Updating XML with XmlSlurper and searched around, but found no way with replaceNode() or replaceBody() to exchange a node while keeping its children.

Was it helpful?

Solution

Adding the 'magic' in to the code in the question gives:

def xml = """<tag id="root">
            |  <tag id="foo" other="blah" more="meh">
            |    <tag id="bar" other="huh"/>
            |  </tag>
            |</tag>""".stripMargin()

def root = new XmlSlurper().parseText(xml)

root.breadthFirst().each { n ->
  n.replaceNode { 
    "${n.@id}"( n.children() )
  }
}

println groovy.xml.XmlUtil.serialize(root)

Which prints:

<?xml version="1.0" encoding="UTF-8"?><root>
  <foo>
    <bar/>
  </foo>
</root>

HOWEVER, this will drop any content in the nodes. To maintain content, we would probably need to use recursion and XmlParser to generate a new doc from the existing one... I'll have a think

More general solution

I think this is more generalised:

import groovy.xml.*

def xml = """<tag id="root">
            |  <tag id="foo" other="blah" more="meh">
            |    <tag id="bar" other="huh">
            |      something
            |    </tag>
            |    <tag id="bar" other="huh">
            |      something else
            |    </tag>
            |    <noid>woo</noid>
            |  </tag>
            |</tag>""".stripMargin()

def root = new XmlParser().parseText( xml )

def munge( builder, node ) {
  if( node instanceof Node && node.children() ) {
    builder."${node.@id ?: node.name()}" {
      node.children().each {
        munge( builder, it )
      }
    }
  }
  else {
    if( node instanceof Node ) {
      "${node.@id ?: node.name()}"()
    }
    else {
      builder.mkp.yield node
    }
  }
}

def w = new StringWriter()
def builder = new MarkupBuilder( w )
munge( builder, root )

println XmlUtil.serialize( w.toString() )

And prints:

<?xml version="1.0" encoding="UTF-8"?><root>
  <foo>
    <bar>something</bar>
    <bar>something else</bar>
    <noid>woo</noid>
  </foo>
</root>

Now passes through nodes with no (or empty) id attributes

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top