Question

Imagine I have the following XML file:

<a>before<b>middle</b>after</a>

I want to convert it into something like this:

<a>beforemiddleafter</a>

In other words, I want to take all the child nodes of a given node and move them into its parent, preserving their order. This is like running "mv ./directory/* .", but for XML nodes.

I'd like to do this using Unix command-line tools. I've been trying xmlstarlet, a powerful command-line XML manipulator. I tried something like this, but it doesn't work:

echo "<a>before<b>middle</b>after</a>" | xmlstarlet ed -m "//b/*" ".."

Update: XSLT templates are fine, since they can be called from the command line.

My goal here is to 'remove the links from an XHTML page'; in other words, to replace each link with the contents of the link tag.

Solution

If your actual goal is to remove the links from a web page, then you should use a stylesheet like this, which matches all XHTML <a> elements (I'm assuming you're using XHTML?) and simply applies templates to their content:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="h">

<!-- Don't copy the <a> elements, just process their content -->
<xsl:template match="h:a">
  <xsl:apply-templates />
</xsl:template>

<!-- identity template; copies everything by default -->
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

This stylesheet will deal with a situation where you have something nested within the <a> element that you want to retain, such as:

<p>Here is <a href="....">some <em>linked</em> text</a>.</p>

which you will want to come out as:

<p>Here is some <em>linked</em> text.</p>

And it will deal with the situation where the link is nested within an unexpected element between the usual parent (the <p> element) and the <a> element, such as:

<p>Here is <em>some <a href="...">linked</a> text</em>.</p>

which comes out as:

<p>Here is <em>some linked text</em>.</p>
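To try the stylesheet from a shell, something like the following should work. This is only a sketch: it assumes xsltproc (the libxslt command-line tool) is installed, and the file names strip-links.xsl and page.xhtml are made up for the example. The sample page declares the XHTML namespace, because the h:a template matches only elements in that namespace.

```shell
#!/bin/sh
# Hypothetical file names; assumes xsltproc (libxslt) is on the PATH.
workdir=$(mktemp -d)

# The link-stripping stylesheet from the answer above.
cat > "$workdir/strip-links.xsl" <<'EOF'
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:h="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="h">
<xsl:template match="h:a">
  <xsl:apply-templates />
</xsl:template>
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()" />
  </xsl:copy>
</xsl:template>
</xsl:stylesheet>
EOF

# A small XHTML page containing both link layouts discussed above.
cat > "$workdir/page.xhtml" <<'EOF'
<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p>Here is <a href="...">some <em>linked</em> text</a>.</p>
<p>Here is <em>some <a href="...">linked</a> text</em>.</p>
</body></html>
EOF

# Apply the stylesheet; the links are unwrapped, everything else is copied.
result=$(xsltproc "$workdir/strip-links.xsl" "$workdir/page.xhtml")
printf '%s\n' "$result"
rm -rf "$workdir"
```

With XmlStarlet, the tr subcommand takes the stylesheet and the input file in the same order.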

OTHER TIPS

Example input file (test.xml):

<?xml version="1.0" encoding="UTF-8"?>
<test>
<x>before<y>middle</y>after</x>
<a>before<b>middle</b>after</a>
<a>before<b>middle</b>after</a>
<x>before<y>middle</y>after</x>
<a>before<b>middle</b>after</a>
<embedded>foo<a>before<b>middle</b>after</a>bar</embedded>
</test>

XSLT stylesheet (collapse.xsl):

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="a">
        <xsl:copy>
          <xsl:value-of select="."/>
        </xsl:copy>
      </xsl:template>

    </xsl:stylesheet>

Run with XmlStarlet (the executable is installed as xmlstarlet on many systems and as xml on others) using:

xml tr collapse.xsl test.xml

Produces:

<?xml version="1.0"?>
<test>
<x>before<y>middle</y>after</x>
<a>beforemiddleafter</a>
<a>beforemiddleafter</a>
<x>before<y>middle</y>after</x>
<a>beforemiddleafter</a>
<embedded>foo<a>beforemiddleafter</a>bar</embedded>
</test>

The first template in the stylesheet is the basic identity transformation, which copies the whole input document unchanged. The second template matches the elements you want to 'collapse': it copies the element's tags and inserts the element's string value, that is, the concatenation of all its descendant text nodes. Attributes on a matched element and any markup nested inside it are therefore dropped.
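The effect of taking the string value is easy to check on a slightly richer input. The sketch below assumes xsltproc is installed; the file name nested.xml, the id attribute and the inner <i> element are invented for the illustration. Because xsl:copy on an element does not copy its attributes, and xsl:value-of emits text only, both the attribute and the nested markup disappear.

```shell
#!/bin/sh
# Hypothetical file names and sample input; assumes xsltproc is on the PATH.
workdir=$(mktemp -d)

# Same two templates as collapse.xsl above.
cat > "$workdir/collapse.xsl" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="a">
    <xsl:copy><xsl:value-of select="."/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
EOF

# An <a> with an attribute and two levels of nested markup.
printf '%s\n' '<a id="x">before<b>mid<i>dle</i></b>after</a>' > "$workdir/nested.xml"

# The result keeps the <a> tags and the concatenated text, nothing else.
result=$(xsltproc "$workdir/collapse.xsl" "$workdir/nested.xml")
printf '%s\n' "$result"
rm -rf "$workdir"
```

If the attributes or inner elements need to survive, applying templates to the children (as in the link-removal stylesheet) is the safer choice.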

In XSLT, you could just write:

<xsl:template match="a"><a><xsl:apply-templates /></a></xsl:template>

<xsl:template match="a/b"><xsl:value-of select="."/></xsl:template>

And you'd get:

<a>beforemiddleafter</a>

So if you wanted to do this the easy way you could just create an XSL stylesheet and run your XML file through that.

I realise this isn't what you said you'd like to do (using the Unix command line), however. I don't know anything about Unix, so maybe someone else can fill in the blanks, e.g. some command-line calls that can execute the above.
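One possible way to fill in that blank: wrap the two templates in a complete stylesheet and run it with an XSLT processor from the shell. This is a sketch under assumptions, not the only way to do it: it uses xsltproc, the file names flatten.xsl and input.xml are made up, and the xsl:output line is an addition that suppresses the XML declaration.

```shell
#!/bin/sh
# Hypothetical file names; assumes xsltproc (libxslt) is on the PATH.
workdir=$(mktemp -d)

# The two templates from above inside a full stylesheet.
cat > "$workdir/flatten.xsl" <<'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes"/>
  <xsl:template match="a"><a><xsl:apply-templates /></a></xsl:template>
  <xsl:template match="a/b"><xsl:value-of select="."/></xsl:template>
</xsl:stylesheet>
EOF

# The input from the question.
printf '%s\n' '<a>before<b>middle</b>after</a>' > "$workdir/input.xml"

# Text children of <a> pass through the built-in rules; <b> is replaced by its text.
result=$(xsltproc "$workdir/flatten.xsl" "$workdir/input.xml")
printf '%s\n' "$result"
rm -rf "$workdir"
```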

Using xmlstarlet:

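xmlstr='<a>before<b>middle</b>after</a>'
# For each /a/b, print the string value of its parent <a>; sed keeps only the first line
updatestr="$(echo "$xmlstr" | xmlstarlet sel -T -t -m "/a/b" -v '../.' -n | sed -n '1{p;q;}')"
# Replace the content of <a> with that flattened text
echo "$xmlstr" | xmlstarlet ed -u "/a" -v "$updatestr"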

Have you tried this?

file.xml

<r>
    <a>start<b>middle</b>end</a>
</r>

template.xsl

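<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="/">
    <a><xsl:value-of select="r/a" /></a>
</xsl:template>
</xsl:stylesheet>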

output

<a>startmiddleend</a>
Licensed under: CC-BY-SA with attribution