Question

I know this has been answered before in "XSL: how to copy a tree, but removing some nodes?", but my XML file is more complex and that answer didn't work for me.

XML and XSLT are both new to me, and my boss assigned me the task of transforming one XML file (an OVF file from VMware) into another: deleting some nodes, adding others and updating some information. I have both XML files, and my task is to design the XSLT that will transform one into the other.

Here's the original XML:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Built using IBM Image Construction and Composition Tool, version: 1.2.0.1-20121129-1310-255 on: Oct 18, 2013 12:14:22 -->
<Envelope
    xmlns="http://schemas.dmtf.org/ovf/envelope/1" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1"
    xmlns:cloudburst="http://www.ibm.com/websphere/rainmaker/2009/3" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData"
    xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" cloudburst:name="POSTGRES-9.2.4-RHEL-64.X64.xxx.xxx"
    cloudburst:version="1.0.0" cloudburst:build="sample" cloudburst:serviceLevel="0"
    cloudburst:description="BASEIMAGE FOR POSTGRESQL 9.2.4" cloudburst:symbolicName="POSTGRES-9.2.4-RHEL-64.X64.xxx.xxx">
  <References>
    <File ovf:href="en-US-bundle.msg" ovf:id="en-US-bundle.msg" ovf:size="18526"/>
    <File ovf:href="de-DE-bundle.msg" ovf:id="de-DE-bundle.msg" ovf:size="20687"/>
    <File ovf:href="es-ES-bundle.msg" ovf:id="es-ES-bundle.msg" ovf:size="20364"/>
    <File ovf:href="fr-FR-bundle.msg" ovf:id="fr-FR-bundle.msg" ovf:size="20534"/>
    <File ovf:href="it-IT-bundle.msg" ovf:id="it-IT-bundle.msg" ovf:size="20138"/>
    <File ovf:href="ja-JP-bundle.msg" ovf:id="ja-JP-bundle.msg" ovf:size="23116"/>
    <File ovf:href="ko-KR-bundle.msg" ovf:id="ko-KR-bundle.msg" ovf:size="19114"/>
    <File ovf:href="pt-BR-bundle.msg" ovf:id="pt-BR-bundle.msg" ovf:size="20204"/>
    <File ovf:href="zh-CN-bundle.msg" ovf:id="zh-CN-bundle.msg" ovf:size="16875"/>
    <File ovf:href="zh-TW-bundle.msg" ovf:id="zh-TW-bundle.msg" ovf:size="18395"/>
    <File ovf:href="Automation.topology" ovf:id="Automation.topology" ovf:size="196121"/>
    <File ovf:href="Semantic.topology" ovf:id="Semantic.topology" ovf:size="34496"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis.vmdk"
        ovf:size="3129636864"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_1.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_1.vmdk"
        ovf:size="470930944"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_2.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_2.vmdk"
        ovf:size="597504"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_3.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_3.vmdk"
        ovf:size="8147968"/>
    <File ovf:href="default1382090373335.xml" ovf:id="default1382090373335.xml"
        ovf:size="17914" cloudburst:part2Definition="true"/>
    <File ovf:href="default1382090373335C.xml" ovf:id="default1382090373335C.xml"
        ovf:size="15854" cloudburst:part2Definition="true"/>
  </References>
</Envelope>

(This is only the first parent node; there are more below, but I think that once I know how to do the first part, the rest will be easier.)

It has to look like this:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Built using IBM Image Construction and Composition Tool, version: 1.2.0.1-20121129-1310-255 on: Oct 18, 2013 12:14:22 -->
<Envelope
    xmlns="http://schemas.dmtf.org/ovf/envelope/1" xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1"
    xmlns:cloudburst="http://www.ibm.com/websphere/rainmaker/2009/3" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData"
    xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" cloudburst:name="POSTGRES-9.2.4-RHEL-64.X64.xxx.xxx"
    cloudburst:version="1.0.0" cloudburst:build="sample" cloudburst:serviceLevel="0"
    cloudburst:description="BASEIMAGE FOR POSTGRESQL 9.2.4" cloudburst:symbolicName="POSTGRES-9.2.4-RHEL-64.X64.xxx.xxx">
  <References>
    <File ovf:href="en-US-bundle.msg" ovf:id="en-US-bundle.msg" ovf:size="18526"/>
    <File ovf:href="Automation.topology" ovf:id="Automation.topology" ovf:size="196121"/>
    <File ovf:href="Semantic.topology" ovf:id="Semantic.topology" ovf:size="34496"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis.vmdk"
        ovf:size="3129636864"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_1.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_1.vmdk"
        ovf:size="470930944"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_2.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_2.vmdk"
        ovf:size="597504"/>
    <File ovf:href="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_3.vmdk" ovf:id="RedHat6-4-64-Base-PRB-HARDENEDv1-1-bis_3.vmdk"
        ovf:size="8147968"/>
    <File ovf:href="default1382090373335.xml" ovf:id="default1382090373335.xml"
        ovf:size="17914" cloudburst:part2Definition="true"/>
    <File ovf:href="default1382090373335C.xml" ovf:id="default1382090373335C.xml"
        ovf:size="15854" cloudburst:part2Definition="true"/>
  </References>
</Envelope>

As you can see, I have to select all File nodes whose href contains "bundle" and get rid of them, except for the first one (the en-US one). The XPath I have written to select them is

/Envelope/References/File[contains(@ovf:href, 'bundle')][position()>1]

(I have had trouble with this, I think because of all the namespaces, but I tried it in Altova XMLSpy and it worked flawlessly.)

As I have never programmed with XSL, it's a bit different from all I know (Mostly C, Java, PHP, VB.net...) but I know HTML so the basic structure is known to me.

So, my question is, what would the XSL look like to copy the whole XML but ignore that subset of File nodes?

This stylesheet, which I copied from the SO answer I linked above, didn't work:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions" >

    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="/Envelope/References/File[contains(@href, 'bundle')][position()>1]"/> <!-- this empty template will remove them -->
</xsl:stylesheet>

I think it doesn't matter if I use XSL v1 or v2, actually I don't know the differences between them :D

Thanks

Solution

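It is because of the namespaces. In your input XML, you've defined a default namespace with xmlns="http://schemas.dmtf.org/ovf/envelope/1" and an ovf namespace with xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1". The File elements belong to the default namespace and the href attributes belong to the ovf namespace. These two namespaces happen to have the same URI. Your stylesheet declares neither of them, so the unprefixed names Envelope, References, File and @href in your match pattern refer to names in no namespace and therefore match nothing.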

You need to declare the same namespace in your XSLT, then match elements and attributes using its prefix. (Note that you can choose any prefix you like, as long as the namespace URI it is bound to matches the one in your input. I've used ns below.)

The following stylesheet copies everything and removes every File element whose ovf:href contains "bundle", except the first one.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
                xmlns:ns="http://schemas.dmtf.org/ovf/envelope/1">
  <xsl:output method="xml" indent="yes" />
  <xsl:strip-space elements="*"/>

  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- This empty template matches every "bundle" File except the first
       and outputs nothing for it, which removes it from the result -->
  <xsl:template match="ns:Envelope/ns:References/ns:File[contains(@ns:href, 'bundle')][position()>1]"/>
</xsl:stylesheet>
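
As an alternative, here is a minimal sketch that assumes an XSLT 2.0 processor (Saxon, for example). The xpath-default-namespace attribute makes unprefixed element names in patterns refer to the OVF namespace, so only the attribute still needs a prefix. The prefix ovf is reused from the input purely for readability; any prefix bound to the same URI would do. This variant will not work on an XSLT 1.0 processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1"
                xpath-default-namespace="http://schemas.dmtf.org/ovf/envelope/1">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copies every node and attribute unchanged -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Element names are unprefixed thanks to xpath-default-namespace;
       the default does not apply to attributes, so @ovf:href keeps its prefix.
       Matches every "bundle" File except the first and emits nothing. -->
  <xsl:template match="Envelope/References/File[contains(@ovf:href, 'bundle')][position() &gt; 1]"/>
</xsl:stylesheet>
```

On the question of versions: as far as I can tell, the stylesheet with the ns prefix above uses only XSLT 1.0 features, so it should also run with version="1.0" on a 1.0 processor. Only this xpath-default-namespace variant needs 2.0.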
Licensed under: CC-BY-SA with attribution