Question

I need to merge two similar xml files, but only records which match on common tags, e.g.<type> in the following example:

file1.xml is

<node>
    <type>a</type>
    <name>joe</name>
</node>
<node>
    <type>b</type>
    <name>sam</name>
</node>

file2.xml is

<node>
    <type>a</type>
    <name>jill</name>
</node>

so that I have an output of

<node>
    <type>a</type>
    <name>jill</name>
    <name>joe</name>
</node>
<node>
    <type>b</type>
    <name>sam</name>
</node>

What are the basics of doing this, in xsl? Many thanks.

Was it helpful?

Solution

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kElementByType" match="*[not(self::type)]" use="../type"/>
    <xsl:param name="pSource2" select="'file2.xml'"/>
    <xsl:variable name="vSource2" select="document($pSource2,/)"/>
    <xsl:template match="node()|@*" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="type">
        <xsl:variable name="vCurrent" select="."/>
        <xsl:call-template name="identity"/>
        <xsl:for-each select="$vSource2">
            <xsl:apply-templates select="key('kElementByType',$vCurrent)"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

With this input (wellformed):

<root>
    <node>
        <type>a</type>
        <name>joe</name>
    </node>
    <node>
        <type>b</type>
        <name>sam</name>
    </node>
</root>

Output:

<root>
    <node>
        <type>a</type>
        <name>jill</name>
        <name>joe</name>
    </node>
    <node>
        <type>b</type>
        <name>sam</name>
    </node>
</root>

OTHER TIPS

I thought it worth adding some extra info I've learned while doing this, in case it's of use to any other beginners. I've changed my test code names so that they aren't potentially confused with some of the terms used in the xsl. I've no idea if it's the best or most efficient way of doing things, but it works (with a few caveats!).

I wanted to keep the "info" node, and the original code lost it. Coding a separate match template keeps it in the output. Also, the way I coded it, this node is only kept if it is in the input file (x1). If it's in the (x2) file, then it doesn't get kept. This has to be with the way I've written the iterations. Ideally, I'd like to keep it from either input file, but haven't worked out how to do that yet. Also, I'd like to have the option of passing the filename x2 as a parameter, via msxsl, rather than have it hard coded. There surely must be a way of doing this, but I haven't managed to track it down yet.

xsl file:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="kElementByType" match="*[not(self::keynode)]" use="../keynode"/>
    <xsl:param name="pSource2" select="'x2.xml'"/>
    <xsl:variable name="vSource2" select="document($pSource2,/)"/>
    <xsl:template match="node()|@*" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="info">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="keynode">
        <xsl:variable name="vCurrent" select="."/>
        <xsl:call-template name="identity"/>
        <xsl:for-each select="$vSource2">
            <xsl:apply-templates select="key('kElementByType',$vCurrent)"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

So, using the msxls command:

msxsl.exe x1.xml test.xsl -o out.xml

Gives the following results with the data below:

file x1.xml:

<root>
    <info>
        <id>147</id>
    </info>
    <nodetype>
        <keynode>annajon</keynode>
        <note>
        <source>source1</source>
        <name>Anna Jones</name>
        </note>
    </nodetype>
    <nodetype>
        <keynode>brucejon</keynode>
        <note>
        <source>source1</source>
        <name>Bruce Jones</name>
        </note>
    </nodetype>
</root>

file x2.xml:

<root>
    <nodetype>
        <keynode>annajon</keynode>
        <note>
        <source>source2</source>
        <name>Anna Jones</name>
        </note>
    </nodetype>
    <nodetype>
        <keynode>iangore</keynode>
        <note>
        <source>source2</source>
        <name>Ian Gore</name>
        </note>
    </nodetype>
</root>

out.xml:

<?xml version="1.0" encoding="UTF-16"?><root>
    <info>
        <id>147</id>
    </info>
    <nodetype>
        <keynode>annajon</keynode><note>
        <source>source2</source>
        <name>Anna Jones</name>
        </note>
        <note>
        <source>source1</source>
        <name>Anna Jones</name>
        </note>
    </nodetype>
    <nodetype>
        <keynode>brucejon</keynode>
        <note>
        <source>source1</source>
        <name>Bruce Jones</name>
        </note>
    </nodetype>
</root>

One way is to pass second xml as a parameter,

Second easier way is to concatenate both xmls under the one root element to

<root>
    <node>
        <type>a</type>
        <name>joe</name>
    </node>
    <node>
        <type>b</type>
        <name>sam</name>
    </node>
    <node>
        <type>a</type>
        <name>jill</name>
    </node>
</root>

and then do merge it using 2

<xsl:template match="/root">
    <xsl:for-each select="node">
        <xsl:variable name="type" select="type"/>
        <node> 
           <type><xsl:value-of select="$type"/></type>
           <xsl:for-each select="../node[type=$type]">
              <name><xsl:value-of select"name"/></name>
           </xsl:for-each>
       </node>
    </xsl:for-each>
</xsl:template>
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top