Question

Here is a sample XML file:

<PubmedArticleSet>
<PubmedArticle>
 <MedlineCitation Owner="NLM" Status="MEDLINE">
    <PMID Version="1">23458631</PMID>
    <DateCreated>
        <Year>2013</Year>
        <Month>04</Month>
        <Day>08</Day>
    </DateCreated>
    <MeshHeadingList>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Animals</DescriptorName>
        </MeshHeading>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Calcium</DescriptorName>
            <QualifierName MajorTopicYN="Y">metabolism</QualifierName>
        </MeshHeading>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Calcium Chloride</DescriptorName>
            <QualifierName MajorTopicYN="N">administration &amp; dosage</QualifierName>
        </MeshHeading>
     </MeshHeadingList>
 </MedlineCitation>
</PubmedArticle>
<PubmedArticle>
 <MedlineCitation Status="Publisher" Owner="NLM">
    <PMID Version="1">23458629</PMID>
    <DateCreated>
        <Year>2013</Year>
        <Month>3</Month>
        <Day>20</Day>
    </DateCreated>
    <MeshHeadingList>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Adolescent</DescriptorName>
        </MeshHeading>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Adult</DescriptorName>
        </MeshHeading>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Anthropometry</DescriptorName>
        </MeshHeading>
     </MeshHeadingList>
 </MedlineCitation>
</PubmedArticle>
</PubmedArticleSet>

I would like to use XSL to parse the XML file and extract the PMID, the DateCreated, and every DescriptorName with its MajorTopicYN attribute for each article. Then I want to save the result in a CSV file that looks like:

ArticleID|CreatedDate|MeSH|IsMajor
23458631|20130408|Animals|N
23458631|20130408|Calcium|N
23458631|20130408|Calcium Chloride|N
23458629|20130320|Adolescent|N
23458629|20130320|Adult|N
23458629|20130320|Anthropometry|N

Thanks.

Solution

Here is what you want:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>

    <!-- Line separator written after every record. -->
    <xsl:variable name="newline">
        <xsl:text>&#10;</xsl:text>
    </xsl:variable>

    <!-- Default rule: walk the tree without copying any text to the output. -->
    <xsl:template match="@*|node()">
        <xsl:apply-templates select="@*|node()" />
    </xsl:template>

    <!-- Write the header row, then process the whole document. -->
    <xsl:template match="/">
        <xsl:text>ArticleID|CreatedDate|MeSH|IsMajor</xsl:text>
        <xsl:value-of select="$newline" />

        <xsl:apply-templates select="@*|node()" />
    </xsl:template>

    <!-- One output row per DescriptorName. -->
    <xsl:template match="DescriptorName">
        <xsl:value-of select="ancestor::MedlineCitation/PMID" />
        <xsl:text>|</xsl:text>

        <!-- Zero-pad month and day so "3" becomes "03" (YYYYMMDD). -->
        <xsl:value-of select="ancestor::MedlineCitation/DateCreated/Year" />
        <xsl:value-of select="format-number(ancestor::MedlineCitation/DateCreated/Month, '00')" />
        <xsl:value-of select="format-number(ancestor::MedlineCitation/DateCreated/Day, '00')" />
        <xsl:text>|</xsl:text>

        <xsl:value-of select="." />
        <xsl:text>|</xsl:text>

        <xsl:value-of select="@MajorTopicYN" />

        <xsl:value-of select="$newline" />
    </xsl:template>
</xsl:stylesheet>
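
If you want to double-check the stylesheet's output without an XSLT processor, the same extraction can be sketched with Python's standard library. This is not XSLT and not part of the answer above, just a rough cross-check: the name `mesh_rows` and the trimmed `SAMPLE` input are made up for illustration, and the zero-padding of month and day is an assumption taken from the expected output in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample input from the question (hypothetical subset).
SAMPLE = """<PubmedArticleSet>
<PubmedArticle>
 <MedlineCitation Owner="NLM" Status="MEDLINE">
    <PMID Version="1">23458631</PMID>
    <DateCreated><Year>2013</Year><Month>04</Month><Day>08</Day></DateCreated>
    <MeshHeadingList>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Animals</DescriptorName>
        </MeshHeading>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Calcium</DescriptorName>
            <QualifierName MajorTopicYN="Y">metabolism</QualifierName>
        </MeshHeading>
    </MeshHeadingList>
 </MedlineCitation>
</PubmedArticle>
<PubmedArticle>
 <MedlineCitation Status="Publisher" Owner="NLM">
    <PMID Version="1">23458629</PMID>
    <DateCreated><Year>2013</Year><Month>3</Month><Day>20</Day></DateCreated>
    <MeshHeadingList>
        <MeshHeading>
            <DescriptorName MajorTopicYN="N">Adolescent</DescriptorName>
        </MeshHeading>
    </MeshHeadingList>
 </MedlineCitation>
</PubmedArticle>
</PubmedArticleSet>"""


def mesh_rows(xml_text):
    """Return pipe-delimited lines, header first, one per DescriptorName."""
    root = ET.fromstring(xml_text)
    lines = ["ArticleID|CreatedDate|MeSH|IsMajor"]
    for citation in root.iter("MedlineCitation"):
        pmid = citation.findtext("PMID")
        created = citation.find("DateCreated")
        # Right-align month and day in a width of 2, filling with "0".
        date = "{}{:0>2}{:0>2}".format(
            created.findtext("Year"),
            created.findtext("Month"),
            created.findtext("Day"),
        )
        # QualifierName elements are skipped; only descriptors become rows.
        for descriptor in citation.iter("DescriptorName"):
            lines.append("|".join(
                [pmid, date, descriptor.text, descriptor.get("MajorTopicYN")]
            ))
    return lines


if __name__ == "__main__":
    print("\n".join(mesh_rows(SAMPLE)))
```

Comparing these lines with what your XSLT processor writes is a quick way to spot differences in date padding or line endings.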

Good luck with it!

Licensed under: CC-BY-SA with attribution