Question

I am trying to split apart various HTML documents based on the following rules:

  1. For any <h3>, create a new document with a file name based on the contents of the <h3> (with some translation to replace undesired characters)
  2. For each <li> that is a child of a following-sibling::ol of the <h3>, create a <step> element that contains the contents of that <li>, including any children
  3. Copy any following-sibling of the <ol> that occurs before the next <ol> to an <info> element that is a following-sibling of the <step> element created in 2.

So, I have the following input document (this may be a bit much; I've tried to simplify it as best I could):

 <body>
<h2><a name="BK_Admin_User_Top" id="BK_Admin_User_Top"></a><b>Page title</b></h2>
<p>Page content.</p>
<p class="Intro">What do you want to do?</p>
<p class="indent-intro hcp1 c1">
  <a href="#Admin_User_New">add a new user</a>
</p>
<p class="indent-intro hcp1 c1">
  <a href="#_Admin_User_Edit" class="hcp2">change an existing user</a>
</p>
<p class="indent-intro hcp1 c1">
  <a href="#_Admin_User_Delete" class="hcp2">delete a user</a>
</p>

<h3><a name="_Admin_User_New" id="_Admin_User_New"></a>Adding a User</h3>
<ol start="1" type="1">
  <li class="p-listLevelOne">
    <p class="listLevelOne">From the 
    <span class="bold">Maintenance</span> folder on the menu tree, click 
    <span class="bold"> Admin Users</span>. </p>
  </li>
</ol>
<p class="SANote hcp3 c2">
<img src="image45.gif" alt="" width="17" height="16" border="0" class="hcp4" /> &#160;If you are a system administrative user, a  Group tab and a  tab for you to enter or select a  group and  for the administrative unit and user precede the  Admin Unit tab for all procedures.</p>
<p class="BGNote">
<i class="hcp3">
<span class="c3">
<span>
  <img src="image45.gif" alt="" width="17" height="16" border="0" class="hcp4" />
</span> &#160;If you are a  group user, a</span>  
<span class="c3">tab for you to enter or select the  for the administrative unit and user precedes the  Admin Unit</span></i> 
<i>
  <span class="c3">tab for all procedures.</span>
</i></p>
<ol start="2" type="1">
  <li class="p-listLevelOne">
    <p class="listLevelOne">Do one of the following:</p>
  </li>
  <li class="p-listSubBullet">
    <p class="listSubBullet">Enter the ID of administrative unit of the user you want to add in the Admin Unit ID field and click 
    <b>NEXT</b>.</p>
  </li>
</ol>

<h3>
<a name="_Admin_User_Edit" id="_Admin_User_Edit"></a>Changing a  User</h3>
<ol>
  <li class="p-listLevelOne">
    <p class="listLevelOne c6">From the 
    <b>Maintenance</b> folder on the menu tree, click 
    <b> Admin Users</b>. The  Admin User Maintenance dialog is displayed with the  Admin Unit tab.</p>
  </li>
</ol>
<p class="SANote c6">
  <span class="hcp8 hcp3 c11">
  <img src="image45.gif" alt="" width="17" height="16" border="0" class="hcp4" /> &#160;If you are a system administrative user, a  Group tab and a  tab for you to enter or select a  group and  for the administrative unit and user precede the  Admin Unit tab for all procedures.</span>
</p>
<p class="BGNote">
<i class="hcp3">
<span class="c3">
<span>
  <img src="image45.gif" alt="" width="17" height="16" border="0" class="hcp4" />
</span> &#160;If you are a  group user, a</span>  
<span class="c3">tab for you to enter or select the  for the administrative unit and user precedes the  Admin Unit</span></i> 
<i>
  <span class="c3">tab for all procedures.</span>
</i></p>
<ol start="2" type="1">
  <li class="p-listLevelOne">
    <p class="listLevelOne">Do one of the following:</p>
  </li>
  <li class="p-listSubBullet">
    <p class="listSubBullet c6">Enter the ID of administrative unit of the user you want to modify in the 
    <b>Admin Unit ID</b> field and click 
    <b>NEXT</b>.</p>
  </li>
  <li class="p-listSubBullet">
    <p class="listSubBullet c6">Click 
    <b>SEARCH</b> to display a list of  administrative units to select from. Select the one you want and click 
    <b>NEXT</b>. You can filter the list by entry in either or both the 
    <b>Admin Unit ID</b> and 
    <b>Admin Unit Name</b> fields.</p>
  </li>
</ol>
<p class="indent hcp9 c7">The  Admin User 
<span class="c12">tab is displayed.</span></p>
<ol start="3" type="1">
  <li class="p-listLevelOne">
    <p class="listLevelOne hcp12 c7">
      <span class="hcp8">Do one of the following:</span>
    </p>
  </li>
  <li class="p-listSubBullet">
    <p class="listSubBullet hcp9 c7">Enter the ID of the user that you want to modify in the 
    <b>User ID</b> field and click 
    <b>EDIT</b>.</p>
  </li>
  <li class="p-listSubBullet">
    <p class="listSubBullet c6">Click 
    <b>SEARCH</b> to display a list of users to select from. Select the one you want and click 
    <b>EDIT</b>. You can filter the list by entry in either or both the 
   User ID and 
    User Name fields.</p>
  </li>
</ol>
<p class="indent hcp9 c7">
<span class="c12">The  Admin User Details</span> 
<span class="c12">tab is displayed with the current values for the selected user.</span></p>
<ol start="4">
  <li class="p-listLevelOne">
    <p class="listLevelOne c6">Change the current data as needed.</p>
  </li>
</ol>
<p class="listLevelOne hcp3 c13">
<span class="c3">
  <img src="image45.gif" alt="" width="17" height="16" border="0" class="hcp4" />
</span> &#160;To reset the user's password, click PASSWORD. &#160;The initial password is system generated and then displayed in the Password field.</p>
<ol start="5">
  <li class="p-listLevelOne">
    <p class="listLevelOne hcp9 c7">If needed at any point, click 
    <b>RESET</b> to return the fields to the last saved values.</p>
  </li>
  <li class="p-listLevelOne">
    <p class="listLevelOne hcp9 c7">Click 
    <b>APPLY</b>. The changes are made to the database.</p>
  </li>
  <li class="p-listLevelOne">
    <p class="listLevelOne hcp9 c7">Click 
    <b>CLOSE</b> to end the maintenance session.</p>
  </li>
</ol>
 </body>

My transformation is:

<?xml version="1.0" encoding="UTF-8"?>

<!-- identity template -->
<xsl:template match="@*|node()" name="identity">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:key name="kFollowing" match="*[preceding-sibling::h3]" 
    use="generate-id(preceding-sibling::h3[1])"/> 

<xsl:template match="h3[preceding::p[@class='Intro']]" priority="10">
    <xsl:variable name="yeti-title">
        <xsl:value-of select="normalize-space(.)"/>                                                    
    </xsl:variable>
    <xsl:variable name="title_filename">
        <xsl:value-of select="translate($yeti-title,' /.()�?','___')"/>
    </xsl:variable>
    <xsl:variable name="newfilename">
        <xsl:value-of select="concat('done\',generate-id(.),'_',$title_filename,'.dita')"/>
    </xsl:variable>
    <xsl:document href="{$newfilename}" method="xml">
        <task>
            <xsl:attribute name="id">
                <xsl:value-of select="$title_filename"/>
            </xsl:attribute>
            <title>
                <xsl:value-of select="$yeti-title"/>
            </title>
            <taskbody>
                <context>              
                    <draft-comment>This topic was automatically generated. Its parent topic is "<xsl:value-of select="/descendant::title[1]"/>."</draft-comment>
                </context>         
                <steps>   
                    <xsl:variable name="h3-id" select="generate-id(.)"/>
                    <xsl:for-each select="li[key('kFollowing',$h3-id)]">
                        <step>
                            <cmd>
                                <xsl:value-of select="."/>
                            </cmd>
                            <xsl:if test="following-sibling::li[1][@class='p-listSubBullet']">
                                <info>
                                    <ul>
                                        <xsl:for-each select="following-sibling::li[@class='p-listSubBullet']">
                                            <li><xsl:value-of select="."/></li>
                                        </xsl:for-each>
                                    </ul>
                                </info>
                            </xsl:if>
                        </step>  
                    </xsl:for-each>
                </steps>
            </taskbody>
        </task>
    </xsl:document>  
</xsl:template>

The desired result is three documents:

The first is already handled by the larger transform:

<?xml version="1.0" encoding="UTF-8"?>
<task>
    <title>Page title</title>
    <taskbody>
        <context>
            <p>What do you want to do?</p>
            <p><xref href="{generated href value}"/></p>
            <p><xref href="{generated href value}"/></p>
            <p><xref href="{generated href value}"/></p>
        </context>
    </taskbody>
</task>

The second document, triggered by the first h3 in the source document:

<task>
    <title>Adding a User</title>
<steps>
    <step><cmd>From the <uicontrol>Maintenance</uicontrol> folder on the 
    menu tree, click <uicontrol> Admin Users</uicontrol>.</cmd>
        <info><note>If you are a system administrative user, a 
            Group tab and a  tab for you to enter or select a  group and  for the
            administrative unit and user precede the  Admin Unit tab for all procedures.</note>
            <note>If you are a  group user, a  
                tab for you to enter or select the  for the administrative unit and 
                user precedes the  Admin Unit tab for all procedures. 
            </note>
     </info></step>
    <step><cmd>Do one of the following:</cmd>
    <info>
        <ul>
            <li>Enter the ID of administrative unit of the user you want to add in the Admin Unit ID field and click  
                <uicontrol>NEXT</uicontrol>.   
            </li>
        </ul>
    </info></step>
</steps>
</task>

And the third, triggered by the second h3.

<?xml version="1.0" encoding="UTF-8"?>
<task>
    <title>Changing a User</title>
    <steps>
        <step><cmd>From the  
        <uicontrol>Maintenance</uicontrol> folder on the menu tree, click  
        <uicontrol> Admin Users</uicontrol>. The  Admin User Maintenance 
            dialog is displayed with the  Admin Unit tab.</cmd>
        <info><note> If you are a system administrative user, a  Group tab and a 
    tab for you to enter or select a  group and  for the administrative unit
    and user precede the  Admin Unit tab for all procedures.</note>
        <note>If you are a  group user, a tab for you to enter or select the 
         for the administrative unit and user precedes the  Admin Unit tab for all procedures.</note></info></step>
        <step><cmd>Do one of the following:</cmd>
        <info>
            <ul>
                <li>Enter the ID of administrative unit of the user you want to modify in the  
                    <uicontrol>Admin Unit ID</uicontrol> field and click  
                    <uicontrol>NEXT</uicontrol>. </li>
                <li>
                    Click  
                    <uicontrol>SEARCH</uicontrol> to display a list of  administrative units to select from. Select the one you want and click  
                    <uicontrol>NEXT</uicontrol>. You can filter the list by entry in either or both the  
                    <uicontrol>Admin Unit ID</uicontrol> and  
                    <uicontrol>Admin Unit Name</uicontrol> fields. 
                </li>
            </ul><p>The  Admin User tab is displayed.</p> 
        </info></step>
        <step><cmd>Do one of the following:</cmd>
            <info><ul>
                <li>Enter the ID of the user that you want to modify in the  
                    <uicontrol>User ID</uicontrol> field and click  
                    <uicontrol>EDIT</uicontrol>.</li>
                <li>Click  
                    <uicontrol>SEARCH</uicontrol> to display a list of users to select from. 
                    Select the one you want and click  
                    <uicontrol>EDIT</uicontrol>. You can filter the list by entry in either or both the  
                    User ID and  
                    User Name fields.</li>
            </ul>
                <p> The  Admin User Details 
                    tab is displayed with the current values for the selected user.</p></info>
        </step>
        <step><cmd>Change the current data as needed.</cmd>
            <note>To reset the user's password, click PASSWORD. The initial password is system generated and then displayed in the Password field.</note></step>
        <step><cmd>If needed at any point, click  
            <uicontrol>RESET</uicontrol> to return the fields to the last saved values.</cmd>            
    </step>
    <step><cmd>
        Click  
        <uicontrol>APPLY</uicontrol>. The changes are made to the database.
    </cmd></step>
        <step><cmd>Click  
            <uicontrol>CLOSE</uicontrol> to end the maintenance session.</cmd></step></steps>
</task>

Any help is much appreciated!


Solution

I was able to produce the desired result documents by manually wrapping each h3 and its "child" content in a div and then applying the following transformation. Some other miscellaneous templates take care of smaller details, but the core of the problem is solved by this.

<xsl:stylesheet version="1.1" 
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:key name="kFollowing" match="p[contains(./@class,'Note')]|ul|table" 
 use="generate-id(preceding::p[@class='listLevelOne'][1])"/> 

<xsl:template match="div[child::ol]" priority="10">
  <xsl:variable name="yeti-title">
    <xsl:value-of select="normalize-space(child::h3)"/>                                                    
  </xsl:variable>
  <xsl:variable name="title_filename">
    <xsl:value-of select="translate($yeti-title,' /.()�?','___')"/>
  </xsl:variable>
  <xsl:variable name="newfilename">
    <xsl:value-of select="concat('done\',generate-id(.),'_',$title_filename,'.dita')"/>
  </xsl:variable>
  <xsl:document href="{$newfilename}" method="xml">
    <task>
      <xsl:attribute name="id">
        <xsl:value-of select="$title_filename"/>
      </xsl:attribute>
      <title>
        <xsl:value-of select="$yeti-title"/>
      </title>
      <taskbody>
        <context>              
          <draft-comment>This topic was automatically generated. Its parent topic is "<xsl:value-of select="/descendant::title[1]"/>."</draft-comment>
        </context>  
        <steps>   
          <xsl:for-each select="descendant::p[@class='listLevelOne']">
            <xsl:variable name="ol_id">
              <xsl:value-of select="generate-id()"/>
            </xsl:variable>
            <step>
              <cmd>
                <xsl:apply-templates/>
              </cmd>
                <info>
                  <xsl:apply-templates select="key('kFollowing',$ol_id)"/>
                </info>
            </step>  
          </xsl:for-each>
        </steps>   
      </taskbody>
    </task>  
  </xsl:document>  
</xsl:template>

<!-- other miscellaneous templates omitted -->

</xsl:stylesheet>
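
The manual wrapping step could probably be automated with a small pre-pass. The following is only a sketch, not part of the original solution: it assumes every h3 is a direct child of body, and the key name kSection is made up for this example. It wraps each h3, together with the sibling elements that follow it up to the next h3, in a div, which is the shape the div[child::ol] template expects.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index each non-h3 child of body by the id of its nearest preceding h3.
       (kSection is a hypothetical name, not taken from the original.) -->
  <xsl:key name="kSection"
           match="body/*[not(self::h3)][preceding-sibling::h3]"
           use="generate-id(preceding-sibling::h3[1])"/>

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Content before the first h3 passes through unchanged. -->
      <xsl:apply-templates
          select="node()[not(self::h3)][not(preceding-sibling::h3)]"/>
      <!-- Each h3 and the elements it "owns" go into one div. -->
      <xsl:for-each select="h3">
        <div>
          <xsl:apply-templates select=". | key('kSection', generate-id())"/>
        </div>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Running this first and feeding its output to the transformation above would replace the hand editing. Whitespace-only text nodes between the grouped elements are dropped, because the key indexes elements only.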

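The miscellaneous templates are not shown. Judging only from the desired output in the question, which uses uicontrol and note, they might look something like the sketch below. Every mapping here is a guess based on that output, not the templates actually used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed mapping: bold UI labels become uicontrol. -->
  <xsl:template match="b | span[@class='bold']">
    <uicontrol><xsl:apply-templates/></uicontrol>
  </xsl:template>

  <!-- Assumed mapping: paragraphs whose class contains 'Note' become note. -->
  <xsl:template match="p[contains(@class,'Note')]">
    <note><xsl:apply-templates/></note>
  </xsl:template>

  <!-- Drop the note icon images. -->
  <xsl:template match="img"/>

  <!-- Unwrap purely presentational wrappers, keeping their text. -->
  <xsl:template match="i | span">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

The span[@class='bold'] pattern has a higher default priority than the plain span pattern, so bold spans become uicontrol rather than being unwrapped.
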
Licensed under: CC-BY-SA with attribution