Question

I'm converting DITA maps to PDF using the DITA Open Toolkit 1.7 and RenderX XEP. In the DITA topics, product names are inserted using conrefs. One of my product names is quite long. It caused layout problems when used within tables. Therefore I inserted a soft hyphen into the phrase that is reused via conref:

<ph id="PD_FineReader2Comp">DOXiS4 FineReader2&#xad;Components</ph>

This works nicely in the generated pages, but creates a problem in the bookmarks where a symbol is displayed in place of the soft hyphen.


Obviously, this is an encoding problem. It seems that UTF-8 characters are handled properly in PDF content, but not in PDF bookmarks where, according to the sources I found, some UTF-16 characters can be used (though I did not understand which ones).

The DITA Open Toolkit seems to create bookmarks from topic titles using this code fragment:

        <fo:bookmark>
            <xsl:attribute name="internal-destination">
                <xsl:call-template name="generate-toc-id"/>
            </xsl:attribute>
            <xsl:if test="$bookmarkStyle!='EXPANDED'">
                <xsl:attribute name="starting-state">hide</xsl:attribute>
            </xsl:if>
            <fo:bookmark-title>
                <xsl:value-of select="normalize-space($topicTitle)"/>
            </fo:bookmark-title>
            <xsl:apply-templates mode="bookmark"/>
        </fo:bookmark>

The XSL stylesheet has version 2.0.

I would like to create an override that removes the offending character. How can I do this?

  • Is it possible to properly resolve the encoding problem? (Probably not possible).
  • Are there any XSL functions or attributes which remove whitespace other than space, tab, linefeed, and carriage return?
  • Or do I need special handling for the soft hyphen?

Solution 2

The simple way to do this is to use the translate() function, which can be used to replace certain characters with other characters, or with nothing. It looks like this is the line that outputs the value you want to fix up:

<xsl:value-of select="normalize-space($topicTitle)"/>

So you could simply modify this to:

<xsl:value-of select="translate(normalize-space($topicTitle), '&#xad;', '')"/>

to remove all the soft hyphens. If you would like to replace them with spaces or ordinary hyphens, you could do either of the following, respectively:

<xsl:value-of select="translate(normalize-space($topicTitle), '&#xad;', ' ')"/>
<xsl:value-of select="translate(normalize-space($topicTitle), '&#xad;', '-')"/>
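To try the expression outside the toolkit, a minimal standalone stylesheet along these lines may help. It is only a sketch: the `title` match pattern and the `bookmark-text` wrapper element are hypothetical stand-ins for illustration, not names used by the DITA Open Toolkit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical input: a <title> element stands in for $topicTitle. -->
    <xsl:template match="title">
        <xsl:variable name="topicTitle" select="string(.)"/>
        <!-- Hypothetical wrapper element, used only to make the result visible. -->
        <bookmark-text>
            <!-- The third argument is empty, so every U+00AD is deleted. -->
            <xsl:value-of select="translate(normalize-space($topicTitle), '&#xad;', '')"/>
        </bookmark-text>
    </xsl:template>

</xsl:stylesheet>
```

Run against an input document consisting of `<title>DOXiS4 FineReader2&#xad;Components</title>`, this writes a `bookmark-text` element containing `DOXiS4 FineReader2Components`. In the actual override, only the `select` expression inside `fo:bookmark-title` needs to change.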

OTHER TIPS

Small refinement: if you are using XSLT 2.0, <xsl:sequence> will be more efficient than <xsl:value-of> in this context. In XSLT 2.0 you should generally prefer xsl:sequence over xsl:value-of.
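Applied to the bookmark title from the question, that refinement might look like the fragment below. It is a sketch that assumes the surrounding template and the `$topicTitle` variable from the question's code.

```xml
<fo:bookmark-title>
    <!-- xsl:sequence returns the string as is; the processor turns it into
         text when it is added to the content of fo:bookmark-title. -->
    <xsl:sequence select="translate(normalize-space($topicTitle), '&#xad;', '')"/>
</fo:bookmark-title>
```

If other invisible characters cause the same problem, the XPath 2.0 `replace()` function with the Unicode category escape `\p{Cf}` (format characters, which includes the soft hyphen) is a possible alternative. Which characters fall into that category depends on the Unicode version your processor implements, so treat this as an assumption to verify:

```xml
<fo:bookmark-title>
    <!-- Removes every character in Unicode category Cf, not only U+00AD. -->
    <xsl:sequence select="replace(normalize-space($topicTitle), '\p{Cf}', '')"/>
</fo:bookmark-title>
```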

Licensed under: CC-BY-SA with attribution