Question

When I transform some XML (DITA) documents, special characters cause problems, specifically the ampersand and the "less than" character (<). Taking the ampersand as the example: it is repeated several times in the output. However, I discovered that if I replace &amp; with &#38; directly in the XML content, it works fine, with no repetitions.

I don't know what is causing this, but what I want to do is declare the ampersand entity in the DTD so that it expands to &#38;. From searching, it seems this is how to do it:

<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd" [
 <!ENTITY amp "&#38;#38;">
]>

For simplicity, I'm showing an inline entity declaration here, but I also tried declaring it in the actual DTD (concept.dtd). Neither works. The declaration doesn't even seem to take effect: I tried <!ENTITY amp "TEST"> just to see whether it did anything at all, and it didn't.

So how do I do this? I just want &amp; to be replaced by &#38; so that I don't have to manually replace every occurrence in every document.


Solution 2

I got the answer from Oxygen support. I'm not sure this will help anyone, since it's a rather unusual situation, but the xercesImpl.jar library from the custom DITA OT had to be included in the transformation scenario. Then it worked.

OTHER TIPS

If you have an XML processing pipeline that does the right thing with &#38; and the wrong thing with &amp;, then you have a broken XML processing pipeline. Something in your code is bungling the ampersands. You should fix the code rather than trying to work around it by modifying your XML documents.
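
One way to check this claim outside the DITA toolchain is to run both forms through a conforming parser and compare the results. The sketch below uses Python's standard-library xml.etree.ElementTree purely as a stand-in parser (it is not part of the DITA OT or Oxygen setup described in the question), and the sample element and text are made up for illustration. Both spellings should produce the same character data.

```python
import xml.etree.ElementTree as ET


def text_of(xml_source: str) -> str:
    """Parse a one-element document and return its character data."""
    return ET.fromstring(xml_source).text


# Same content, once with the predefined named entities and once with
# numeric character references. The <p> element is a hypothetical sample.
named = text_of("<p>Tom &amp; Jerry &lt; 3</p>")
numeric = text_of("<p>Tom &#38; Jerry &#60; 3</p>")

# A conforming parser hands the application identical text in both cases.
print(named == numeric)
print(repr(named))
```

If a pipeline treats the two inputs differently, the difference comes from a later stage (or a misconfigured parser), not from the XML itself, which is consistent with the fix ending up in the transformation setup rather than in the documents.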

Licensed under: CC-BY-SA with attribution