Xerces-C does not handle Chinese characters properly and writes an XML tag without its end tag

StackOverflow https://stackoverflow.com/questions/19981100

  •  30-07-2022

Question

The input string contains "CDR 2324-5 No Compatibilit頩nt. nж06....tri饠par prodkjdl".

The expected XML is:

<param>
<name>comments</name>
<value>CDR   2324-5  No Compatibilit頩nt. nж06....tri饠par prodkjdl</value>
</param>

I am using the following code to insert the string into the XML:

 DOM_Text newNode = document.createTextNode("");
 newNode.setNodeValue( (const sChar*) value );
 element.appendChild( newNode );

Here 'value' is "CDR 2324-5 No Compatibilit頩nt. nж06....tri饠par prodkjdl ".

The XML is generated, but without the end tag. I get the following error message:

Error: The input ended before all started tags were ended. Last tag started was 'param'.

If I remove those 3 Chinese characters, everything works fine.
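Since the failure depends on a few specific characters, it can help to see exactly which bytes in the input are non-ASCII before the string reaches the DOM. The following is a minimal diagnostic sketch using only the standard library; `nonAsciiOffsets` is a hypothetical helper name, not part of Xerces-C, and it says nothing about which encoding the bytes are actually in.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical diagnostic helper (not a Xerces-C function).
// Returns the offsets of all bytes with the high bit set, i.e. every byte
// that is not plain 7-bit ASCII and therefore depends on the encoding.
std::vector<std::size_t> nonAsciiOffsets(const std::string& input) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < input.size(); ++i) {
        // Cast first: plain char may be signed, which would make the
        // comparison against 0x80 unreliable.
        if (static_cast<unsigned char>(input[i]) >= 0x80) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

Printing the bytes at these offsets in hex on both the Linux and the Windows machine would show whether the two systems are even receiving the same byte sequence.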

I am using Linux RHEL 62.
The Xerces-C version, taken from XercesVersion.hpp, is:

XERCES_VERSION_MAJOR 2
XERCES_VERSION_MINOR 5

The same code works fine on a Windows machine.
I suspect that I may be using a deprecated version or function, but I am not sure.

Please let me know your suggestions.


Solution

This has been fixed. The sChar* cast was not able to handle the Chinese characters, so I converted the string into a DOMString, and now everything works fine.

DOMString val = DOMString( value.c_str() );
DOM_Text newNode = document.createTextNode("");
newNode.setNodeValue( val );
element.appendChild( newNode );

Now I am seeing the proper end tag.
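The fix works because building a DOMString from a char* involves a transcoding step: as far as I know, Xerces-C holds text internally as 16-bit XMLCh (UTF-16) units, and the DOMString(const char*) constructor interprets the bytes in the local code page. The sketch below is not Xerces-C code; it only illustrates what such a step does, under the assumption that the input bytes are UTF-8. `utf8ToUtf16` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Xerces-C function): decodes UTF-8 bytes into
// UTF-16 code units. Throws std::invalid_argument on malformed input.
std::u16string utf8ToUtf16(const std::string& in) {
    // Smallest code point allowed for each sequence length (rejects overlongs).
    static const char32_t minCp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    for (std::size_t i = 0; i < in.size();) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t len = lead < 0x80 ? 1
                        : (lead & 0xE0) == 0xC0 ? 2
                        : (lead & 0xF0) == 0xE0 ? 3
                        : (lead & 0xF8) == 0xF0 ? 4 : 0;
        if (len == 0 || i + len > in.size())
            throw std::invalid_argument("bad UTF-8 lead byte or truncated sequence");
        // Keep only the payload bits of the lead byte.
        char32_t cp = len == 1 ? lead : lead & (0xFF >> (len + 1));
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("bad UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < minCp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

If the bytes are not valid in the encoding the transcoder expects, a step like this has to fail or substitute something, which is one plausible reason the same input can behave differently on Linux and Windows when their default code pages differ.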

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow