Question

Inside the text of an element, an XML parser will get confused if it finds any of the following characters:

'
"
<
>
&

(For example, let's say the name of a company has been retrieved from a database field, and it looks like this: "Smith & Sons".)

The question is: how can you design your XSD to ignore these characters if they are found within an element?


Solution

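You can't make it ignore these characters. An XSD is only applied to a document that has already been parsed, and a document containing an unescaped & or < in element text is not well-formed, so the parser rejects it before the schema is ever consulted.

Strictly speaking, only & and < are never allowed literally in element content; they must be escaped (as &amp; and &lt;) or wrapped in CDATA sections. The > character needs escaping only in the sequence ]]>, and ' and " only inside an attribute value delimited by the same quote character. Unescaped characters like these normally show up in XML only when you build the XML using improper means (namely, string concatenation).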

If you build your XML using proper means (an XML library of some sort), these characters are replaced by their XML-escaped counterparts transparently and no parser will complain.
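As a minimal sketch of what "proper means" can look like, here is the same idea using Python's standard xml.etree.ElementTree module. The element names (company, name) and the variable company_name are made up for illustration; the value stands in for whatever comes out of the database field.

```python
import xml.etree.ElementTree as ET

# Hypothetical value, standing in for the database field.
company_name = "Smith & Sons"

root = ET.Element("company")
name = ET.SubElement(root, "name")
name.text = company_name  # assign the raw string; no manual escaping

# The serializer escapes the ampersand while writing the document.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)  # <company><name>Smith &amp; Sons</name></company>

# Parsing the result gives back the original, unescaped text.
round_tripped = ET.fromstring(xml_text).find("name").text
print(round_tripped)  # Smith & Sons
```

The application code never sees the escaped form: it writes and reads "Smith & Sons", and the escaping exists only in the serialized document.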

OTHER TIPS

The answer is: you don't.

The creator of the XML content should escape these characters or place the text in CDATA sections.

If you extract "Smith & Sons" from the database, it should be escaped when it is inserted into your XML, e.g. the above will become 'Smith &amp; Sons'.

Similarly for the other characters above.
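If you really do have to assemble the markup by hand, the escaping step might look like the sketch below. It uses the escape and quoteattr helpers from Python's standard xml.sax.saxutils module; the sample strings and the company element name are assumptions made for illustration only.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical raw value containing several of the problem characters.
raw = "Smith & Sons <UK>"

# escape() handles &, < and > for use in element text.
escaped_text = escape(raw)
print(escaped_text)  # Smith &amp; Sons &lt;UK&gt;

element = "<company>" + escaped_text + "</company>"

# quoteattr() escapes a value for use as an attribute and adds the
# surrounding quote characters itself.
attribute = quoteattr('Smith & "Sons"')
tag = "<company name=" + attribute + "/>"
```

Note that element text and attribute values need different treatment, which is one more reason to let a library do the serialization.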

How this happens depends on how you build your XML content. If using an API such as DOM, then this should happen automatically. If you're assembling your XML by hand, then you have to worry about this (and other issues like character encoding - which means using an API is the preferable option here).
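For the DOM case, a rough sketch with Python's standard xml.dom.minidom module could look like this. The element name company and the attribute motto are invented for the example. The DOM calls take raw strings, and the serializer takes care of both the escaping and the encoding declaration.

```python
from xml.dom.minidom import Document, parseString

doc = Document()
root = doc.createElement("company")  # hypothetical element name
root.setAttribute("motto", 'Quality & "Care"')  # hypothetical attribute
root.appendChild(doc.createTextNode("Smith & Sons"))
doc.appendChild(root)

# Serialize to UTF-8 bytes; the XML declaration records the encoding.
xml_bytes = doc.toxml(encoding="utf-8")

# Reading the document back returns the original, unescaped values.
parsed = parseString(xml_bytes)
text_back = parsed.documentElement.firstChild.data
attr_back = parsed.documentElement.getAttribute("motto")
```

Here text_back and attr_back hold the same strings that were passed in, even though the bytes in between contain the escaped forms.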

Licensed under: CC-BY-SA with attribution