Question

On this page, Microsoft says that the XML reserved characters (and their entities) are the following:

>    &gt;
<    &lt;
&    &amp;
%    &#37;

But on this other page, I found that ' is also a reserved character (and its entity is &apos;).

Can someone point me to an official reference that lists all and only the XML reserved characters?

Solution

According to the XML spec, the only characters that must be escaped when used as character content rather than markup are & (as &amp;, &#38; or &#x26;) and < (as &lt;, &#60; or &#x3C;), plus > when it is part of the sequence ]]>. In addition, single quotes must be escaped (typically as &apos;) in single-quoted attribute values, and double quotes (typically as &quot;) in double-quoted attribute values. Finally, any character that cannot be represented in the character encoding used to serialize the document must be escaped as a suitable character reference.

You don't have to escape double quotes in single-quoted attributes or vice versa, but it does no harm if you do.
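
To make the rules concrete, here is a minimal Python sketch of that escaping. The helper names escape_text and escape_attr are made up for this example and are not part of any library. It also escapes & and < inside attribute values, since the spec does not allow them to appear literally there either. It is only a sketch: it does not deal with characters that the output encoding cannot represent.

```python
import xml.etree.ElementTree as ET

def escape_text(s):
    """Minimal escaping for character content (hypothetical helper)."""
    s = s.replace("&", "&amp;")      # must run first so later entities are not re-escaped
    s = s.replace("<", "&lt;")
    s = s.replace("]]>", "]]&gt;")   # > only needs escaping in this sequence
    return s

def escape_attr(s, quote='"'):
    """Minimal escaping for an attribute value delimited by `quote` (hypothetical helper)."""
    s = s.replace("&", "&amp;").replace("<", "&lt;")
    if quote == '"':
        s = s.replace('"', "&quot;")  # only the delimiter in use has to be escaped
    else:
        s = s.replace("'", "&apos;")
    return s

text = 'a < b && "x" > \'y\' ]]> done'
attr = 'say "hi" & it\'s <ok>'

doc = (
    '<r a="' + escape_attr(attr, '"') + '"'
    + " b='" + escape_attr(attr, "'") + "'>"
    + escape_text(text)
    + "</r>"
)

# Parsing the result should give back the original strings unchanged.
root = ET.fromstring(doc)
print(root.text == text, root.get("a") == attr, root.get("b") == attr)
```

In real code you would probably reach for xml.sax.saxutils.escape and quoteattr, or let an XML library serialize the document, rather than hand-rolling this.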

Of course, you may escape every >, " and ' (and any other character) within character content if you want to, without changing the meaning.
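
A quick way to convince yourself of this is to parse the same content written three ways. The sketch below uses Python's standard library; the sample string and variable names are just for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

original = 'a > "b" \'c\''

# 1. Nothing escaped: none of these characters needs it in character content.
minimal = "<r>" + original + "</r>"

# 2. Everything escaped with named entities. escape() handles &, < and > itself;
#    the extra mapping adds the two quote characters.
over = "<r>" + escape(original, {'"': "&quot;", "'": "&apos;"}) + "</r>"

# 3. Everything escaped with numeric character references.
numeric = "<r>a &#62; &#x22;b&#x22; &#39;c&#39;</r>"

texts = [ET.fromstring(d).text for d in (minimal, over, numeric)]
print(all(t == original for t in texts))
```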

OTHER TIPS

XML doesn't have any notion of "reserved characters".

It has predefined entities that represent most of the characters that may (depending on context) have special meaning in an XML document (", <, >, & and ').
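
For illustration, a small Python sketch showing that a parser resolves the five predefined entities (and numeric character references such as &#37;) without any DTD, while another named entity such as &nbsp; is rejected unless it is declared. The variable names are only for this example.

```python
import xml.etree.ElementTree as ET

# The five predefined entities plus one numeric character reference.
root = ET.fromstring("<r>&lt; &gt; &amp; &apos; &quot; &#37;</r>")
decoded = root.text

# &nbsp; is an HTML entity, not one of XML's predefined ones.
try:
    ET.fromstring("<r>&nbsp;</r>")
    nbsp_accepted = True
except ET.ParseError:
    nbsp_accepted = False

print(decoded, nbsp_accepted)
```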

It doesn't have named entities for the space character or = because the places where they have special meaning are places where you can't have data.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow