Question

I am using xmlTextWriter to create the xml.

writer.WriteStartElement("book"); 
writer.WriteAttributeString("author", "j.k.rowling"); 
writer.WriteAttributeString("year", "1990");
writer.WriteString("&");
writer.WriteEndElement();

But now i need to write '&' but xmlTextWriter will automatically write this one as "&amp"; So is there any work around to do this?

I am creating xml by reading the doc file.So if I read "-" then in xml i need to write "&ndash";.So while writing it's written as "&amp";ndash.

So, for example, if I am trying to write a node with the text good-bad, I actually need to write my XML such as <node>good&ndash;bad</node>. This is a requirement of my project.

Was it helpful?

Solution

In a proper XML file, you cannot have a standalone & character unless it is an escape character. So if you need an XML node to contain good&ndash;bad, then it will have to be encoded as good&amp;ndash;bad. There is no workaround as anything different would not be valid XML. The only way to make it work is to just write the XML file as a plain text how you want it, but then it could not be read by an XML parser as it is not proper XML.

Here's a code example of my suggested workaround (you didn't specify a language, so I am showing you in C#, but Java should have something similar):

using(var sw = new StreamWriter(stream))
{
    // other code to write XML-like data
    sw.WriteLine("<node>good&ndash;bad</node>");
    // other code to write XML-like data
}

As you discovered, another option is to use the WriteRaw() method on XmlTextWriter (in C#) will write an unencoded string, but it does not change the fact it is not going to be a valid XML file when it is done.

But as I mentioned, if you tried to read this with an XML Parser, it would fail because &ndash is not a valid XML character entity so it is not valid XML.

&ndash; is an HTML character entity, so escaping it in an XML should not normally be necessary.

In the XML language, & is the escape character, so &amp; is appropriate string representation of &. You cannot use just a & character because the & character has a special meaning and therefore a single & character would be misinterpreted by the parser/

You will see similar behavior with the <, >, ", and' characters. All have meaning within the XML language so if you need to represent them as text in a document.

Here's a reference to all of the character entities in XML (and HTML) from Wikipedia. Each will always be represented by the escape character and the name (&gt;, &lt;, &quot;, &apos;)

OTHER TIPS

In XML, & must be escaped as &amp;. The & character is reserved for entities and thus not allowed otherwise. Entities are used to escape characters with special meanings in XML.

Another software reading the XML has to decode the entity again. &lt; for < and &gt; for > or other examples, some other languages like HTML which are based on XML provide even more of these.

I think you will need to encode it. Like so:

colTest = "&" 
writer.WriteEncodedText(colTest)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top