Question

I have XML that I get from a third-party application, and its structure looks like this:

<root>
 <id>1</id>
 <data>&lt;node&gt;i like it&lt;node&gt;\n\r
  &lt;node&gt;i like it&lt;node&gt;</data>
</root>

As you can see, there is escaped XML inside <data>. There is also a line break (\n\r) at the end of the first line of <data>, and two spaces at the start of its second line.

Here is my deserialize method:

public static root Deserialize(string xml)
{
    // Serializer is assumed to be a static XmlSerializer for the root type, defined elsewhere.
    // The using statements dispose both readers, even if deserialization throws.
    using (var stringReader = new System.IO.StringReader(xml))
    using (var xmlReader = System.Xml.XmlReader.Create(stringReader))
    {
        return (root)Serializer.Deserialize(xmlReader);
    }
}

After using this method, the value of the data element is:

"&lt;node&gt;i like it&lt;node&gt;\n  &lt;node&gt;i like it&lt;node&gt;"

And now, my questions are:

Why is the \r removed from the data string? Is there a way to remove the newlines and spaces some other way than using simple string.replace();?

Solution

...the value of data element is:

"&lt;node&gt;i like it&lt;node&gt;\n  &lt;node&gt;i like it&lt;node&gt;"

No. The parser unescapes the entity references, so the value actually is

"<node>i like it<node>\n  <node>i like it<node>"

Why is the \r removed from the data string?

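The \r is removed by line-break normalization. An XML parser translates every literal \r\n sequence, and every literal \r that is not followed by \n, into a single \n before it passes the text to the application, so parsed XML content only contains \n line breaks.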
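To see the normalization in isolation, here is a minimal sketch that reads the <data> text with the same XmlReader.Create call the question uses. The class and method names (LineBreakDemo, ReadData) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.IO;
using System.Xml;

public static class LineBreakDemo
{
    // Returns the text of the first <data> element, as the XML reader reports it.
    public static string ReadData(string xml)
    {
        using (var stringReader = new StringReader(xml))
        using (var reader = XmlReader.Create(stringReader))
        {
            if (!reader.ReadToFollowing("data"))
            {
                throw new InvalidOperationException("No <data> element found.");
            }
            // The reader has already normalized line breaks and unescaped entities here.
            return reader.ReadElementContentAsString();
        }
    }
}
```

Passing in a document whose <data> text contains \r\n should give back a string that contains only \n, with &lt; and &gt; already turned into < and >.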

Is there a way to remove the newlines and spaces some other way than using simple string.replace();?

You could regex-replace \n\s* (a line break plus any whitespace that follows it) with the empty string:

Regex.Replace(data, @"\n\s*", String.Empty, RegexOptions.Multiline)
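As a self-contained sketch, the replacement could be wrapped in a small helper. The names WhitespaceCleanup and StripLineBreaks are hypothetical. RegexOptions.Multiline is left out here because it only changes how ^ and $ match, and this pattern uses neither.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceCleanup
{
    // Removes every line break together with the whitespace that follows it.
    public static string StripLineBreaks(string data)
    {
        return Regex.Replace(data, @"\n\s*", String.Empty);
    }
}
```

Applied to the deserialized value from the question, this joins the two lines into one. Note that \s* also swallows tabs and any further blank lines that follow the line break.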
Licensed under: CC-BY-SA with attribution