Non-unicode XML representation
Question
I have XML in which some of the element values contain Unicode characters. Is it possible to represent this in an ANSI encoding?
E.g.
<?xml version="1.0" encoding="utf-8"?>
<xml>
<value>受</value>
</xml>
to
<?xml version="1.0" encoding="Windows-1252"?>
<xml>
<value>&#27544</value>
</xml>
I deserialize the XML and then try to serialize it again with XmlTextWriter, specifying Encoding.Default (which is Windows-1252 on my machine). All the Unicode characters end up as question marks. I'm using VS 2008 with .NET 3.5.
Solution
Okay, I tested it with the following code:
// Requires: using System.IO; using System.Text; using System.Xml; using System.Xml.Linq;
string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><xml><value>受</value></xml>";
XmlWriterSettings settings = new XmlWriterSettings { Encoding = Encoding.Default };
MemoryStream ms = new MemoryStream();
// Disposing the writer flushes its output to the stream before the bytes are read.
using (XmlWriter writer = XmlWriter.Create(ms, settings))
    XElement.Parse(xml).WriteTo(writer);
string value = Encoding.Default.GetString(ms.ToArray());
And it correctly escaped the Unicode character as a numeric character reference:
<?xml version="1.0" encoding="Windows-1252"?><xml><value>&#21463;</value></xml>
I must be doing something wrong somewhere else. Thanks for the help.
OTHER TIPS
If I understand the question, then yes. You just need a ; after the 27544:
<?xml version="1.0" encoding="Windows-1252"?>
<xml>
<value>&#27544;</value>
</xml>
Or are you wondering how to generate this XML programmatically? If so, what language/environment are you working in?
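If the goal is to produce these references yourself in C#, here is a minimal sketch of the idea, not a definitive implementation. The class name CharRefDemo and the method Escape are hypothetical names made up for this illustration. It assumes the target encoding is ASCII-compatible (as Windows-1252 is), so it simply rewrites every non-ASCII character as a decimal reference, even those the encoding could represent directly.

```csharp
using System.Globalization;
using System.Text;

public static class CharRefDemo
{
    // Hypothetical helper (not from the thread): rewrites every non-ASCII
    // character as a decimal numeric character reference such as &#27544;.
    // It does not escape markup characters like < or &, so it should only
    // be applied to text that is already well-formed XML.
    public static string Escape(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < 128)
            {
                // ASCII passes through unchanged.
                sb.Append(c);
                continue;
            }

            // Combine a surrogate pair into a single code point.
            int codePoint = c;
            if (char.IsHighSurrogate(c) && i + 1 < text.Length
                && char.IsLowSurrogate(text[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, text[i + 1]);
                i++;
            }

            sb.Append("&#")
              .Append(codePoint.ToString(CultureInfo.InvariantCulture))
              .Append(';');
        }
        return sb.ToString();
    }
}
```

For example, CharRefDemo.Escape("<value>殘</value>") returns "<value>&#27544;</value>", since 殘 is U+6B98, which is 27544 in decimal. In practice an XML writer configured with the target encoding is usually the better choice, because it also handles markup escaping and the XML declaration.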