How do I turn a deserialized XML object (.NET) into a single collection of dot separated named key values?

StackOverflow https://stackoverflow.com/questions/19106509

  •  29-06-2022

Question

To start, I am constrained to .NET 2.0 so LINQ is not an option for me (though I would be curious to see a LINQ solution as fodder for pushing to move to .NET 3.5 for the project if it is easy).

I have an XSD that is turned into a set of C# classes via xsd.exe at build time. At runtime, an XML file is loaded and deserialized into the C# classes (validation occurs at this time). I need to then turn that in-memory configuration object (including the default values that were populated during import of the XML file) into a dictionary of key value pairs.

I would like the dictionary key to be a dot separated path to the value. Attribute values and element text would be considered values, everything else along the way a key into that.

As an example, imagine the following XML file:

<rootNode>
    <foo enabled="true"/>
    <bar enabled="false" myAttribute="5.6">
        <baz>Some Text</baz>
        <baz>Some other text.</baz>
    </bar>
</rootNode>

would turn into a dictionary with keys like:

"rootNode.foo.enabled" = (Boolean)true
"rootNode.bar.enabled" = (Boolean)false
"rootNode.bar.myAttribute" = (Float)5.6
"rootNode.bar.baz" = List<String> { "Some Text", "Some other text." }

Things of note: rootNode has no entry of its own, not because it is special but because it has no text or attributes. Also, the dictionary is a dictionary of objects which are typed appropriately (this is already done in deserialization, which is one of the reasons I would like to work with the C# object rather than the XML directly).

Interestingly, the objects created by xsd.exe are already really close to the form I want. The class names are things like rootNodeFoo with a float field on it called myAttribute.

One thing I have considered, but am not sure how to go about, is using reflection to iterate over the object tree, using the class name of each object to figure out the name of the node (I may have to tweak the casing a bit). The problem is that this feels like the wrong solution, since I already have access to a deserializer that should be able to do all of that for me, and much faster.

Another option would be using XSLT to serialize the data directly into the format I want. The problem here is that my XSLT knowledge is limited and I believe (correct me if I am wrong) I will lose typing on the way (everything will be a string), so I will essentially have to deserialize once again by hand to get the types back out (and this time without the XSD validation that I get when I use the .NET deserializer).

In case it matters, the calls I am using to populate the configuration object from an XML file are something like this:

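// The attribute variable is named rootAttribute so that it does not hide
// the generated rootNode class used in typeof(...) and the cast below.
var rootAttribute = new XmlRootAttribute();
rootAttribute.ElementName = "rootNode";
rootAttribute.Namespace = "urn:myNamespace";
var serializer = new XmlSerializer(typeof(rootNode), rootAttribute);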
using (var reader = new StringReader(xmlString))
{
    var deserializedObject = (rootNode)serializer.Deserialize(reader);
}

Solution

First observation: the object graph is not the best place to start when generating a dot representation. You're talking about nodes that have names and sit in a well-defined hierarchy, and you want to produce some kind of dot notation from them; the XML DOM seems to be the best place to do this.

There are a few problems with the way you describe the problem.

The first is the strategy for handling multiple elements of the same name. You've dodged the problem in your example by making that dictionary value a list, but suppose your XML looked like this:

<rootNode>
    <foo enabled="true">
        <bar enabled="false" myAttribute="5.6" />
        <bar enabled="true" myAttribute="3.4" />
    </foo>
</rootNode>

Besides foo.enabled = (Boolean)true, which should be fairly obvious, what dictionary keys do you propose for the two myAttribute leaves? Or would you have a single entry, foo.bar.myAttribute = List<float> {5.6, 3.4}? So, problem #1: there's no unambiguous way to deal with multiple identically named non-leaf nodes.
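One possible way around problem #1 is to add an index to element names that repeat under the same parent, giving keys such as foo.bar[0].myAttribute. The sketch below only illustrates that convention; it is not something the question asked for. The class and method names (IndexedFlattener, Flatten, Walk, Combine) are made up for this example, it handles attributes only, and it avoids LINQ so that it should also compile against .NET 2.0.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class IndexedFlattener
{
    // Flattens the attributes of an XML document into path -> string value.
    // The root element contributes no key token of its own.
    public static Dictionary<string, string> Flatten(string xml)
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml(xml);
        Dictionary<string, string> result = new Dictionary<string, string>();
        Walk(result, document.DocumentElement, "");
        return result;
    }

    private static void Walk(Dictionary<string, string> result, XmlElement element, string path)
    {
        foreach (XmlAttribute attribute in element.Attributes)
        {
            result[Combine(path, attribute.Name)] = attribute.Value;
        }

        // First pass: count how many child elements share each name.
        Dictionary<string, int> totals = new Dictionary<string, int>();
        foreach (XmlNode node in element.ChildNodes)
        {
            if (node is XmlElement)
            {
                int count;
                totals.TryGetValue(node.Name, out count); // count is 0 if absent
                totals[node.Name] = count + 1;
            }
        }

        // Second pass: recurse, adding [n] only where a name repeats.
        Dictionary<string, int> seen = new Dictionary<string, int>();
        foreach (XmlNode node in element.ChildNodes)
        {
            XmlElement child = node as XmlElement;
            if (child == null) continue;

            int index;
            seen.TryGetValue(child.Name, out index);
            seen[child.Name] = index + 1;

            string token = totals[child.Name] > 1
                ? child.Name + "[" + index + "]"
                : child.Name;
            Walk(result, child, Combine(path, token));
        }
    }

    private static string Combine(string path, string token)
    {
        return path.Length == 0 ? token : path + "." + token;
    }
}
```

For the XML above, this would give foo.enabled, foo.bar[0].enabled, foo.bar[0].myAttribute, foo.bar[1].enabled and foo.bar[1].myAttribute as separate keys.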

The second problem is in selecting a data type to do the final conversion at leaf nodes (i.e. attribute or element values). If you're writing to a Dictionary<string, object>, you will probably want to select a type based on the Schema simple type of the element/attribute being read. I don't know how to do that, but suggest looking up the various uses of the System.Convert class.
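As a rough illustration of the conversion step, the sketch below maps a raw string to a typed object once the schema type is known. It uses System.Xml.XmlConvert rather than System.Convert, because XmlConvert parses the XSD lexical forms independently of the current culture. How to obtain the schema type is still left open: the xsdTypeName parameter and the LeafConverter/ToTyped names are assumptions made for this example.

```csharp
using System;
using System.Xml;

public static class LeafConverter
{
    // xsdTypeName is a hypothetical input: the local name of the XSD simple
    // type ("boolean", "float", ...), which the caller has to look up itself.
    public static object ToTyped(string rawValue, string xsdTypeName)
    {
        switch (xsdTypeName)
        {
            case "boolean": return XmlConvert.ToBoolean(rawValue);
            case "int":     return XmlConvert.ToInt32(rawValue);
            case "float":   return XmlConvert.ToSingle(rawValue);
            case "double":  return XmlConvert.ToDouble(rawValue);
            case "decimal": return XmlConvert.ToDecimal(rawValue);
            default:        return rawValue; // unknown types stay as strings
        }
    }
}
```

With this, LeafConverter.ToTyped("true", "boolean") returns a boxed bool that can be stored in a Dictionary<string, object>.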

Assuming for the moment that problem #1 won't surface, and that you're ok with a Dictionary<string, string> implementation, here's some code to get you started:

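// Requires: using System.Collections.Generic; using System.Linq; using System.Xml;
// Note: OfType/SingleOrDefault below are LINQ, so this needs .NET 3.5 or later.
static void Main(string[] args)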
{
    var xml = @"
<rootNode>
    <foo enabled=""true"">
         <bar enabled=""false"" myAttribute=""5.6"" />
         <baz>Text!</baz>
    </foo>
</rootNode>
";

    var document = new XmlDocument();
    document.LoadXml(xml);
    var retVal = new Dictionary<string, string>();
    Go(retVal, document.DocumentElement, new List<string>());
}

private static void Go(Dictionary<string, string> theDict, XmlElement start, List<string> keyTokens)
{
    // Dot-separated path of the current element ("" for the root element)
    var path = string.Join(".", keyTokens.ToArray());

    // Process simple content
    // (SingleOrDefault throws if an element has more than one text node)
    var textNode = start.ChildNodes.OfType<XmlText>().SingleOrDefault();
    if (textNode != null)
    {
        theDict[path] = textNode.Value;
    }

    // Process attributes
    foreach (XmlAttribute att in start.Attributes)
    {
        // Avoid a leading dot when the attribute sits on the root element
        theDict[path.Length == 0 ? att.Name : path + "." + att.Name] = att.Value;
    }

    // Process child nodes
    foreach (var childNode in start.ChildNodes.OfType<XmlElement>())
    {
        Go(theDict, childNode, new List<string>(keyTokens) { childNode.Name });   // shorthand for .Add
    }
}

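And here's the result, as the contents of retVal after the call to Go (all values are strings):

"foo.enabled" = "true"
"foo.bar.enabled" = "false"
"foo.bar.myAttribute" = "5.6"
"foo.baz" = "Text!"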

Other tips

One approach would be to implement a custom formatter and slot it into the standard serialization pattern: create a class that implements IFormatter, e.g. MyDotFormatter.

http://msdn.microsoft.com/en-us/library/system.runtime.serialization.iformatter.aspx

Then use it as below:

// Requires: using System; using System.IO; using System.Runtime.Serialization;
Stream stream = File.Open(filename, FileMode.Create);
MyDotFormatter dotFormatter = new MyDotFormatter();
Console.WriteLine("Writing Object Information");
try
{
    dotFormatter.Serialize(stream, objectToSerialize);
    // Report success only if Serialize did not throw
    Console.WriteLine("successfully wrote object information");
}
catch (SerializationException ex)
{
    Console.WriteLine("Exception for Serialization data : " + ex.Message);
    throw;
}
finally
{
    stream.Close();
}
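What MyDotFormatter itself might look like is not spelled out above, so the following is only a minimal sketch under several assumptions: the object graph is a plain tree of xsd.exe-style classes (no cycles), keys are built from public member names rather than XML names, repeated items are indexed as name[n], and Deserialize is left unimplemented. The Flatten and Combine helpers are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;

public class MyDotFormatter : IFormatter
{
    private SerializationBinder binder;
    private StreamingContext context;
    private ISurrogateSelector selector;

    public SerializationBinder Binder { get { return binder; } set { binder = value; } }
    public StreamingContext Context { get { return context; } set { context = value; } }
    public ISurrogateSelector SurrogateSelector { get { return selector; } set { selector = value; } }

    public object Deserialize(Stream serializationStream)
    {
        throw new NotSupportedException("This sketch only writes.");
    }

    // Writes one "key = value" line per leaf value in the graph.
    public void Serialize(Stream serializationStream, object graph)
    {
        Dictionary<string, object> entries = new Dictionary<string, object>();
        Flatten(entries, graph, "");
        StreamWriter writer = new StreamWriter(serializationStream);
        foreach (KeyValuePair<string, object> entry in entries)
        {
            writer.WriteLine(entry.Key + " = " + entry.Value);
        }
        writer.Flush(); // the caller owns and closes the stream
    }

    // Recursively collects leaf values (value types and strings) by path.
    public static void Flatten(Dictionary<string, object> entries, object node, string path)
    {
        if (node == null) return;
        Type type = node.GetType();
        if (type.IsValueType || node is string)
        {
            entries[path] = node; // keeps the original (boxed) type
            return;
        }
        IEnumerable sequence = node as IEnumerable;
        if (sequence != null)
        {
            int index = 0;
            foreach (object item in sequence)
            {
                Flatten(entries, item, path + "[" + index + "]");
                index++;
            }
            return;
        }
        BindingFlags flags = BindingFlags.Public | BindingFlags.Instance;
        foreach (PropertyInfo property in type.GetProperties(flags))
        {
            if (property.GetIndexParameters().Length == 0) // skip indexers
                Flatten(entries, property.GetValue(node, null), Combine(path, property.Name));
        }
        foreach (FieldInfo field in type.GetFields(flags))
        {
            Flatten(entries, field.GetValue(node), Combine(path, field.Name));
        }
    }

    private static string Combine(string path, string token)
    {
        return path.Length == 0 ? token : path + "." + token;
    }
}
```

Because Flatten stores the member values as they are, the dictionary keeps the types the deserializer already produced; only the text written to the stream is stringified.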
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow