Question

I am parsing an XML structure and my classes look like the following:

class MyXml
{
    //...

    List<Node> Content { get; set; }

    //...
}

class Node
{
    // ...

    public List<Node> Nodes { get; set; }
    public string Type { get; set; }

    //...
}

MyXml represents the XML file I am parsing, whose elements are all called <node>. Each node has a type attribute, which can have different values.

The type of the node is not connected to its depth. I can have any node type at any depth level.

I can parse the structure correctly, so I get a MyXml object whose content is a list of Nodes, where every node in the list can have subnodes, and so on (I used recursion for that).

What I need to do is flatten this whole structure and extract only the nodes of a certain type.

I tried with:

var query = MyXml.Content.SelectMany(n => n.Nodes);

but it's taking only the nodes with a structure depth of 1. I would like to grab every node, regardless of depth, in the same collection and then filter what I need.

Solution

This is a naturally recursive problem. Using a recursive lambda, try something like:

Func<Node, IEnumerable<Node>> flattener = null;
flattener = n => new[] { n }
    .Concat(n.Nodes == null 
        ? Enumerable.Empty<Node>()
        : n.Nodes.SelectMany(flattener));

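Note that when you make a recursive Func like this, you must declare the Func separately first and set it to null. Otherwise the lambda refers to a variable that has not been assigned yet, and the compiler rejects it as a use of an unassigned local variable.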
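If your compiler supports C# 7.0 or later, a local function is one way to avoid the declare-then-assign step, because a local function can call itself by name. The following is a minimal sketch under that assumption; `FlattenDemo` and `FlattenWithLocalFunction` are hypothetical names used only for illustration, and `Node` is a stand-in for the class from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Node class from the question.
public class Node
{
    public List<Node> Nodes { get; set; }
    public string Type { get; set; }
}

// Hypothetical helper class, not part of the original answer.
public static class FlattenDemo
{
    public static List<Node> FlattenWithLocalFunction(Node root)
    {
        // A local function can refer to itself directly, so there is
        // no need to declare a Func variable and set it to null first.
        IEnumerable<Node> Flatten(Node n) =>
            new[] { n }.Concat(
                n.Nodes == null
                    ? Enumerable.Empty<Node>()
                    : n.Nodes.SelectMany(child => Flatten(child)));

        return Flatten(root).ToList();
    }
}
```

The traversal order is the same as with the recursive lambda: each node is returned before its descendants.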

You could also flatten the list using an iterator-block method:

public static IEnumerable<Node> Flatten(Node node)
{
    yield return node;
    if (node.Nodes != null)
    {
        foreach(var child in node.Nodes)
            foreach(var descendant in Flatten(child))
                yield return descendant;
    }
}

Either way, once the tree is flattened you can run simple LINQ queries over the flattened sequence to find nodes:

flattener(node).Where(n => n.Type == myType);
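The query above starts from a single node, whereas the question's MyXml.Content is a list of top-level nodes. A minimal sketch of how the two might be combined follows; `NodeQueries` and `NodesOfType` are hypothetical names, and `Node` is a stand-in for the class from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Node class from the question.
public class Node
{
    public List<Node> Nodes { get; set; }
    public string Type { get; set; }
}

// Hypothetical helper class, not part of the original answer.
public static class NodeQueries
{
    // Same shape as the iterator-block Flatten method shown earlier.
    public static IEnumerable<Node> Flatten(Node node)
    {
        yield return node;
        if (node.Nodes == null)
            yield break;
        foreach (var child in node.Nodes)
            foreach (var descendant in Flatten(child))
                yield return descendant;
    }

    // Flattens every top-level node, then keeps only the requested type.
    public static IEnumerable<Node> NodesOfType(IEnumerable<Node> roots, string type)
    {
        return roots.SelectMany(root => Flatten(root))
                    .Where(n => n.Type == type);
    }
}
```

Assuming `myXml` is a parsed MyXml instance whose Content is accessible, a call such as `NodeQueries.NodesOfType(myXml.Content, myType)` would return the matching nodes from any depth, evaluated lazily.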

Response adapted from: https://stackoverflow.com/a/17086572/1480391

OTHER TIPS

You should implement a method Node.GetFlattened, which returns the node itself and then calls itself on all subnodes:

public IEnumerable<Node> GetFlattened()
{
    yield return this;
    // Nodes may be null for a leaf, so guard before recursing.
    if (Nodes == null)
        yield break;
    foreach (var node in Nodes.SelectMany(n => n.GetFlattened()))
        yield return node;
}

You can then call this method, and it recursively returns all nodes regardless of their depth. This is a depth-first traversal; if you want breadth-first order, you will need a different approach.
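For breadth-first order, one common option is to replace the recursion with an explicit queue. This is only a sketch of that idea, not something taken from the answers here; `BreadthFirst` is a hypothetical helper class, and `Node` is a stand-in for the class from the question.

```csharp
using System.Collections.Generic;

// Minimal stand-in for the Node class from the question.
public class Node
{
    public List<Node> Nodes { get; set; }
    public string Type { get; set; }
}

// Hypothetical helper class, not part of the original answer.
public static class BreadthFirst
{
    // Yields all top-level nodes first, then their children, and so on.
    public static IEnumerable<Node> Flatten(IEnumerable<Node> roots)
    {
        var queue = new Queue<Node>(roots);
        while (queue.Count > 0)
        {
            var current = queue.Dequeue();
            yield return current;

            // Leaves may have a null Nodes list.
            if (current.Nodes == null)
                continue;

            // Children go to the back of the queue, so a whole level
            // is visited before the next one starts.
            foreach (var child in current.Nodes)
                queue.Enqueue(child);
        }
    }
}
```

For a tree where a has children b and d, and b has a child c, this yields a, b, d, c, whereas the depth-first version yields a, b, c, d.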

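class MyXml
{
    // Collects every node in the tree, depth-first.
    public List<Node> AllNodes()
    {
        List<Node> allNodes = new List<Node>();
        foreach (var node in Content)
            AddNode(node, allNodes);
        return allNodes;
    }

    // Adds the node itself, then recurses into its children (if any).
    public void AddNode(Node node, List<Node> nodes)
    {
        nodes.Add(node);
        if (node.Nodes == null)
            return;
        foreach (var childNode in node.Nodes)
            AddNode(childNode, nodes);
    }

    // Type is the string property from the question's Node class.
    public List<Node> AllNodesOfType(string nodeType)
    {
        return AllNodes().Where(n => n.Type == nodeType).ToList();
    }
}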

First flatten the list with a function and query on that.

Licensed under: CC-BY-SA with attribution