Question

I have an adjacency list of objects (rows loaded from SQL database with the key and it's parent key) that I need to use to build an unordered tree. It's guaranteed to not have cycles.

This is taking wayyy too long (processed only ~3K out of 870K nodes in about 5 minutes). Running on my workstation Core 2 Duo with plenty of RAM.

Any ideas on how to make this faster?

public class StampHierarchy {
    private StampNode _root;
    private SortedList<int, StampNode> _keyNodeIndex;

    // takes a list of nodes and builds a tree
    // starting at _root
    private void BuildHierarchy(List<StampNode> nodes)
    {
        Stack<StampNode> processor = new Stack<StampNode>();
        _keyNodeIndex = new SortedList<int, StampNode>(nodes.Count);

        // find the root
        _root = nodes.Find(n => n.Parent == 0);

        // find children...
        processor.Push(_root);
        while (processor.Count != 0)
        {
            StampNode current = processor.Pop();

            // keep a direct link to the node via the key
            _keyNodeIndex.Add(current.Key, current);  

            // add children
            current.Children.AddRange(nodes.Where(n => n.Parent == current.Key));

            // queue the children
            foreach (StampNode child in current.Children)
            {
                processor.Push(child);
                nodes.Remove(child); // thought this might help the Where above
            }
        }
    }
}

    public class StampNode {
         // properties: int Key, int Parent, string Name, List<StampNode> Children
    }
Was it helpful?

Solution

  1. Put the nodes into a sorted list or dictionary.

  2. Scan that list, pick up each node, find its parent node in the same list (binary search or dictionary lookup), add it to the Children collection of the parent node.

There's no need for a Stack to put this into a tree.

OTHER TIPS

SortedList is not a good container to use in this context. It is O(n) for insertion operations (the repeated calls to Add()), as it is internally represented as a flat list. Using Dictionary instead of SortedList will be a large improvement, as it is O(1) amortized insertion time.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top