Question

This is not homework.

Visually it looks like a tree, but all leaves are unique (each has a unique id in the database), and the hierarchy above them is somewhat arbitrary. Each checkbox has three states: on, off, and partial. Leaves can only be checked or unchecked. The state of the children should drive the state of the parents. Clicking a checkbox should "toggle" it and propagate the necessary changes up or down. If you click a parent that is partially checked, it should become fully checked. Each child has a pointer to a list of parents (I could change this to a set if I must). Each parent has a list of children, sorted alphabetically. At the same time, for display purposes, this structure is a tree that I can expand and collapse, as you can see in the picture below.

I am sure this algorithm has been invented before. Since the number of leaves has been up to 20,000, I do care about the performance in practice. But, I would not try to squeeze every last drop of performance out of the algo at the expense of code being short and readable.

I figured that in principle I should first walk down (if there is anywhere to go) and identify all the leaves that should change. After that I should walk up: from the set of leaves, figure out the set of parents that might be affected, then filter it down to the parents that actually need to change, and to which value, and add those to a set. Then I need to walk up from those nodes and repeat. Once I have the set of leaves and other nodes that need to change, along with their new values, I just apply the changes ... or something like that. A matrix-based representation would be too expensive.

I am hacking this thing together in C++ using MFC, but my question is pretty much language-agnostic. I would prefer a concrete implementation to an abstract algorithm, however. Some languages, like Python, Perl, or Scala, might have tricks up their sleeves that are too modern for me. I would rather stick to something more conventional, like Java or C# (minus LINQ).

Code, links, references and questions are welcome.

[Picture of the checkbox tree; image not preserved]


Solution

Ah, I see why this is complicated. You want to incrementally topologically sort items as you add them to the "possibly changed" list. That way, you only process elements after you've processed their children; this ensures that you only process changed elements once, and assuming you have a DAG, you will not encounter a situation where you cannot process any elements due to circular references.

So, the general algorithm goes like this:

  • Add all changing leaf children to the set of nodes to process.
  • For every node in the set that has no children to process:
    • Determine its new state.
    • If its state changed, add its parent(s) to the set of nodes to process.

The hard part is the "every node that has no children to process". But that's just a topological sort.
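A minimal C++ sketch of the steps above, in case a concrete version helps. This is only one way to get the "children first" order, not necessarily what was meant above: instead of an incremental topological sort, each node stores a height (0 for leaves, otherwise one more than its tallest child), and pending nodes are processed lowest height first. Every name here (`Node`, `link`, `deriveState`, `propagateUp`, `click`) is made up for illustration, and the sketch assumes the structure is a DAG.

```cpp
#include <queue>
#include <unordered_set>
#include <vector>

enum class State { Off, Partial, On };

struct Node {
    State state = State::Off;
    int height = 0;                  // 0 for leaves, else 1 + tallest child
    std::vector<Node*> children;
    std::vector<Node*> parents;
};

// Make sure n and all of its ancestors are at least as tall as required.
void raiseHeight(Node& n, int h) {
    if (n.height >= h) return;
    n.height = h;
    for (Node* p : n.parents) raiseHeight(*p, h + 1);
}

void link(Node& parent, Node& child) {
    parent.children.push_back(&child);
    child.parents.push_back(&parent);
    raiseHeight(parent, child.height + 1);
}

// State of a non-leaf as dictated by its direct children.
State deriveState(const Node& n) {
    bool anyOn = false, anyOff = false;
    for (const Node* c : n.children) {
        if (c->state != State::Off) anyOn = true;   // On or Partial
        if (c->state != State::On) anyOff = true;   // Off or Partial
    }
    if (anyOn && anyOff) return State::Partial;
    return anyOn ? State::On : State::Off;
}

struct TallerFirst {  // makes the priority queue pop the lowest node first
    bool operator()(const Node* a, const Node* b) const {
        return a->height > b->height;
    }
};

// Recompute ancestors of the changed leaves, children before parents.
void propagateUp(const std::vector<Node*>& changedLeaves) {
    std::priority_queue<Node*, std::vector<Node*>, TallerFirst> pending;
    std::unordered_set<Node*> queued;   // each node is queued at most once
    auto enqueueParents = [&](Node* n) {
        for (Node* p : n->parents)
            if (queued.insert(p).second) pending.push(p);
    };
    for (Node* leaf : changedLeaves) enqueueParents(leaf);
    while (!pending.empty()) {
        Node* n = pending.top();
        pending.pop();
        State s = deriveState(*n);
        if (s != n->state) {            // only a real change travels further up
            n->state = s;
            enqueueParents(n);
        }
    }
}

// Toggle a checkbox: On becomes Off, Off and Partial become On.
void click(Node& clicked) {
    State target = (clicked.state == State::On) ? State::Off : State::On;
    std::vector<Node*> changed;
    std::unordered_set<Node*> seen;     // a DAG can reach a node twice
    std::vector<Node*> stack{&clicked};
    while (!stack.empty()) {
        Node* n = stack.back();
        stack.pop_back();
        if (!seen.insert(n).second) continue;
        if (n->children.empty()) {
            if (n->state != target) { n->state = target; changed.push_back(n); }
        } else {
            for (Node* c : n->children) stack.push_back(c);
        }
    }
    propagateUp(changed);               // also fixes nodes between leaves and clicked
}
```

Because every child is strictly lower than its parent, a node popped from the queue can no longer have a pending child, so each affected node is recomputed once. Note that `click` only writes leaves directly; the clicked node and everything between it and the leaves get their state from `propagateUp`.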

OTHER TIPS

The "partial" state is an issue here.

If you are in a "partial" state and a child becomes "unchecked", should you become "unchecked" too, or stay "partial"? Answering that requires querying all the other children. I would suggest modifying the structure to keep two numbers instead of flags (for non-leaves):

  • the number of leaves beneath the node (leaves, not direct children)
  • the number of checked leaves beneath the node (again leaves, not direct children)

You need to keep them correct on each update, of course.

Updating them correctly is a simple walk from the child up through its parents. If you make sure that each child holds only one reference to each of its parents (and each parent only one reference to each child), then each time a child changes its state, update each of its parents (and, in turn, each of theirs).
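Here is a rough C++ sketch of the two-counter idea, under a few assumptions of mine: the hierarchy is linked before the leaves are registered, and every name (`Node`, `leafTotal`, `leafChecked`, `adjustAncestors`, `registerLeaf`, `setLeaf`, `stateOf`) is invented for the example. Since a leaf may reach the same ancestor along several paths, the walk up remembers which ancestors it has already visited so that each one is adjusted only once.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

enum class State { Off, Partial, On };

struct Node {
    std::vector<Node*> parents;
    bool isLeaf = false;
    bool checked = false;    // meaningful for leaves only
    int leafTotal = 0;       // non-leaves: number of distinct leaves beneath
    int leafChecked = 0;     // non-leaves: how many of those are checked
};

// Apply the two deltas to every distinct ancestor of start, once each.
void adjustAncestors(Node& start, int totalDelta, int checkedDelta) {
    std::unordered_set<Node*> seen;
    std::vector<Node*> stack(start.parents.begin(), start.parents.end());
    while (!stack.empty()) {
        Node* n = stack.back();
        stack.pop_back();
        if (!seen.insert(n).second) continue;   // reached via another path
        n->leafTotal += totalDelta;
        n->leafChecked += checkedDelta;
        stack.insert(stack.end(), n->parents.begin(), n->parents.end());
    }
}

// Call once per leaf, after the hierarchy above it has been linked.
void registerLeaf(Node& leaf) {
    leaf.isLeaf = true;
    adjustAncestors(leaf, 1, leaf.checked ? 1 : 0);
}

void setLeaf(Node& leaf, bool checked) {
    assert(leaf.isLeaf);
    if (leaf.checked == checked) return;        // nothing to propagate
    leaf.checked = checked;
    adjustAncestors(leaf, 0, checked ? 1 : -1);
}

// The tri-state value falls out of the two counters; no sibling queries.
State stateOf(const Node& n) {
    if (n.isLeaf) return n.checked ? State::On : State::Off;
    if (n.leafChecked == 0) return State::Off;
    return n.leafChecked == n.leafTotal ? State::On : State::Partial;
}
```

Checking or unchecking a whole parent would then amount to calling `setLeaf` on every leaf beneath it. The cost per leaf change is proportional to the number of its distinct ancestors, whether or not any visible state changes.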

Licensed under: CC-BY-SA with attribution