Question

I have a hierarchical organization that is a tree where the node is a child if the parent sponsors the child. It seems I can traverse the tree with this code

def get_team(self, person, team):
    firstline = User.query(User.sponsor == person.key).fetch(99999999)
    if firstline:
        for person in firstline:
            team.append(person)
            newdownline = self.downline(person, team)        
    return team

Using the above I can get a user's organization just by

downline=user.get_team(user, [])

But is there a more efficient way since I have to do this many times for a single request and that much recursion might be ineffecient? Or will the code be fine since it can traverse the tree correctly? In my first version I used three variables and I found I could rearrange the code to only two variables instead of this:

def downline(self, person, team, teamlist):
    firstline = User.query(User.sponsor == person.key).fetch(99999999)
    if firstline:
        for person in firstline:
            teamlist.append(person)
            newdownline = self.downline(person, team, teamlist)        
            team.append(newdownline)
    return teamlist 

I found that the teamlist variable wasn't really needed so I removed it. The way I did it was first with one variable too many:

people = user.downline(user, [], [])

Was it helpful?

Solution

Yes, there is a more efficient way to do it, depending on your computational tradeoffs.

You are currently doing a depth-first traversal of your tree, a perfectly fine approach. You could add some speed by caching results at the expense of some RAM usage, so:

if person.id in downline_cache:
      team.append(downline_cache[person.id])
else:
      downline_cache[person.id] = results
      team.append(results)

If the tree is fairly small, you could just cache the whole thing upfront, once per thread. That takes more RAM than what you're doing, but is much faster than doing a depth-first traversal every time you care what the results are. A lot depends on your usage patterns and the amount of data you're storing.

If you use a cache, you must make sure you have some way to deal with the underlying data changing, either with timeouts and 'eventually correct' type guarantees, or keeping track of when and how you must wipe the cache, or elements of the cache.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top