Max path triangle

Question 1

It does not use itertools, it is recursive, but I memoize the results, so its still fast...

def memoize(function):
    memo = {}
    def wrapper(*args):
      if args in memo:
        return memo[args]
      else:
        rv = function(*args)
        memo[args] = rv
        return rv
    return wrapper



@memoize
def getmaxofsub(x, y):
    if y  == len(triangle) or x>y: return 0
    #print x, y
    return triangle[y][x] + max(getmaxofsub(x, y+1), getmaxofsub(x+1, y+1))


getmaxofsub(0,0)

I read your algorithm suggestion some more times and your "cumulative triangle" is stored in memoof the memoized decorator, so in the end it is very similar. if you want to prevent that there is big stack during recursive "down calling" through the triangle, you can fill the cache of memoize by calling getmaxofsub() bottom -> up.

for i in reversed(range(len(triangle))):
    getmaxofsub(0, i), getmaxofsub(i//2, i), getmaxofsub(i, i)

print getmaxofsub(0,0)

Edit

getmaxofsub: How does this function work? First you have to know, that you can't divide your triangle in sub triangles. I take your triangle as an example:

That's the complete one. The "coordinates" of the peak are x=0, y=0.

Now I extract the sub triangle of the peak x=0, y=1:

  9
 5 4
9 7 3

or x=1, y=2

 4
7 3

So this is how my algorithm works: The peak of the whole triangle (x=0, y=0) asks its sub triangles (x=0, y=1) and (x=1, y=1), "What is your maximum distance to the ground?" And each of them will ask their sub-triangles and so on… this will go on until the function reaches the ground/y==len(triangle): The ground-entries want to ask their sub triangles, but since their is none of those, they get the answer 0. After each triangle has called their sub triangles, it decides, which one is the greater one, add their own value and return this sum.

So now you see, what is the principle of this algorithm. Those algorithms are called recursive algorithms. You see, a function calling itself is pretty standard… and it works…

So, if you think about this whole algorithm, you would see that a lot of sub-triangles are called several times and they would ask their sub-triangles and so on… But each time they return the same value. That is why I used the memorize-decorator: If a function is called with the same arguments x and y, the decorator returns the last calculated value for those arguments and prevents the time-consuming calculation… It is a simple cache…

That is why this function is as easy to implement as a recursive algorithm and as fast as a iteration...

Question 2

To answer your first question (how to brute-force iterate over all paths): If you start at the top of the triangle and move down along some random path, you have to make a decision to go left or right for every level that you go down. The number of different paths is thus 2^(nrows-1). For your problem with 200 rows, there are thus 8e59 different paths, way to much to check in a brute-force way.

For a small triangle, you can still iterate over all possible paths in a brute-force way, for example like this:

In [10]: from itertools import product
In [11]: triangle = [[5], [9,8], [5,4,6], [9,7,3,4]]
In [12]: for decisions in product((0,1), repeat = len(triangle)-1):
    ...:     pos = 0
    ...:     path = [triangle[0][0]]
    ...:     for lr, row in zip(decisions, triangle[1:]):
    ...:         pos += lr # cumulative sum of left-right decisions
    ...:         path.append(row[pos])
    ...:     print path

[5, 9, 5, 9]
[5, 9, 5, 7]
[5, 9, 4, 7]
[5, 9, 4, 3]
[5, 8, 4, 7]
[5, 8, 4, 3]
[5, 8, 6, 3]
[5, 8, 6, 4]

The way this works is to use itertools.product to iterate over all possible combinations of nrows-1 left/right decisisions, where a 0 means go left and a 1 means go right (so you are more or less generating the bits of all binary numbers up to 2^(nrows-1)). If you store the triangle as a list of lists, going left means staying at the same index in the next row, while going right means adding 1. To keep track of the position in the row, you thus simply calculate the cumulative sum of all left/right decisions.

To answer your second question: First of all, your algorithm seems pretty good, you only need to iterate once backwards over all rows and you do not have the exponential number of cases to check as in the brute-force solution. The only thing I would add to that is to build a new triangle, which indicates at every step whether the maximum was found to the left or to the right. This is useful to reconstruct the optimal path afterwards. All this can be implemented like this:

mx = triangle[-1] # maximum distances so far, start with last row
directions = [] # upside down triangle with left/right direction towards max
for row in reversed(triangle[:-1]): # iterate from penultimate row backwards
    directions.append([l < r for l, r in zip(mx[:-1], mx[1:])])
    mx = [x + max(l, r) for x, l, r in zip(row, mx[:-1], mx[1:])]
    print 'Maximum so far:', mx
print 'The maximum distance is', mx[0]
directions.reverse()
pos = 0
path = [triangle[0][0]]
for direction, row in zip(directions, triangle[1:]):
    pos += direction[pos]
    path.append(row[pos])
print 'The optimal path is', path

As before, I used the trick that False = 0 and True = 1 to indicate going left and right. Using the same triangle as before, the result:

Maximum so far: [14, 11, 10]
Maximum so far: [23, 19]
Maximum so far: [28]
The maximum distance is 28
The optimal path is [5, 9, 5, 9]