Question

During today's Google Code Jam (specifically, on 'The Repeater'), I ran into the following odd performance difference between two snippets of code.

For the snippets below, assume lengths is a list of positive integers.

When using the following code (somewhere nested inside a loop), my solution runs in roughly 3.1 seconds.

minmov = -1
for target in range(101):
    mov = 0
    for l in lengths:
        mov += abs(l - target)
    if mov < minmov or minmov == -1:
        minmov = mov
moves += minmov

However, when I replace it with the functionally equivalent snippet seen below, it suddenly takes 4.2 seconds. A whopping increase of roughly 35%!

moves += min(sum(abs(l - t) for l in lengths) for t in range(101))

Can anyone explain why this is so much slower? To the casual observer, it is not immediately obvious that the two are doing anything different.

Solution

You can use Python's standard library cProfile module to see where the time is being consumed in your code. I wrapped the code in a minimal working example and profiled it (I'm not sure it mimics the behavior of your algorithm exactly, but I hope it is enough to make my point):

Code:

import random
moves = 0
lengths = [random.randint(-10,10) for _ in range(101)]

def foo():
    # iterative version: explicit loops, accumulating with +=
    global moves
    minmov = -1
    for target in range(101):
        mov = 0
        for l in lengths:
            mov += abs(l - target)
        if mov < minmov or minmov == -1:
            minmov = mov
    moves += minmov

def foo2():
    # functional version: min over nested generator expressions
    global moves
    moves += min(sum(abs(l - t) for l in lengths) for t in range(101))

import cProfile
cProfile.run("foo()")
#cProfile.run("foo2()")

Profile iterative:

python t.py
         10205 function calls in 0.023 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.023    0.023 <string>:1(<module>)
        1    0.013    0.013    0.023    0.023 t.py:5(foo)
    10201    0.010    0.000    0.010    0.000 {abs}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {range}

Profile functional:

python t1.py
         10409 function calls in 0.023 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.047    0.047 <string>:1(<module>)
        1    0.000    0.000    0.047    0.047 t.py:16(foo2)
      102    0.000    0.000    0.047    0.000 t.py:18(<genexpr>)
    10201    0.010    0.000    0.010    0.000 {abs}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.047    0.047 {min}
        1    0.000    0.000    0.000    0.000 {range}
      101    0.012    0.000    0.046    0.000 {sum}

On my PC (Windows, Python 2.7) they run at almost the same speed, but notice how many more function calls the functional code makes. Function calls in Python are expensive, so it is sometimes preferable to stick with a plain loop and a simple solution.

Looking at the profile dump, sum is called 101 times; in your iterative solution you accumulate with the += operator instead, so no extra function call is made.
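
If you want to see that call overhead in isolation, here is a small timeit sketch (the data and names here are made up for illustration, not taken from the contest input) comparing the += accumulation with the sum-over-a-generator version:

import timeit

data = list(range(100))  # stand-in for lengths; any list of ints works

def loop_acc():
    total = 0
    for x in data:
        total += abs(x - 50)  # abs is the only function call per element
    return total

def gen_acc():
    # same arithmetic, but adds a sum() call plus a generator frame
    # that has to be resumed once per element
    return sum(abs(x - 50) for x in data)

print(timeit.timeit(loop_acc, number=10000))
print(timeit.timeit(gen_acc, number=10000))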

The functional code moves the inner loop into C, but it does so at the cost of 101 extra function calls to sum, one per target in range(101) (in this example len(lengths) happens to be 101 as well, which makes the two easy to confuse).
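
To make the numbers concrete, the one-liner expands to roughly the following (ignoring laziness and using a temporary list purely for illustration):

totals = []
for t in range(101):
    # one sum() call per target -> 101 calls in total
    totals.append(sum(abs(l - t) for l in lengths))
moves += min(totals)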

I bet this is what causes the additional overhead. You can read Guido's excellent essay Python Patterns - An Optimization Anecdote for more on this topic.

I'm not completely sure that min has to iterate the whole array again to extract the minimum, as suggested in the comments. The functional solution is based on generators, and the outermost consumer, min in this case, is the engine that drives all the generator machinery.

When min starts, it tries to pull the first element from the generator passed as its argument. That in turn wakes up sum and starts sum's argument generator, which yields abs(lengths[0] - 0) as its first value, and so on until the whole total abs(lengths[0] - 0) + abs(lengths[1] - 0) + ... has been computed; that total becomes the first value of min's argument generator. min remembers it while it moves on to check the second element of its argument generator, and so on.
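
You can watch that hand-off happen with a toy version that prints whenever a total is handed to min (tiny made-up inputs, just to make the order visible):

lengths = [3, 7]

def totals():
    for t in range(3):
        total = sum(abs(l - t) for l in lengths)
        print("sum for t =", t, "is", total)  # runs only when min pulls the next value
        yield total

print("minimum:", min(totals()))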

Implicitly, min keeps track of the running minimum as it goes; because you're working with generators rather than list comprehensions, the intermediate sums are never stored in a list.
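
Put differently, when min is fed a generator it behaves roughly like this hand-rolled sketch (ignoring the key and default arguments of the real built-in):

def running_min(iterable):
    it = iter(iterable)
    best = next(it)     # forces the first total to be computed
    for value in it:    # each step forces exactly one more total
        if value < best:
            best = value
    return best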

Hope this helps!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow