Question

I have a method to determine the most compatible person for every other person. Basically there are two nested loops over the items of a dict that maps from a person to a list (where similar lists determine compatibility), there compat is computed and saved if it is bigger then the previously maximal one for the person of the outer loop.
So I decided to optimize performance by updating the compatibility for the other person (the on in the inner loop) as well, because the compatibility is the same, and I won't have to do the same computation when the outer loop reaches person 2 and the inner one person 1 [use symmetrie of compatibility relation].
Well, I ended up 20 times slower. The c-profile logs are strange, because all operations of the improved version have a better (or similar) totaltime than the ones in the unimproved code. So I'm absolutely stuck finding the bottleneck. :(
Can anybody give me advice on how to interpret this logs? Where is the evil instruction gone?

log of normal code:

     $ python -m cProfile -s time ./jukebox.py sample.txt
         92661414 function calls (92661412 primitive calls) in 124.355 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000   93.324    0.009  124.168    0.012 jukebox.py:88(solve_problem_4)
 42900000   16.616    0.000   16.616    0.000 {method 'intersection' of 'set' objects}
 42900000   10.831    0.000   10.831    0.000 {len}
  6180396    2.212    0.000    2.212    0.000 {method 'append' of 'list' objects}
   670000    1.185    0.000    1.185    0.000 {method 'items' of 'dict' objects}
        1    0.170    0.170  124.353  124.353 jukebox.py:1(<module>)
        1    0.009    0.009    0.013    0.013 heapq.py:31(<module>)
        1    0.004    0.004    0.004    0.004 bisect.py:1(<module>)
        1    0.002    0.002  124.355  124.355 {execfile}
       66    0.001    0.000    0.001    0.000 jukebox.py:18(update_bands)
       67    0.001    0.000    0.001    0.000 fileinput.py:166(isfirstline)
        1    0.000    0.000    0.002    0.002 jukebox.py:9(__init__)
        1    0.000    0.000    0.000    0.000 {open}
      198    0.000    0.000    0.000    0.000 {method 'strip' of 'str' objects}
      132    0.000    0.000    0.000    0.000 {method 'split' of 'str' objects}
        1    0.000    0.000  124.355  124.355 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
       68    0.000    0.000    0.000    0.000 fileinput.py:243(next)
        1    0.000    0.000    0.000    0.000 {range}
        1    0.000    0.000    0.000    0.000 {method 'close' of 'file' objects}
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:80(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:184(FileInput)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        2    0.000    0.000    0.000    0.000 {method 'readlines' of 'file' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      4/2    0.000    0.000    0.000    0.000 fileinput.py:292(readline)
      396    0.000    0.000    0.000    0.000 {method 'setdefault' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
        1    0.000    0.000    0.000    0.000 jukebox.py:4(Reader)
       67    0.000    0.000    0.000    0.000 fileinput.py:371(isfirstline)


log of "optimized" one:

$ python -m cProfile -s time ./jukebox-imp.py sample.txt
         49761414 function calls (49761412 primitive calls) in 2166.613 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10000 2147.248    0.215 2165.759    0.217 jukebox-imp.py:88(solve_problem_4)
 21450000    8.952    0.000    8.952    0.000 {method 'intersection' of 'set' objects}
 21450000    5.951    0.000    5.951    0.000 {len}
  6180396    2.152    0.000    2.152    0.000 {method 'append' of 'list' objects}
   660000    1.441    0.000    1.441    0.000 {method 'items' of 'dict' objects}
        1    0.837    0.837 2166.611 2166.611 jukebox-imp.py:1(<module>)
    10000    0.015    0.000    0.015    0.000 {method 'keys' of 'dict' objects}
        1    0.010    0.010    0.013    0.013 heapq.py:31(<module>)
        1    0.003    0.003    0.003    0.003 bisect.py:1(<module>)
        1    0.002    0.002 2166.613 2166.613 {execfile}
       66    0.002    0.000    0.002    0.000 jukebox-imp.py:18(update_bands)
        1    0.000    0.000    0.000    0.000 {open}
      198    0.000    0.000    0.000    0.000 {method 'strip' of 'str' objects}
      132    0.000    0.000    0.000    0.000 {method 'split' of 'str' objects}
        1    0.000    0.000 2166.613 2166.613 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:240(__iter__)
        1    0.000    0.000    0.002    0.002 jukebox-imp.py:9(__init__)
       68    0.000    0.000    0.000    0.000 fileinput.py:243(next)
        1    0.000    0.000    0.000    0.000 {range}
        1    0.000    0.000    0.000    0.000 {method 'close' of 'file' objects}
        1    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.000    0.000 fileinput.py:80(<module>)
        1    0.000    0.000    0.000    0.000 fileinput.py:184(FileInput)
        1    0.000    0.000    0.000    0.000 fileinput.py:197(__init__)
        2    0.000    0.000    0.000    0.000 {method 'readlines' of 'file' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      4/2    0.000    0.000    0.000    0.000 fileinput.py:292(readline)
      396    0.000    0.000    0.000    0.000 {method 'setdefault' of 'dict' objects}
       67    0.000    0.000    0.000    0.000 fileinput.py:166(isfirstline)
        1    0.000    0.000    0.000    0.000 fileinput.py:91(input)
        1    0.000    0.000    0.000    0.000 fileinput.py:266(nextfile)
       67    0.000    0.000    0.000    0.000 fileinput.py:371(isfirstline)
        1    0.000    0.000    0.000    0.000 jukebox-imp.py:4(Reader)

//EDIT:
Just in case I can provide the code, too. To my humble understanding there is absolutely no reason for the latter to be 20x slower then the former.

The normal one:

def solve_problem_4(colleagues):
MIN_COMPAT = 1
compat_dict = dict()

for colleague_1, bands_1 in colleagues.items():
    compat_dict[colleague_1] = (0,[])
    for colleague_2, bands_2 in colleagues.items():
        if colleague_1 == colleague_2:
            continue

        compat = len(set(bands_1).intersection(set(bands_2)))
        if compat > MIN_COMPAT:
            old_compat,top_colleagues  = compat_dict[colleague_1]
            if compat > old_compat:
                compat_dict[colleague_1] = (compat,[colleague_2])
            elif compat == old_compat:
                top_colleagues.append(colleague_2)

return compat_dict

And the "optmized":

def solve_problem_4(colleagues):
MIN_COMPAT = 1
compat_dict = defaultdict(lambda: (0,[]))  #change here
checked_pairs = []

for colleague_1, bands_1 in colleagues.items()[:-1]:
    for colleague_2, bands_2 in colleagues.items():
        if colleague_1 == colleague_2 or (colleague_2,colleague_1) in checked_pairs:  # change here, exclude used pairs
            continue

        checked_pairs += [(colleague_1,colleague_2)]  # change here, note down checked pair  
        compat = len(set(bands_1).intersection(set(bands_2)))

        if compat > MIN_COMPAT:
            old_compat, top_colleagues  = compat_dict[colleague_1]
            if compat > old_compat:
                compat_dict[colleague_1] = compat,[colleague_2]
            elif compat == old_compat:
                top_colleagues.append(colleague_2)

            old_compat, top_colleagues  = compat_dict[colleague_2] # change here, update symmetric pair
            if compat > old_compat:  # imagine extract method refactoring here ;)
                compat_dict[colleague_2] = compat,[colleague_1]
            elif compat == old_compat:
                top_colleagues.append(colleague_1)
return compat_dict
Was it helpful?

Solution 2

Should be clearer if you sort by cumtime.

OTHER TIPS

Alternatively, running runsnakerun on cProfile dump provides an easy to understand graphical view.

python -m cProfile -o dump.cprofile script.py   
runsnakerun dump.cprofile
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top