Question

I was playing around with Python's collection.deque and wrote the following benchmark:

#!/usr/bin/python

import timeit

if __name__=='__main__':
    number = 1000000
    for r in (1,10,100,1000,5000,10000,100000):
        print r
        print timeit.timeit("while x: x.pop(0);",
                            "x = list(range("+str(r)+"))",
                            number=number)
        print timeit.timeit("while x: x.popleft();",
                            "from collections import deque; x = deque(range("+str(r)+"))",
                             number=number)

This will pop(0)/popleft() from a list/deque with various sizes. The results are:

1
0.0801048278809
0.0822219848633
10
0.081041097641
0.080836057663
100
0.0806250572205
0.0807700157166
1000
0.081248998642
0.082062959671
5000
0.0976719856262
0.0825741291046
10000
0.157499074936
0.0825819969177
100000
16.0247170925
0.097559928894

My question is: Why is the performance for small deques and lists (~1000 elements) almost the same?

Was it helpful?

Solution

The timeit module runs the setup code once, and then the timed code number times (in this case, number==1000000). In your case this looks like (for the list case):

x = list(range(r))
#timer is started here
for iteration in xrange(1000000):
    while x: x.pop(0)
#timer is stopped here

As you can see, only the first iteration does anything, and the other 999999 iterations will just check the size of x once since it will be empty. This check will take approximately the same amount of time for lists and deques.

For small lists/deques, the first iteration is short relative to the other 999999 iterations combined, so you are not really measuring anything relevant, and get similar times.

If you use number==1 you won't have this problem. Another option is to have the timed code push and pop an item so that the list/deque is left at the same size.

OTHER TIPS

I always find timeit more suitable for use from the shell command line. Here, for example:

$ python -mtimeit -s'from collections import deque; base=range(100); ctr=list' 'x=ctr(base)' 'while x: x.pop(0)'
10000 loops, best of 3: 77.3 usec per loop
$ python -mtimeit -s'from collections import deque; base=range(100); ctr=deque' 'x=ctr(base)' 'while x: x.popleft()'
10000 loops, best of 3: 36 usec per loop

The needed precautions (doing the import outside the loop, making a fresh copy of the data within the loop) are so much easier to see and implement, this way...

For low numbers of elements, the time taken to construct the deque and list is significant.

Once the number f elements grows, this it's no longer significant in the results.

Rewrite the test so that construction of lists is outside of the timeit call.

Edit: As @Interjar points out, the initialization of classes is done outside of the method timing so this is not the reason for the similar timings on low numbers of entries.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top