Question

Let's say I want to divide two variables that hold integers, in Python 2.x (mainly 2.6 and 2.7). For instance:

a, b = 3, 2
print a/b
# Prints "1"

Now, there are at least two (non-redundant) ways I know of to make this a normal floating-point division (without using from __future__ import division). They are:

print a*1.0/b      # Of course you could multiply b by 1.0 also 

and

print float(a)/b   # Here you could also have cast b as a float

Does one of these methods have an advantage (in speed) over the other? Does one have more overhead than the other?


Solution

>>> import timeit
>>> timeit.timeit(stmt="a*1.0/b",setup="a,b=3,2",number=100)
4.669614510532938e-05
>>> timeit.timeit(stmt="float(a)/b",setup="a,b=3,2",number=100)
7.18402232422477e-05

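From the above, you can tell that a*1.0/b is faster than float(a)/b: in this run it took roughly 65% of the time. This is because calling a function in Python is comparatively costly. float has to be looked up as a global name and then called, whereas the multiplication needs only a constant load and one arithmetic instruction. That being said, you could do something like: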

a,b=float(3),2
print a/b

which gives this benchmark:

>>> timeit.timeit(stmt="a/b",setup="a,b=float(3),2",number=100)
2.5144078108496615e-05

This is because float() is called only once, when a is assigned. The timed statement then needs neither the float() call nor the multiplication by 1.0, which makes it faster still.
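With number=100 each measurement lasts only a few dozen microseconds, so the figures are easily affected by noise. A minimal sketch of a steadier comparison follows, assuming a larger loop count and taking the best of several repeats. The CASES list and the best_time helper are names made up for this illustration, and the single-argument print(...) form is used only so that the script also runs unchanged on Python 3.

```python
import timeit

# Each statement is paired with its setup. Only "a/b" relies on a value
# that was converted to float once, outside the timed statement.
CASES = [
    ("a*1.0/b", "a, b = 3, 2"),
    ("float(a)/b", "a, b = 3, 2"),
    ("a/b", "a, b = float(3), 2"),
]


def best_time(stmt, setup, number=100000, repeat=5):
    """Return the fastest of `repeat` runs; each run executes stmt `number` times."""
    return min(timeit.repeat(stmt=stmt, setup=setup, number=number, repeat=repeat))


results = []
for stmt, setup in CASES:
    results.append((best_time(stmt, setup), stmt, setup))

# Fastest first.
for seconds, stmt, setup in sorted(results):
    print("%-12s (setup: %-20s) %.6f s" % (stmt, setup, seconds))
```

Taking the minimum of the repeats is the usual way to reduce interference from other processes. The absolute numbers will differ by machine and interpreter version, so only the ordering is worth comparing.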

Breaking this down further using the dis module, you can see the bytecode that each approach executes inside a loop:

float during division

def floatmethod():
    a,b=3,2
    while True:
        print float(a)/b

float during division dis results

import dis
dis.dis(floatmethod)
  2           0 LOAD_CONST               3 ((3, 2))
              3 UNPACK_SEQUENCE          2
              6 STORE_FAST               0 (a)
              9 STORE_FAST               1 (b)

  3          12 SETUP_LOOP              25 (to 40)
        >>   15 LOAD_GLOBAL              0 (True)
             18 POP_JUMP_IF_FALSE       39

  4          21 LOAD_GLOBAL              1 (float)
             24 LOAD_FAST                0 (a)
             27 CALL_FUNCTION            1
             30 LOAD_FAST                1 (b)
             33 BINARY_DIVIDE       
             34 PRINT_ITEM          
             35 PRINT_NEWLINE       
             36 JUMP_ABSOLUTE           15
        >>   39 POP_BLOCK           
        >>   40 LOAD_CONST               0 (None)
             43 RETURN_VALUE        

Reason for speed decrease

This method is slower because every iteration must look up float (LOAD_GLOBAL), load the value of a (LOAD_FAST), and call float(a) (CALL_FUNCTION) before it can finally perform the division (BINARY_DIVIDE). All of this is repeated on every pass through the loop.
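The same kind of inspection can be applied to the a*1.0/b variant, which is not disassembled here. The sketch below compiles both expressions on their own so that each listing contains only the per-evaluation work. Two caveats: because the expressions are compiled at module level, a and b are loaded by name rather than with LOAD_FAST as they would be inside a function, and the opcode names printed depend on the interpreter version. The variable names are made up for this illustration.

```python
import dis

# Compile each expression separately so the listing shows only the
# instructions needed to evaluate that one expression.
mul_code = compile("a*1.0/b", "<mul>", "eval")
call_code = compile("float(a)/b", "<call>", "eval")

dis.dis(mul_code)   # multiplies by a constant; no lookup of the name float
dis.dis(call_code)  # looks up the global name float, then calls it

# Both expressions give the same value for the inputs from the question.
env = {"a": 3, "b": 2}
mul_result = eval(mul_code, dict(env))
call_result = eval(call_code, dict(env))
```

On Python 2.7 the first listing should show a constant load and a multiply instruction where the second shows the global lookup and the function call.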

float on assignment

def initfloatmethod():
    a,b=float(3),2
    while True:
        print a/b

float on assignment dis results

dis.dis(initfloatmethod)
  2           0 LOAD_GLOBAL              0 (float)
              3 LOAD_CONST               1 (3)
              6 CALL_FUNCTION            1
              9 LOAD_CONST               2 (2)
             12 ROT_TWO             
             13 STORE_FAST               0 (a)
             16 STORE_FAST               1 (b)

  3          19 SETUP_LOOP              19 (to 41)
        >>   22 LOAD_GLOBAL              1 (True)
             25 POP_JUMP_IF_FALSE       40

  4          28 LOAD_FAST                0 (a)
             31 LOAD_FAST                1 (b)
             34 BINARY_DIVIDE       
             35 PRINT_ITEM          
             36 PRINT_NEWLINE       
             37 JUMP_ABSOLUTE           22
        >>   40 POP_BLOCK           
        >>   41 LOAD_CONST               0 (None)
             44 RETURN_VALUE      

Reason for speed increase

You can see that the line performing the division no longer has to call float, so the division runs immediately. LOAD_GLOBAL (float) and CALL_FUNCTION execute only once, at assignment, rather than inside the loop. Each iteration then only loads a and b and goes straight to the BINARY_DIVIDE instruction.
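Applied to ordinary code, the idea is to convert the operand that stays constant once, before the loop. The following is a rough sketch using two hypothetical functions written for this illustration; both compute the same mean ratio and differ only in where the conversion happens.

```python
def mean_ratio_convert_each_time(numerators, b):
    """Convert inside the loop: float() is looked up and called per item."""
    total = 0.0
    for a in numerators:
        total += float(a) / b
    return total / len(numerators)


def mean_ratio_convert_once(numerators, b):
    """Convert the shared divisor once, then divide directly in the loop."""
    b = float(b)  # single conversion, outside the loop
    total = 0.0
    for a in numerators:
        total += a / b  # int / float is already a floating-point division
    return total / len(numerators)


values = [1, 2, 3, 4, 5]
each_time = mean_ratio_convert_each_time(values, 2)
once = mean_ratio_convert_once(values, 2)
```

Whether the difference matters depends on how many iterations the loop runs, so it is worth timing both on realistic input before changing existing code.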

Environment used for this benchmark:

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32

OTHER TIPS

Using Python 2.7.3:

In [7]: %timeit a*1.0/b
10000000 loops, best of 3: 165 ns per loop

In [8]: %timeit float(a)/b
1000000 loops, best of 3: 228 ns per loop

So the first method appears somewhat faster here (165 ns versus 228 ns per loop).

That said, it is always worthwhile to profile your code before embarking on micro-optimizations.
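As a starting point for that kind of profiling, here is a minimal sketch using the standard cProfile and pstats modules. The ratios function and the pairs list are a made-up workload for illustration, not code from the question.

```python
import cProfile
import pstats


def ratios(pairs):
    """Hypothetical workload: floating-point division over many integer pairs."""
    return [a * 1.0 / b for a, b in pairs]


pairs = [(n, n + 1) for n in range(1, 100001)]

profiler = cProfile.Profile()
profiler.enable()
result = ratios(pairs)
profiler.disable()

# Print the five entries with the largest cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```

If the division does not show up near the top of a report like this for the real program, the choice between a*1.0/b and float(a)/b is unlikely to matter in practice.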

Licensed under: CC-BY-SA with attribution