What's the overhead of passing Python callback functions to Fortran subroutines?

https://stackoverflow.com/questions/7524941

26-01-2021

Question

I just wrapped a Fortran 90 subroutine for Python using F2PY. The subtlety here is that the Fortran subroutine also takes a Python callback function as one of its arguments:

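SUBROUTINE f90foo(pyfunc, a)
real(kind=8),intent(in) :: a
!f2py intent(callback) pyfunc
external pyfunc
real(kind=8) :: pyfunc, p      ! callback and its result are double precision
real :: elapsed(2)             ! user and system time filled in by etime
!f2py real*8 y,x
!f2py y = pyfunc(x)

!*** debug begins***
print *, 'Start Loop'
do i=1,1000
  p = pyfunc(a)                ! each iteration calls back into Python
end do
total = etime(elapsed)         ! etime is a compiler extension (e.g. gfortran)
print *, 'End: total=', total, ' user=', elapsed(1), ' system=', elapsed(2)
stop
!*** debug ends  ***
END SUBROUTINE f90foo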

Here pyfunc is a Python function defined elsewhere in my Python code. The wrapper works fine, but when I run the wrapped version above, the elapsed time is about 5 times longer than what I get using pure Python, as follows:

import time

def pythonfoo(k):
    """ k: scalar
        returns: scalar
    """
    print('Pure Python: Start Loop')
    start = time.time()
    for i in range(1000):  # use xrange on Python 2
        p = pyfunc(k)      # pyfunc is defined elsewhere
    elapsed = (time.time() - start)
    print('End: total=%20f' % elapsed)

So the question is: where is the overhead coming from? I really want to leave pyfunc as is, because re-coding it as a pure Fortran function would be extremely time-consuming. Is there any way to improve the speed of the wrapper module?

Solution

In the code you posted, a is a double-precision float. Passing it from Fortran to Python means wrapping the Fortran double in a PyFloat object, which has a cost. In the pure Python version, k is already a PyFloat, so you don't pay the price of wrapping it 1000 times.

Another issue is the function call itself. Calling Python functions from C is already slow, but calling them from Fortran is worse, because there is an additional layer of code to translate the Fortran calling conventions (regarding the stack etc.) into C calling conventions. When calling a Python function from C, you need to prepare the arguments as Python objects, generally create a PyTuple object to serve as the *args argument of the Python function, look up the function in the module's table to get the function pointer, and so on.
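The kind of overhead described above can be felt without compiling any Fortran. The following is only a rough sketch: it uses ctypes as a stand-in for the F2PY-generated wrapper (that substitution is my assumption, not what F2PY does internally), and pyfunc here is a hypothetical placeholder for your real function. Each call through c_pyfunc converts a C double to a PyFloat, calls the Python function, and converts the result back, which is roughly the per-call work discussed above.

```python
import ctypes
import timeit


def pyfunc(x):
    """Hypothetical stand-in for the expensive Python callback."""
    return 2.0 * x + 1.0


# C function-pointer type: double (*)(double).
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

# Wrap the Python function so that it is reached through a C-level thunk.
# Every call now boxes the C double into a PyFloat and unboxes the result.
c_pyfunc = CALLBACK(pyfunc)


def direct_loop(k, n=1000):
    """Pure Python loop: k is already a PyFloat, nothing is converted."""
    p = 0.0
    for _ in range(n):
        p = pyfunc(k)
    return p


def marshalled_loop(k, n=1000):
    """Same loop, but each call crosses the C boundary and back."""
    p = 0.0
    for _ in range(n):
        p = c_pyfunc(k)
    return p


if __name__ == "__main__":
    t_direct = timeit.timeit(lambda: direct_loop(1.5), number=100)
    t_marshalled = timeit.timeit(lambda: marshalled_loop(1.5), number=100)
    print("direct:     %.4f s" % t_direct)
    print("marshalled: %.4f s" % t_marshalled)
```

The absolute numbers depend on the machine and will not match what F2PY gives you, but the marshalled loop should come out noticeably slower even though both loops compute the same value.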

Last but not least: you need to take care of the array order when passing 2D arrays between Fortran and NumPy. F2PY and NumPy can be smart in that regard, but you'll take performance hits if your Python code is not written to manipulate the arrays in Fortran order.
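To make the array-order point concrete, here is a small NumPy-only sketch (the array a is just illustrative data). It shows that a default NumPy array is in C order, that converting it to Fortran order forces a copy, and that an array which is already in Fortran order is passed through without one.

```python
import numpy as np

# NumPy arrays are C-ordered (row-major) by default.
a = np.arange(6, dtype=np.float64).reshape(2, 3)
print(a.flags['C_CONTIGUOUS'], a.flags['F_CONTIGUOUS'])

# Converting to Fortran (column-major) order needs a copy of the data.
f = np.asfortranarray(a)
print(f.flags['F_CONTIGUOUS'], np.shares_memory(a, f))

# Same values, different memory layout: compare the strides (in bytes).
print(a.strides, f.strides)

# An array that is already Fortran-ordered is not copied again.
g = np.asfortranarray(f)
print(np.shares_memory(f, g))
```

If the arrays are created in Fortran order from the start (for example with order='F' in np.empty or np.zeros), the conversion copy is avoided on every call.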

I don't know what pyfunc is meant to do, but if it is close to what you posted, writing the loop in Python and calling the function only once from Fortran will save you time. And if you need the intermediate values (p), let your Python function return a NumPy array with all the intermediate values.
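A minimal sketch of that suggestion, assuming pyfunc takes and returns a scalar as in the question. Both pyfunc (as written here) and the helper name pyfunc_all are hypothetical placeholders, and the Fortran side would need its callback signature adapted to receive an array, which is not shown.

```python
import numpy as np


def pyfunc(x):
    """Hypothetical stand-in for the real, expensive function."""
    return 2.0 * x + 1.0


def pyfunc_all(k, n=1000):
    """Run the loop on the Python side and return every intermediate value.

    Fortran then crosses the language boundary once instead of n times.
    """
    out = np.empty(n, dtype=np.float64)
    for i in range(n):
        out[i] = pyfunc(k)
    return out


values = pyfunc_all(1.5, n=5)
print(values.shape, values[-1])
```

The loop itself still runs at Python speed. What is saved is the per-call conversion and calling-convention overhead.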

Licensed under: CC-BY-SA with attribution
