Is there any way to use somehow use pypy just to compile one function and not for the rest of my python program?

I have a known bottleneck where I spend 99% of my CPU time (containing mostly integer shifts and XORs) and have optimized it as much as I can in Python. I don't want to have to write and maintain C libraries unless absolutely positively necessary.

Right now I'm using Anaconda Python which is the normal python with a bunch of libraries. I would use pypy except that I don't want to have to make sure that all the rest of my program works ok w/ pypy.

Is there a way to explicitly run a JIT only on one Python function?


edit: the function is a modular multiplication step in GF2 (Galois field)

https://bitbucket.org/jason_s/libgf2/src/a71a14a035468e703a7c24349be814419cdba2cb/src/libgf2/gf2.py?at=default

specifically:

def _gf2mulmod(x,y,m):
    z = 0
    while x > 0:
        if (x & 1) != 0:
            z ^= y
        y <<= 1
        y2 = y ^ m
        if y2 < y:
            y = y2
        x >>= 1
    return z

It needs to work for bigints so I'm not sure how to rewrite to be Cython-compatible.

I just tried numba's @autojit and it failed because it didn't know what variable types to use and assumed small integers. I can't seem to figure out how to tell it to use standard Python bigints.

Traceback (most recent call last):
  File "/Users/jason_s/Documents/python/libgf2/src/libgf2/gf2.py", line 440, in <module>
    dlog43 = GF2DiscreteLog(0x100000000065)
  File "/Users/jason_s/Documents/python/libgf2/src/libgf2/gf2.py", line 295, in __init__
    factors = _calculateFactors(poly)
  File "/Users/jason_s/Documents/python/libgf2/src/libgf2/gf2.py", line 264, in _calculateFactors
    if (e1 << m).value != 1:
  File "/Users/jason_s/Documents/python/libgf2/src/libgf2/gf2.py", line 379, in __lshift__
    return self._wrapraw(_gf2lshiftmod(self.value,k,self.poly))
  File "/Users/jason_s/Documents/python/libgf2/src/libgf2/gf2.py", line 195, in _gf2lshiftmod
    return _gf2mulmod(x,_gf2powmod(2,k,m),m)
  File "/Users/jason_s/Documents/python/libgf2/src/libgf2/gf2.py", line 189, in _gf2powmod
    z = _gf2mulmod(z,x,m)
  File "numbawrapper.pyx", line 193, in numba.numbawrapper._NumbaSpecializingWrapper.__call__ (numba/numbawrapper.c:3764)
OverflowError: value too large to convert to signed int
有帮助吗?

解决方案

What about using Cython instead? You could convert just this one function to cython syntax, which is than compiled directly to C or so. The syntax should be close enough to python itself, probably just add a few declarations of the right types.

其他提示

No, you cannot run part of a Python program in PyPy and other parts in another Python- It's more than just a JIT, it has a completely different representation of objects and many other internals.

If your sole concern is not wanting to make sure the rest of the program works with PyPy, rest assured: Virtually all pure Python code works with PyPy, with the sole exception of CPython implementation details. Those are obscure, it's quite hard to accidentially write code that relies on most of them, and the others (e.g. files not being closed automatically as promptly) won't break most programs. Just try running your whole program with PyPy.

If there are other issues with PyPy, you might want to translate only this function into C and call it using ctypes or cffi. The annoying part would be hooking it up with Python (e.g. via an extension module), which is what ctypes and cffi do for you. You don't need a complete arbitary precision integer library, you only need an array of bits with a few very simple operations: testing the least significant bit, left/right shifting, less-than, and bitwise XOR. Each of those is only one trivial loop. And if the naive C implementation is still a bottleneck, you can probably vectorize all of these loops. You could probably also optimize the shifts to avoid copying anything at all.

Cython has already been mentioned a few times, but I'd like to point out another alternative for those who find this question in their search, who might not have the arbitrary-precision integer requirement: ShedSkin. It translates Python to C++ via type inference. In other words, unlike Cython, you don't add type declarations to Python; you just write Python. But you do have to make sure your variables don't change type within your program, otherwise the type cannot be inferred. Certain other of the more dynamic features of Python are also not supported, but this is usually not an issue for computation-intensive code.

As for the arbitrary-precision integer requirement: You will likely get much, much bigger speed improvements than the 2.5x you noted in your comment, for integers that fit in "native" (fixed-size) integers. The improvement is so drastic that if a good portion of your calculations can be done with native integers, it is probably worth having a (blazingly fast) native-integer version of your function just for that case, and only using the general version of your function for values that don't fit. (Not all problems can be broken into separate cases like this, but for those that can, it's worth at least checking out this approach.)

I actually created a python package that does exactly what you are trying to accomplish. It implements numpy arrays over Galois fields. I was able to optimize it by JIT compiling with numba. It also supports arbitrarily-large integers, like you mentioned. https://github.com/mhostetter/galois

In [1]: import numpy as np                                                                     

In [2]: import galois                                                                          

In [3]: GF = galois.GF(2**100)                                                                 

In [4]: print(GF.properties)                                                                   
GF(2^100):
  characteristic: 2
  degree: 100
  order: 1267650600228229401496703205376
  irreducible_poly: Poly(x^100 + x^57 + x^56 + x^55 + x^52 + x^48 + x^47 + x^46 + x^45 + x^44 + x^43 + x^41 + x^37 + x^36 + x^35 + x^34 + x^31 + x^30 + x^27 + x^25 + x^24 + x^22 + x^20 + x^19 + x^16 + x^15 + x^11 + x^9 + x^8 + x^6 + x^5 + x^3 + 1, GF(2))
  is_primitive_poly: True
  primitive_element: GF(2, order=2^100)

In [5]: A = GF.Random((2,2)); A                                                                
Out[5]: 
GF([[853109427014456778157610146134, 957579797461316189986596054496],
    [619061399056446662243502059294, 762244553049699515226100523552]],
   order=2^100)

In [6]: B = GF.Random((2,2)); B                                                                
Out[6]: 
GF([[511709585928908961018234228239, 206374347029181859532039074035],
    [795530021671674913470994904012, 918203712488921499667394325749]],
   order=2^100)

In [7]: A + B                                                                                  
Out[7]: 
GF([[1005796869832339943233379227481, 1152773228746217240881872950547],
    [1097497412292991510222532386002, 160953539027946640002225213141]],
   order=2^100)

In [8]: A * B                                                                                  
Out[8]: 
GF([[296314095771552265037299061152, 688536035673482273067277820628],
    [1177970297984569800118703939222, 537328370564266356643331706738]],
   order=2^100)

In [9]: A / A                                                                                  
Out[9]: 
GF([[1, 1],
    [1, 1]], order=2^100)

# Fermat's Little Theorem
In [10]: A ** (GF.order - 1)                                                                   
Out[10]: 
GF([[1, 1],
    [1, 1]], order=2^100)

In [11]: A @ B                                                                                 
Out[11]: 
GF([[1027776659503691614378629238339, 470187664463292378234435322369],
    [86178777179053209582733631256, 172677144553521647820627674227]],
   order=2^100)

In [12]: np.linalg.inv(A) @ A                                                                  
Out[12]: 
GF([[1, 0],
    [0, 1]], order=2^100)
许可以下: CC-BY-SA归因
不隶属于 StackOverflow
scroll top