Speeding up my Fermat Factorization function (Python)

https://stackoverflow.com/questions/12592916

03-07-2021
|

Frage

My task is to factor very large composite numbers using Fermat's factorization method. The numbers are 1024 bits large, which is around 309 decimal digits.

I have come up with the Python code below, which uses the gmpy2 module for accuracy. It is simply a Python implementation of the pseudo-code shown on the Wikipedia page. I read the "Sieve Improvement" section on that page, but wasn't sure how to implement it.

def fermat_factor(n):
    assert n % 2 != 0  # Odd integers only

    a = gmpy2.ceil(gmpy2.sqrt(n))
    b2 = gmpy2.square(a) - n
    while not is_square(b2):
        a += 1
        b2 = gmpy2.square(a) - n

    factor1 = a + gmpy2.sqrt(b2)
    factor2 = a - gmpy2.sqrt(b2)
    return int(factor1), int(factor2) 

def is_square(n):
    root = gmpy2.sqrt(n)
    return root % 1 == 0  # '4.0' will pass, '4.1212' won't

This code runs fairly fast for small numbers, but takes much too long for numbers as large as those given in the problem. How can I improve the speed of this code? I'm not looking for people to write my code for me, but would appreciate some suggestions. Thank you for any responses.

Lösung

Consider rewriting this script to use only integers instead of arbitrary precision floats.

gmpy has support for integer square root (returns the floor of the square root, calculated efficiently). This can be used for the is_square() function by testing if the square of the square root equals the original.

I'm not sure about gmpy2, but in gmpy.sqrt() requires an integer argument, and returns an integer output. If you are using floats, then that is probably your problem (since floats are very slow as compared to integers, especially when using extended precision). If you are in fact using integers, then is_square() must be doing a tedious conversion from integer to float every time it is called (and gmpy2.sqrt() != gmpy.sqrt()).

For those of you who keep saying that this is a difficult problem, keep in mind that using this method was a hint: The fermat factorization algorithm is based on a weakness present when the composite number to be factored has two prime factors which are close to each other. If this was given as a hint, it is likely that the entity posing the problem knows this to be the case.

Edit: Apparently, gmpy2.isqrt() is the same as gmpy.sqrt() (the integer version of sqrt), and gmpy2.sqrt() is the floating-point version.

Andere Tipps

You need to avoid doing so many square and sqrt operations, especially on large numbers.

The easy way to avoid them is to note that a^2 - N = b^2 must be true for all moduli to be a solution. For example,

a^2 mod 9 - N mod 9 = b^2 mod 9

Let's say your N is 55, so N mod 9 = 1.

Now consider the set of (a mod 9), and square it, modulo 9. The resulting a^2 mod 9 is the set: {0, 1, 4, 7}. The same must be true for the b^2 mod 9.

If a^2 mod 9 = 0, then 0 - 1 = 8 (all mod 9) is not a solution, since 8 is not a square of a number modulo 9. This eliminates (a mod 9) = {0, 3 and 6}.

If a^2 mod 9 = 1, the 1 - 1 = 0 (all mod 9), so (a mod 9) = {1, 8} are possible solutions.

If a^2 mod 9 = 4, then 4 - 1 = 3 (all mod 9) is not a possible solution. Ditto for a^2 mod 9 = 7.

So, that one modulus eliminated 7 out of 9 possible values of 'a mod 9'.

And you can have many moduli, each one eliminating at least half of the possibilities. With a set of, say, 10 moduli, you only have to check about 1 in 1,000 a's for being perfect squares, or having integer square roots. (I use about 10,000 moduli for my work).

Note: Moduli which are powers of a prime are often more useful than a prime. Also, a modulus of 16 is a useful special case, because 'a' must be odd when N mod 4 is 1, and 'a' must be even when N mod 4 is 3. "Proof is left as an exercise for the student."

Lizenziert unter: CC-BY-SA mit Zuschreibung

Nicht verbunden mit StackOverflow