You can use the identity you quoted, you just need another similar identity: (a+b) mod m = (a mod m) + (b mod m)
.
The goal is to multiply x*y mod m
without any intermediate values exceeding the overflow limit (in this case 2^64), where x
is starting less than m
(if it isn't, reduce it mod m
), y
is possibly larger than m
, and x*y
can potentially overflow. We can do this if m
is less than half of the overflow limit.
A solution is simple: Just perform basic multiplication bit-by-bit for x*y
and do every step modulo m
.
Start with x
and y
less than m
(if either isn't, reduce it first). Write y
in the form a_0 * 2^0 + a_1 * 2^1 + a_2 * 2^2
+ ... , where a_n
is either 0 or 1 (indicating the term is present or not). (Aka, write y
in binary form.) Now we have:
x * (a_0 * 2^0 + a_1 * 2^1 + a_2 * 2^2 + ...) mod m
Distribute x
over each of the terms of y
:
(x * a_0 * 2^0) + (x * a_1 * 2^1) + (x * a_2 * 2^2) + ... mod m
Then use the original multiplication identity: For each term above, multiply x
by 2 mod m
until you reach the desired power of 2 for that term. (Since x < m
and 2 * m < 2^64
, then 2 * x < 2^64
, so we can multiply by 2 without overflowing.) When you are done, add the result for each term mod m
(you can keep a running sum as you go).
None of those operations will exceed 2^64 and thus will not overflow. This will work for any value of m
less than 2^64 / 2 = 2^63 and any integers x
and y
less than m
.
This is not necessarily the fastest way to do it, feel free to find something more efficient. For starters, the smaller m
is compared to the overflow limit, the bigger the radix for the terms we can rewrite y
as.