Question

I have been testing the waters of competitive programming and I have already seen this statement mentioned a lot of times:

Print the result modulo 10^9 + 7

Now I can figure out that this is some way of preventing overflow of digits when dealing with very large numbers. But how and why does it work? I would be grateful if someone could explain the mathematical reasoning behind this.


Solution

Many contest questions ask you to compute some very, very large number (say, the number of permutations of a 150-element sequence containing some large number of duplicates). Many programming languages don't natively support arbitrary-precision arithmetic, so in the interest of fairness it makes sense for those contests not to ask you for the exact value. The challenge, then, is the following: how can the contest site know when you have the right answer given that you can't exactly compute it?

One initially appealing option would be to just ask for the answer modulo some large power of two (say, 2^32 or 2^64) so that competitors working in languages like C or C++ could just use uint32_t or uint64_t to do all the computations, letting overflows occur normally, and then submit the results. However, this isn't particularly desirable. Suppose, for example, that the question is the following:

Compute 10,000!

This number is staggeringly huge and is way too big to fit into a 32-bit or 64-bit unsigned integer. However, if you just want to get the answer modulo 2^32 or 2^64, you could just use this program:

#include <stdio.h>
int main() {
    puts("0");
}

The reason for this is that 10,000! is the product of at least 5,000 even numbers, so one of its factors is 2^5,000. Therefore, if you just want the answer modulo 2^32 or 2^64, you don't actually have to compute it at all. You can just say that the result is 0 mod 2^32 or 2^64.
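To see the collapse to zero concretely, here is a minimal sketch that multiplies the factors using 32-bit unsigned wraparound. The function name factorial_mod_2_32 is just an illustrative choice, and the sketch assumes the usual platform where unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Computes n! using 32-bit unsigned arithmetic, i.e. n! mod 2^32.
   (Hypothetical helper name, for illustration only.) */
uint32_t factorial_mod_2_32(uint32_t n) {
    uint32_t result = 1;
    for (uint32_t i = 2; i <= n; i++) {
        result *= i;  /* unsigned overflow wraps around modulo 2^32 */
    }
    return result;
}
```

Calling factorial_mod_2_32(10000) returns 0. In fact the result is already 0 from n = 34 onward, because 34! contains 32 factors of two.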

The problem here is that working modulo 2^32 or 2^64 is troublesome if the resulting answer is cleanly divisible by either of those numbers. However, if we work modulo a large prime number, then this trick wouldn't work. As an example, the number 7,897,987 is prime. If you try to compute 10,000! mod 7,897,987, then you can't just say "the answer is 0": a prime divides a product only if it divides one of the factors, and none of the numbers multiplied together in 10,000! is a multiple of 7,897,987, since they are all smaller than it. You'd actually have to do some work to figure out what this number is modulo that large prime. More generally, working modulo a large prime usually requires you to compute the actual answer modulo that large prime, rather than using number-theoretic tricks to skip all the work entirely.
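The "actual work" can be as simple as reducing after every multiplication so the running product never grows large. A rough sketch, where factorial_mod is a hypothetical helper name and the modulus is assumed to be below 2^32 so that the intermediate product fits in 64 bits:

```c
#include <stdint.h>

/* Computes n! mod m, reducing after every multiplication.
   Assumes m < 2^32, so result * i always fits in a uint64_t.
   (Hypothetical helper name, for illustration only.) */
uint64_t factorial_mod(uint32_t n, uint64_t m) {
    uint64_t result = 1 % m;  /* handles m == 1 */
    for (uint32_t i = 2; i <= n; i++) {
        result = (result * i) % m;
    }
    return result;
}
```

For the example above you would call factorial_mod(10000, 7897987). Unlike the power-of-two case, there is no shortcut that lets you predict the value without running the loop.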

So why work modulo 1,000,000,007? This number happens to be prime (so it's good to use as a modulus) and it's less than 2^31 - 1, the largest possible value you can fit in a signed 32-bit integer. The signedness is nice here because in some languages (like Java) there are no unsigned integer types and the default integer type is a 32-bit signed integer. This means that you can store any value modulo 1,000,000,007, and even add two such values, without risking an integer overflow. (The product of two such values can be close to 10^18, so a multiplication still needs a 64-bit intermediate before you reduce it.)
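As a small illustration of how the sizes work out, here is a sketch of addition and multiplication modulo 10^9 + 7 using signed fixed-width types. The names MOD, add_mod and mul_mod are made up for this example, and both helpers assume their inputs are already reduced to the range 0 to MOD - 1.

```c
#include <stdint.h>

#define MOD 1000000007  /* 10^9 + 7 */

/* Assumes 0 <= a, b < MOD. (Hypothetical helper, for illustration only.) */
int32_t add_mod(int32_t a, int32_t b) {
    /* a + b is at most 2,000,000,012, which is below 2^31 - 1,
       so the sum fits in a signed 32-bit integer. */
    int32_t sum = a + b;
    return sum >= MOD ? sum - MOD : sum;
}

/* Assumes 0 <= a, b < MOD. (Hypothetical helper, for illustration only.) */
int32_t mul_mod(int32_t a, int32_t b) {
    /* The product can be close to 10^18, so widen to 64 bits
       before multiplying, then reduce back into 32-bit range. */
    return (int32_t)(((int64_t)a * b) % MOD);
}
```

For instance, mul_mod(1000000006, 1000000006) gives 1, since 1,000,000,006 is congruent to -1 modulo 10^9 + 7.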

To summarize:

  • Working modulo a large prime makes it likely that if your program produces the correct output, it actually did some calculation and did so correctly.
  • Working modulo 1,000,000,007 allows a large number of languages to use their built-in integer types to store and calculate the result.

Hope this helps!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow