Using GMP to calculate very large Integers from formulas

https://stackoverflow.com/questions/18333506

25-06-2022
|

Question

I was having a hard time to figure out how to deal with a problem I encountered. As a part of a complex formula, I need to calculate a part that quickly overflows double, i.e. results get up to ~ 1.59*10^(1331) (calculated with mathematica). Of course this is out of the range of double. Then I was thinking of using long double, which on my linux system with gcc 4.6.3 is 16byte.

1) double precision (8byte) has a possible range of up to 10^(308). Am I right in saying, that long double then increases the actual precision, but not the possible numeric range of values? I remember that I heard that it can be either or, depending on the system and the compiler. Is that true ? At least I still get NaN when I try to calculate my values with long double.

2.) I was then looking for a way to actually calculate these results and I found the GNU gmp. I heard that you can represent very large integers and I thought this might help. However, reading the documentation, it seems that with

 mpz_t x;
 mpz_init(x);
 mpz_set_*(x,#);

I can assign values to gmp integer data types, but in order to do that, I can "only" choose to assign values that can be represented by built-in data types like double or (u/s)int etc. All I found for how to assign really huge numbers was to use mpz_set_str() to assign the number from a string. How would I assign a number that is the result of a complex calculation ? Simply speaking, formulas look like this:

long double res1,res2=0.0;
int a,b;
a=780;
b=741;
float d,d1,o,s; // can be values in [0.01,100]

res1=(2*(pow(b,2)*pow(E,b*(o + s))*(pow(d1,2) + pow(E,a*s)*(-1 + pow(E,a*o)) + pow(d,2)*(-1 + pow(E,a*s))) + pow(a,2)*pow(E,a*(o + s))*(pow(E,b*s)*(pow(E,b*o) + (-1 + d)*(1 + d + b*o)) + (-d + d1)*(d + d1 + b*o + b*d*s)) - a*b*(pow(d1,2)*(pow(E,a*(o + s)) + pow(E,b*(o + s))) + pow(E,a*o + (a + b)*s)*(-2 + 2*pow(E,b*o) - b*o) + d1*pow(E,a*s)*(-pow(E,b*o) + pow(E,a*o)*(1 + b*o)) + pow(d,2)*pow(E,a*s)*(-pow(E,b*o) + pow(E,b*(o + s)) + pow(E,a*o)*(-1 + pow(E,b*s) - b*s)) +  d*(-(d1*pow(E,b*(o + s))) + (1 + d1)*pow(E,b*o + a*s) - pow(E,a*s + b*(o + s)) + pow(E,b*s + a*(o + s))*(1 + b*o) + pow(E,a*(o + s))*(-1 - b*o + b*d1*s)))))......;

res2 will also be of this kind, and in the end I need to calculate res1/res2, which usually becomes a very small number.

I was thinking of splitting the formulas and adding terms to mpg_z in order to not get out of double range for each term, but as the formulas are so long and complex, that this is almost impossible.

So in summary, the problem is, that my intermediate results can get so huge that no data type is capable of storing them, so I cannot assign it to mpz and get rid of this problem.

I am aware that I want to calculate a double value and actually use mpz_t for integers. As far as I understand this is the only way to store such big data, since mpf_t can only handle float type. To be honest, there is still confusion on my side about the representation in gmp.

Any ideas how to approach this ?

Solution

Question 1 Long double allows to handle larger numbers than double (both in the exponent and in the significand precision). But you have to think that your 1e308 order of magnitude doesn't mean anything if your goal is to store large integers; you only have to care about the size of the significand precisions, either 52/53 bits (double) or 64 bits (x86 extended precision). If you try to use it with larger integers, you will have a correct order of magnitude but the exact value will be lost (when computing with integer numbers, people generally care a little more about this than when playing with approximate numbers).

Question 2 Using GMP is a nice choice. Other libraries also exist; for smaller values I use a lot the libqd which has an extended fixed precision and are very quick but this will not be enough for your own problem. Now your question is about setting the values:

using a string version of the number is generally a bad idea (you should keep that only for input/output purposes); it is a slow operation involving basis conversions and digit by digit treatment
doing as much as you can with the GMP types is the way to go (unless you really care about speed and have a full control over the expected range for the values to be computed with native types
if your formula is too long, I can't help much. But using GMP isn't that difficult, can't you really convert your formula? is there some logic in your formula that you could embed in a loop? maybe you could write a quick and dirty python script for converting your formula to a piece of C code using GMP?

Now I don't fully understand why you want to use mpz_t rather than mpf_t. This type implements arbitrary long floating numbers; did you notice you can set the precision with mpf_set_default_prec?

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow