Floats vs rationals in arbitrary precision fractional arithmetic (C/C++)

https://stackoverflow.com/questions/11798640

24-06-2021
|

Question

Since there are two ways of implementing an AP fractional number, one is to emulate the storage and behavior of the double data type, only with more bytes, and the other is to use an existing integer APA implementation for representing a fractional number as a rational i.e. as a pair of integers, numerator and denominator, which of the two ways are more likely to deliver efficient arithmetic in terms of performance? (Memory usage is really of minor concern.)

I'm aware of the existing C/C++ libraries, some of which offer fractional APA with "floats" and other with rationals (none of them features fixed-point APA, however) and of course I could benchmark a library that relies on "float" implementation against one that makes use of rational implementation, but the results would largely depend on implementation details of those particular libraries I would have to choose randomly from the nearly ten available ones. So it's more theoretical pros and cons of the two approaches that I'm interested in (or three if take into consideration fixed-point APA).

Solution

The question is what you mean by arbitrary precision that you mention in the title. Does it mean "arbitrary, but pre-determined at compile-time and fixed at run-time"? Or does it mean "infinite, i.e. extendable at run-time to represent any rational number"?

In the former case (precision customizable at compile-time, but fixed afterwards) I'd say that one of the most efficient solutions would actually be fixed-point arithmetic (i.e. none of the two you mentioned).

Firstly, fixed-point arithmetic does not require any dedicated library for basic arithmetic operations. It is just a concept overlaid over integer arithmetic. This means that if you really need a lot of digits after the dot, you can take any big-integer library, multiply all your data, say, by 2^64 and you basically immediately get fixed-point arithmetic with 64 binary digits after the dot (at least as long as arithmetic operations are concerned, with some extra adjustments for multiplication and division). This is typically significantly more efficient than floating-point or rational representations.

Note also that in many practical applications multiplication operations are often accompanied by division operations (as in x = y * a / b) that "compensate" for each other, meaning that often it is unnecessary to perform any adjustments for such multiplications and divisions. This also contributes to efficiency of fixed-point arithmetic.

Secondly, fixed-point arithmetic provides uniform precision across the entire range. This is not true for either floating-point or rational representations, which in some applications could be a significant drawback for the latter two approaches (or a benefit, depending on what you need).

So, again, why are you considering floating-point and rational representations only. Is there something that prevents you from considering fixed-point representation?

OTHER TIPS

Since no one else seemed to mention this, rationals and floats represent different sets of numbers. The value 1/3 can be represented precisely with a rational, but not a float. Even an arbitrary precision float would take infinitely many mantissa bits to represent a repeating decimal like 1/3. This is because a float is effectively like a rational but where the denominator is constrained to be a power of 2. An arbitrary precision rational can represent everything that an arbitrary precision float can and more, because the denominator can be any integer instead of just powers of 2. (That is, unless I've horribly misunderstood how arbitrary precision floats are implemented.)

This is in response to your prompt for theoretical pros and cons.

I know you didn't ask about memory usage, but here's a theoretical comparison in case anyone else is interested. Rationals, as mentioned above, specialize in numbers that can be represented simply in fractional notation, like 1/3 or 492113/203233, and floats specialize in numbers that are simple to represent in scientific notation with powers of 2, like 5*2^45 or 91537*2^203233. The amount of ascii typing needed to represent the numbers in their respective human-readable form is proportional to their memory usage.

Please correct me in the comments if I've gotten any of this wrong.

Either way, you'll need multiplication of arbitrary size integers. This will be the dominant factor in your performance since its complexity is worse than O(n*log(n)). Things like aligning operands, and adding or subtracting large integers is O(n), so we'll neglect those.

For simple addition and subtraction, you need no multiplications for floats^* and 3 multiplications for rationals. Floats win hands down.

For multiplication, you need one multiplication for floats and 2 multiplications for rational numbers. Floats have the edge.

Division is a little bit more complex, and rationals might win out here, but it's by no means a certainty. I'd say it's a draw.

So overall, IMHO, the fact that addition is at least O(n*log(n)) for rationals and O(n) for floats clearly gives the win to a floating-point representation.

^*It is possible that you might need one multiplication to perform addition if your exponent base and your digit base are different. Otherwise, if you use a power of 2 as your base, then aligning the operands takes a bit shift. If you don't use a power of two, then you may also have to do a multiplication by a single digit, which is also an O(n) operation.

You are effectively asking the question: "I need to participate in a race with my chosen animal. Should I choose a turtle or a snail ?".

The first proposal "emulating double" sounds like staggered precision: using an array of doubles of which the sum is the defined number. There is a paper from Douglas M. Priest "Algorithms for Arbitrary Precision Floating Point Arithmetic" which describes how to implement this arithmetic. I implemented this and my experience is very bad: The necessary overhead to make this run drops the performance 100-1000 times ! The other method of using fractionals has severe disadvantages, too: You need to implement gcd and kgv and unfortunately every prime in your numerator or denominator has a good chance to blow up your numbers and kill your performance.

So from my experience they are the worst choices one can made for performance.

I recommend the use of the MPFR library which is one of the fastest AP packages in C and C++.

Rational numbers don't give arbitrary precision, but rather the exact answer. They are, however, more expensive in terms of storage and certain operations with them become costly and some operations are not allowed at all, e.g. taking square roots, since they do not necessarily yield a rational answer.

Personally, I think in your case AP floats would be more appropriate.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow