Question

The basic arithmetic a computer chip can do only works on numbers (integers or floating point) of a fixed size.

There are many algorithms that could be extended to work on numbers of arbitrary size (and some require them). Therefore, a way to store and perform arithmetic with arbitrary-size numbers is good to have.

For arbitrary-size integers, the usual approach is to take a list of bytes (each of which can store a value in the 0-255 range) and treat the resulting 'big number' (often shortened to bignum) as base-256, using each of these bytes as a 'digit' (ducentiquinquagintiquinquesexa-its?).
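As an illustration only (a minimal sketch in Python; real bignum libraries are of course far more sophisticated), such a base-256 representation and its digit-wise addition could look like this:

```python
def to_base256(n):
    """Split a non-negative integer into little-endian base-256 'digits'."""
    digits = [] if n else [0]
    while n:
        digits.append(n % 256)   # least significant byte first
        n //= 256
    return digits

def from_base256(digits):
    """Reassemble the integer from its base-256 digits."""
    return sum(d * 256 ** i for i, d in enumerate(digits))

def add_base256(a, b):
    """Schoolbook digit-wise addition with carry."""
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s % 256)
        carry = s // 256
    if carry:
        result.append(carry)
    return result

x, y = 2 ** 70 + 12345, 2 ** 65 + 67890
assert from_base256(add_base256(to_base256(x), to_base256(y))) == x + y
```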

For non-integer real numbers of arbitrary size, there are two common ways of representation:

  • decimals, which consist of an arbitrary-size 'big number' mantissa and exponent. The number is represented by sign (+1 or -1) * mantissa * pow(base, exponent) (where the base is usually 2, or sometimes 10)
  • rationals, which have an arbitrary-size 'big number' numerator and denominator. The number is represented by numerator / denominator (both representations are sketched in code below)
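To make this concrete, here is a minimal sketch using Python's standard library (purely as an illustration; the question is not Python-specific): decimal.Decimal stores a sign, a digit tuple and an exponent, while fractions.Fraction stores a numerator and denominator:

```python
from decimal import Decimal
from fractions import Fraction

d = Decimal("-1.25")
print(d.as_tuple())   # DecimalTuple(sign=1, digits=(1, 2, 5), exponent=-2)
                      # i.e. -1 * 125 * 10**-2

r = Fraction(-5, 4)
print(r.numerator, r.denominator)   # -5 4, i.e. -5 / 4
```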

In practice, I've found many more languages and libraries that use/support decimal data types than rational data types. An example of this is SQL data stores, which have a native DECIMAL data type but no rational one.

Why is this the case? Why are decimal data types preferred over rational ones?


Solution

A quibble

Arbitrary-precision decimals are actually fairly rare. For example, although you mention SQL in the question, the SQL standard doesn't require implementations to support arbitrary precision, and MS SQL Server only supports up to 38 digits of precision. Other libraries could more accurately be described as supporting variable precision rather than arbitrary precision: Java's BigDecimal, for instance, operates within a MathContext which defines the precision of the results of an operation. A true arbitrary-precision implementation wouldn't require you to commit up front to the precision you will need in the future. (Yes, that means it must necessarily be lazy.)
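The same "variable, but committed up front" behaviour can be seen in Python's decimal module (used here only as an analogue of BigDecimal's MathContext): results are rounded to whatever precision the context specifies at the time of the operation, not kept exact.

```python
from decimal import Decimal, getcontext

getcontext().prec = 10          # commit to 10 significant digits up front
print(Decimal(1) / Decimal(3))  # 0.3333333333, rounded to the context precision

getcontext().prec = 40          # raising the precision later doesn't recover lost digits
third = Decimal("0.3333333333")
print(third * 3)                # 0.9999999999, not 1
```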

Relevance

Why am I making this distinction? Because when you're working with fixed precision (whether limited, as in SQL Server, or variable, as in Java), a series of operations uses a predictable amount of memory. When working with rationals, on the other hand, the memory usage can grow without bound.

This is an important difference which I believe goes a long way to explaining why historically language and library designers have favoured decimals over rationals. Particularly when you look at e.g. SQL, which is a product of an era when memory was much more restricted than today, it makes sense to choose a representation which allows you to bound your memory usage in advance.
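As a rough illustration of that difference (a sketch using Python's built-in Fraction and Decimal types, not any particular database engine), repeating a simple arithmetic update makes the rational representation grow while the fixed-precision decimal stays the same size:

```python
from decimal import Decimal, getcontext
from fractions import Fraction

getcontext().prec = 28          # fixed working precision for the decimal version

f = Fraction(1, 3)
d = Decimal(1) / Decimal(3)
for i in range(1, 6):
    f = f * f + Fraction(1, i)  # some arbitrary arithmetic, applied repeatedly
    d = d * d + Decimal(1) / Decimal(i)
    print(i,
          "rational digits:", len(str(f.numerator)) + len(str(f.denominator)),
          "decimal digits:", len(d.as_tuple().digits))
# The rational digit count roughly doubles each iteration; the decimal stays at <= 28.
```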

Other plausible factors

I don't say that memory bounds are the only factor that influenced design decisions. There are two other plausible factors which come to mind readily:

  • The familiarity of the first generations of computer scientists with decimal tables of logarithms etc.
  • Hardware support. Back in the day, some instruction sets included instructions for operating on binary-coded decimal.

OTHER TIPS

One reason might be simply that programmers are more used to decimal number representations, or that their arbitrary-precision library does not support rationals.

Another reason might be an additional performance penalty:

One central operation you certainly have to do on rational numbers is to reduce the numerator and denominator. If you keep computing (addition, multiplication, etc.) on rational numbers, the numerator and denominator usually grow fast (e.g. when adding two fractions whose denominators have no common factors).

Reducing a rational number requires you to determine the greatest common divisor (gcd) of its numerator and denominator, which is an expensive operation compared to a simple addition.

In general, a/b + c/d can be computed as (a*d + c*b) / (b*d); that's three integer multiplications and one addition. However, this makes both numerator and denominator large. For instance, 1/3 + 1/6 = 9/18 instead of 1/2. (Of course, this can be made a bit more efficient, but the worst-case complexity is the same.)

Thus, you need to compute the gcd. This is an additional computation step. (Although, from a theoretical point of view, the worst case complexity is probably the same as for decimal operations.)
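A minimal sketch of that extra step in Python (math.gcd standing in for whatever gcd routine a bignum library would use):

```python
from math import gcd

def add_unreduced(a, b, c, d):
    """a/b + c/d without reduction: three multiplications, one addition."""
    return a * d + c * b, b * d

def add_reduced(a, b, c, d):
    """The same addition, followed by the gcd-based reduction step."""
    num, den = add_unreduced(a, b, c, d)
    g = gcd(num, den)
    return num // g, den // g

print(add_unreduced(1, 3, 1, 6))  # (9, 18)
print(add_reduced(1, 3, 1, 6))    # (1, 2)
```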

Edit

Some more explanation of why I suppose that rational arithmetic might be slower in practice than decimal arithmetic. (Though this should be tested, e.g. using GMP.)

Basic operations (addition, multiplication) on arbitrary-precision integers and decimals have a cost that grows with the number of digits: addition of two numbers of magnitude n is linear in the number of bits, i.e. O(log n) (multiplication is more expensive, though still near-linear with fast algorithms). Thus, to add two decimals of magnitude n, you require O(log n) time.

Rational numbers are represented by two arbitrary-precision integers. A first observation is that the numerator and denominator can each have about as many (decimal) digits as the decimal counterpart of the number. (This is not true for all numbers, e.g. 1/3, and binary representation is a different story again, but it holds in many cases.)

Consider, e.g., 0.000000000000000994030870125414 (30 digits). As a rational, that is 5040335316639481 / 5070602400912917605986812821504 (16 + 31 digits). Thus, for a simple add, both representations require operations on numbers of roughly 30 digits; but the decimal add is really only a digit-wise add, whereas the rational add is three multiplications and one addition, not counting the gcd.
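Those figures can be reproduced in Python (assuming, as the digits suggest, that the rational above is the exact value of the nearest binary double, i.e. a 53-bit mantissa over 2**102):

```python
from fractions import Fraction

r = Fraction(0.000000000000000994030870125414)  # exact value of the nearest double
print(r.numerator, "/", r.denominator)
print(len(str(r.numerator)), "+", len(str(r.denominator)), "digits")
# Expected on IEEE-754 doubles:
# 5040335316639481 / 5070602400912917605986812821504
# 16 + 31 digits
```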

Note: I am not advocating against rational numbers. Which representation is most useful really depends on the application.

How do you represent an irrational number like e as a rational number?

You can get arbitrarily close, but that requires extremely large numerators and denominators.

To avoid overflowing the memory for just one number, you'll have to limit how much precision, i.e. how many places after the decimal point, you'd like.

That is exactly what you do by using decimals.
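For example (a sketch in Python, approximating e by partial sums of the series 1/0! + 1/1! + 1/2! + ...), the rational approximation's denominator keeps growing with the precision you ask for, while a decimal simply rounds to a chosen, fixed precision:

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Rational approximation of e: partial sums of sum(1/k!)
approx, term = Fraction(0), Fraction(1)
for k in range(1, 31):
    approx += term
    term /= k
print(len(str(approx.denominator)), "digits in the denominator")  # grows with the number of terms

# Decimal approximation of e at a fixed precision chosen in advance
getcontext().prec = 30
print(Decimal(1).exp())  # e to 30 significant digits
```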

For most applications, decimals are efficient and good enough; and specialized alternatives do exist if needed.

Rational numbers are not very useful in real life. As soon as you leave the + - * / range, you generally end up with irrational numbers, which, by definition, cannot be written as 'a/b'.

Integers give you exact math for + - *; rationals additionally give you exact math only for /; everything else (sqrt, log, sin, etc.) needs to support irrational numbers.
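A small Python illustration of that boundary: a Fraction stays exact through +, -, * and /, but as soon as you take a square root you fall back to an approximation.

```python
from fractions import Fraction
import math

x = (Fraction(1, 3) + Fraction(1, 6)) * Fraction(4, 7) / Fraction(2, 5)
print(x)                    # 5/7, still exact

s = math.sqrt(Fraction(2))  # forced into a floating-point approximation
print(s, s * s == 2)        # 1.4142135623730951 False
```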

If you found a way to store irrational numbers exactly (including transcendental ones), that would probably be very useful. But that is not possible in general.

Licensed under: CC-BY-SA with attribution