What do you get when you cross a multi-precision integer with a floating-point number?

https://softwareengineering.stackexchange.com/questions/280335

08-10-2020

Question

I'm working on adding multi-precision integers to the suite of numeric types in my APL interpreter, but I'm not sure what to do about the odd type-combinations that now arise. I now have the following types:

IMM "atomic" small integer
FIX full-width integer
FLO floating-point double
MPI multi-precision integer

I have a variable which controls which larger type to use for integer overflow, either FLO or MPI. Mixed MPI/IMM/FIX operations, since they're all integers, can simply promote to the widest type and produce a result of that type. And mixed FLO/IMM/FIX operations can follow the same pattern since a double can comfortably accommodate all the values of a 32-bit integer. This covers most of the cases. But it leaves me with these type combinations which don't have an obvious (to me) rule to follow.

FLO {+-*/} MPI
MPI {+-*/} FLO

Having written this out, I suppose there really is an obvious solution (multi-precision floating-point). But I don't want to do that right now. Is there a sensible shortcut I can take (for now)?

As a "worst case" scenario that at least delivers some kind of result, I can implement conversions between these two types. But there's potential loss of data each way.

FLO -> MPI loses fractional part of floating-point number
MPI -> FLO loses precision from integer
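
To make the two kinds of loss concrete, here is a small illustration. It is only a sketch: Python's built-in int stands in for MPI and its float (an IEEE 754 double) stands in for FLO, which is an assumption for demonstration and not how the interpreter in the question is written.

```python
# Python's int plays the role of MPI, float (IEEE 754 double) the role of FLO.
flo = 2.75
mpi = 2**53 + 1            # smallest positive integer a double cannot hold exactly

flo_as_mpi = int(flo)      # FLO -> MPI: truncates, the fractional part is lost
mpi_as_flo = float(mpi)    # MPI -> FLO: rounds to the nearest double

print(flo_as_mpi)                  # 2
print(int(mpi_as_flo) == mpi)      # False: the lowest bit was rounded away
```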

Solution

Another option is to add a rational number type. A rational is stored as two integers, which will usually need to be multi-precision, and it can represent any finite floating-point value exactly, as well as the exact result of +, -, * or / between a floating-point number and a multi-precision integer.

This way there is no loss of information. But it involves a lot of work, and it should only be done if the user wants it.
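
As a rough sketch of the idea, Python's standard fractions.Fraction can play the role of such a rational type. The helper name exact_op is hypothetical, and the sketch assumes the float operand is finite (Fraction rejects infinities and NaN).

```python
import operator
from fractions import Fraction

def exact_op(op, a, b):
    """Apply op to ints and/or finite floats exactly, by going through rationals."""
    # Fraction(float) converts the double's exact binary value, with no rounding.
    return op(Fraction(a), Fraction(b))

big = 10**30
r = exact_op(operator.add, big, 0.5)

print(r - big == Fraction(1, 2))      # True: the half survives
print(float(big) + 0.5 == float(big)) # True: in double arithmetic it is lost
```

The cost is that numerators and denominators can grow with every operation, which is part of why this is probably best left as an opt-in.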

OTHER TIPS

As a start, if we begin to consider the actual quantities involved then some subcases might be handled easily. It might also depend upon the operation we want to do.

MPI {+-} FLO
FLO {+-} MPI

If the FLO has a zero fractional part, then it can be converted to MPI losslessly. If it does have a fraction, then we probably want to keep the fraction and we ought to convert the MPI to FLO and cope with the loss of precision somehow.
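
The rule above might be sketched like this, again using Python's int for MPI and float for FLO. The function name add_mixed is hypothetical; subtraction would follow the same pattern.

```python
def add_mixed(mpi: int, flo: float):
    """MPI + FLO: stay exact when the FLO is a whole number, else fall back to FLO."""
    if flo.is_integer():
        # Zero fractional part: the conversion to an integer is lossless.
        return mpi + int(flo)
    # Non-zero fraction: keep it, accepting rounding of the integer operand.
    # float(mpi) raises OverflowError if mpi is beyond the range of a double.
    return float(mpi) + flo

print(add_mixed(10**30, 2.0) == 10**30 + 2)  # True, result is still an integer
print(add_mixed(3, 0.5))                     # 3.5
```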

MPI * FLO
FLO * MPI

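Here we might want to scale the MPI, so a little algebra may get us a result. Split the FLO into its integer part int and its fractional part frac:

FLO = int + frac
R = MPI * FLO
  = MPI * (int + frac)
  = (MPI * int) + (MPI * frac)

The first term is an exact integer product; only the second term needs floating-point arithmetic.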
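
One way to read this, sketched in Python: split the FLO into an integer part and a fractional part, multiply the integer part exactly, and round only the contribution of the fractional part. The name mul_mixed and the choice to return an integer (MPI) result are assumptions for illustration, and the FLO is assumed to be finite.

```python
import math

def mul_mixed(mpi: int, flo: float) -> int:
    """MPI * FLO as an integer: exact for the whole part, rounded for the fraction."""
    frac, whole = math.modf(flo)      # flo == whole + frac, both carry flo's sign
    exact_part = mpi * int(whole)     # pure integer arithmetic, no loss
    # Only this term goes through floating point, so only it can be rounded.
    scaled_part = round(float(mpi) * frac)
    return exact_part + scaled_part

print(mul_mixed(10**20, 2.5) == 25 * 10**19)  # True
```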

All of this seems to lead to a user-configurable parameter that selects which type to yield.

Alternatively, if the MPI is exactly representable as a double, it can be converted without fear, although the result of the operation may still lose precision if the values are near the limits of the representation.
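
A simple round-trip test can decide whether a given integer survives conversion to a double. This is a sketch in Python with a hypothetical helper name, not a claim about how any particular interpreter does it.

```python
def fits_double_exactly(mpi: int) -> bool:
    """True if converting mpi to a double and back gives the same integer."""
    try:
        return int(float(mpi)) == mpi
    except OverflowError:
        # Magnitude is beyond the largest finite double.
        return False

print(fits_double_exactly(2**53))      # True
print(fits_double_exactly(2**53 + 1))  # False
print(fits_double_exactly(2**100))     # True: a power of two needs only one significant bit
```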

Licensed under: CC-BY-SA with attribution
