How can I add and subtract 128 bit integers in C or C++ if my compiler does not support them?

https://stackoverflow.com/questions/741301

09-09-2019
|

Question

I'm writing a compressor for a long stream of 128 bit numbers. I would like to store the numbers as differences -- storing only the difference between the numbers rather than the numbers themselves because I can pack the differences in fewer bytes because they are smaller.

However, for compression then I need to subtract these 128 bit values, and for decompression I need to add these values. Maximum integer size for my compiler is 64 bits wide.

Anyone have any ideas for doing this efficiently?

Solution

If all you need is addition and subtraction, and you already have your 128-bit values in binary form, a library might be handy but isn't strictly necessary. This math is trivial to do yourself.

I don't know what your compiler uses for 64-bit types, so I'll use INT64 and UINT64 for signed and unsigned 64-bit integer quantities.

class Int128
{
public:
    ...
    Int128 operator+(const Int128 & rhs)
    {
        Int128 sum;
        sum.high = high + rhs.high;
        sum.low = low + rhs.low;
        // check for overflow of low 64 bits, add carry to high
        if (sum.low < low)
            ++sum.high;
        return sum;
    }
    Int128 operator-(const Int128 & rhs)
    {
        Int128 difference;
        difference.high = high - rhs.high;
        difference.low = low - rhs.low;
        // check for underflow of low 64 bits, subtract carry to high
        if (difference.low > low)
            --difference.high;
        return difference;
    }

private:
    INT64  high;
    UINT64 low;
};

OTHER TIPS

Take a look at GMP.

#include <stdio.h>
#include <gmp.h>

int main (int argc, char** argv) {
    mpz_t x, y, z;
    char *xs, *ys, *zs;
    int i;
    int base[4] = {2, 8, 10, 16};

    /* setting the value of x in base 10 */
    mpz_init_set_str(x, "100000000000000000000000000000000", 10);

    /* setting the value of y in base 16 */
    mpz_init_set_str(y, "FF", 16);

    /* just initalizing the result variable */
    mpz_init(z);

    mpz_sub(z, x, y);

    for (i = 0; i < 4; i++) {
        xs = mpz_get_str(NULL, base[i], x);
        ys = mpz_get_str(NULL, base[i], y);
        zs = mpz_get_str(NULL, base[i], z);

        /* print all three in base 10 */
        printf("x = %s\ny = %s\nz = %s\n\n", xs, ys, zs);

        free(xs);
        free(ys);
        free(zs);
    }

    return 0;
}

The output is

x = 10011101110001011010110110101000001010110111000010110101100111011111000000100000000000000000000000000000000
y = 11111111
z = 10011101110001011010110110101000001010110111000010110101100111011111000000011111111111111111111111100000001

x = 235613266501267026547370040000000000
y = 377
z = 235613266501267026547370037777777401

x = 100000000000000000000000000000000
y = 255
z = 99999999999999999999999999999745

x = 4ee2d6d415b85acef8100000000
y = ff
z = 4ee2d6d415b85acef80ffffff01

Having stumbled across this relatively old post entirely by accident, I thought it pertinent to elaborate on Volte's previous conjecture for the benefit of inexperienced readers.

Firstly, the signed range of a 128-bit number is -2¹²⁷ to 2¹²⁷-1 and not -2¹²⁷ to 2¹²⁷ as originally stipulated.

Secondly, due to the cyclic nature of finite arithmetic the largest required differential between two 128-bit numbers is -2¹²⁷ to 2¹²⁷-1, which has a storage prerequisite of 128-bits, not 129. Although (2¹²⁷-1) - (-2¹²⁷) = 2¹²⁸-1 which is clearly greater than our maximum 2¹²⁷-1 positive integer, arithmetic overflow always ensures that the nearest distance between any two n-bit numbers always falls within the range 0 to 2ⁿ-1 and thus implicitly -2^n-1 to 2^n-1-1.

In order to clarify, let us first examine how a hypothetical 3-bit processor would implement binary addition. As an example, consider the following table which depicts the absolute unsigned range of a 3-bit integer.

   0 = 000b
   1 = 001b
   2 = 010b
   3 = 011b
   4 = 100b
   5 = 101b
   6 = 110b
   7 = 111b ---> [Cycles back to 000b on overflow]

From the above table it is readily apparent that:

001b(1) + 010b(2) = 011b(3)

It is also apparent that adding any of these numbers with its numeric complement always yields 2ⁿ-1:

010b(2) + 101b([complement of 2] = 5) = 111b(7) = (2³-1)

Due to the cyclic overflow which occurs when the addition of two n-bit numbers results in an (n+1)-bit result, it therefore follows that adding any of these numbers with its numeric complement + 1 will always yield 0:

010b(2) + 110b([complement of 2] + 1) = 000b(0)

Thus we can say that [complement of n] + 1 = -n, so that n + [complement of n] + 1 = n + (-n) = 0. Furthermore, if we now know that n + [complement of n] + 1 = 0, then n + [complement of n - x] + 1 must = n - (n-x) = x.

Applying this to our original 3-bit table yields:

   0 = 000b = [complement of 0] + 1 = 0
   1 = 001b = [complement of 7] + 1 = -7
   2 = 010b = [complement of 6] + 1 = -6
   3 = 011b = [complement of 5] + 1 = -5
   4 = 100b = [complement of 4] + 1 = -4
   5 = 101b = [complement of 3] + 1 = -3
   6 = 110b = [complement of 2] + 1 = -2
   7 = 111b = [complement of 1] + 1 = -1 ---> [Cycles back to 000b on overflow]

Whether the representational abstraction is positive, negative or a combination of both as implied with signed twos-complement arithmetic, we now have 2ⁿ n-bit patterns which can seamlessly serve both positive 0 to 2ⁿ-1 and negative 0 to -(2ⁿ)-1 ranges as and when required. In point of fact, all modern processors employ just such a system in order to implement common ALU circuitry for both addition and subtraction operations. When a CPU encounters an i1 - i2 subtraction instruction, it internally performs a [complement + 1] operation on i2 and subsequently processes the operands through the addition circuitry in order to compute i1 + [complement of i2] + 1. With the exception of an additional carry/sign XOR-gated overflow flag, both signed and unsigned addition, and by implication subtraction, are each implicit.

If we apply the above table to the input sequence [-2^n-1, 2^n-1-1, -2^n-1] as presented in Volte's original reply, we are now able to compute the following n-bit differentials:

diff #1:
   (2^n-1-1) - (-2^n-1) =
   3 - (-4) = 3 + 4 =
   (-1) = 7 = 111b

diff #2:
   (-2^n-1) - (2^n-1-1) =
   (-4) - 3 = (-4) + (5) =
   (-7) = 1 = 001b

Starting with our seed -2^n-1, we are now able to reproduce the original input sequence by applying each of the above differentials sequentially:

   (-2^n-1) + (diff #1) =
   (-4) + 7 = 3 =
   2^n-1-1

   (2^n-1-1) + (diff #2) =
   3 + (-7) = (-4) =
   -2^n-1

You may of course wish to adopt a more philosophical approach to this problem and conjecture as to why 2ⁿ cyclically-sequential numbers would require more than 2ⁿ cyclically-sequential differentials?

Taliadon.

Boost 1.53 now includes multiprecision:

#include <boost/multiprecision/cpp_int.hpp>
#include <iostream>

// Requires Boost 1.53 or higher
// build: g++ text.cpp

int main()
{
    namespace mp = boost::multiprecision;

    mp::uint128_t a = 4294967296;
    mp::uint256_t b(0);
    mp::uint512_t c(0);

    b = a * a;
    c = b * b;

    std::cout << "c: " << c << "\n";
    return 0;
}

Output:

./a.out
c: 340282366920938463463374607431768211456

There is a lot of literature regarding large integer math. You can use one of the libraries freely available (see links) or you can roll your own. Although I should warn you, for anything more complicated than addition and subtraction (and shifts), you'll need to use non-trivial algorithms.

To add and subtract, you can create a class/structure that holds two 64-bit integers. You can use simple school math to do the addition and subtraction. Basically, do what you do with a pencil and paper to add or subtract, with careful consideration to carries/borrows.

Search for large integer. Btw recent versions of VC++, IntelC++ and GCC compilers have 128-bit integer types, although I'm not sure they are as easily accessible as you might like (they are intended to be used with sse2/xmms registers).

TomsFastMath is a bit like GMP (mentioned above), but is public domain, and was designed from the ground up to be extremely fast (it even contains assembly code optimizations for x86, x86-64, ARM, SSE2, PPC32, and AVR32).

Also worth noting: if the goal is merely to improve the compression of a stream of numbers by preprocessing it, then the preprocessed stream doesn't have to be made of exactly arithmetic differences. You can use XOR (^) instead of + and -. The advantage is that a 128-bit XOR is exactly the same as two independent XORs on the 64-bit parts, so it is both simple and efficient.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow