Question

I'm trying to multiply two 128-bit integers in C.

Here is my algorithm:

Split the two 128-bit sequences into S1 and S2.

Then split S1 into S11 (front/higher half) and S12 (back/lower half) and split S2 into S21 (front/higher half) and S22 (back/lower half).

Multiply S12 by S22... = S1222.

Multiply S11 by S21... = S1121 and then bit shift it by multiplying it by 2^128

Combine S1222 and S1121 to be the front and back halves of a new array. Let's call it "Array1". The new array's length is twice as long as S1.

Then we have to multiply S12 by S21 and S11 by S22. I have multiplied these two to get S1221 and S1122, respectively (and bit-shifted them accordingly). Now I have to add them to Array1. This is the part where I'm asking for help. I'm not sure how to add these, one by one, to Array1. Keep in mind there may be a carry of 1 as you go bit by bit from 3/4 of Array1 to 1/4 of Array1, as this is the span where S1221 and S1122 need to be added.

My question is: How do I add dstM1 and dstM2 to the array d, which is already filled?

Was it helpful?

Solution 3

Summarizing your question: How can you add two arrays of (unsigned) integers propagating the carry.

uint16_t foo[4];  // 0000 aaaa FFFF cccc
uint16_t bar[4];  // dddd eeee FFFF 0000

The good point is that 'FFFF+FFFF+1' is simply (1)FFFF. Thus the carry can always be added in each word without producing an extra carry (as if the sum could be 20000).

Making a temporary sum: sum = foo[3] + bar[3] + carry; with carry being initially 0, either this sum produces a new carry, or not.

  • Carry is produced from (A+B), if (A+B) < A
  • When summing (A+B+c), the carry is produced if ((A + c) < A) || (((A + c) + B) < B)

Another possibility is to calculate "multi-bit carry" by summing up several terms in columns, which occurs often in bignum multiplications:

            AAAA BBBB CCCC
       DDDD EEEE FFFF ....
  GGGG HHHH IIII .... ....
--------------------------
  col4 col3 col2 col1 col0

Now each column produces 32-bit or 64-bit result and a carry that doesn't necessarily fit a single bit.

uint32_t sum_low = carry_in_from_previous_column;
uint32_t sum_high = 0;

for (i = 0; i < N; i++) {
     sum_low += matrix[i][column] & 0xffff;
     sum_high += matrix[i][column] >> 16;
}
sum_high += sum_low >> 16;    // add the multibit half carry

result = (sum_low & 0xffff) | (sum_high << 16);
carry_out = sum_high >> 16;

OTHER TIPS

If you're on gcc or clang you can use __int128 and unsigned __int128 directly.

You are stuck in an infinite loop because i += 1/32 is the same as i += 0.

Also: note:memcpy(&d[3l/2-i], dstM1, 1/8); is memcpy(&d[1-i], dstM1, 0);

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top