Question

I am trying to write a C function that performs the following calculation at runtime:

Numerator/Denominator

where:

Numerator is a prior calculation result, is always positive, and is greater than the Denominator

and,

Denominator is such that (1 <= Denominator <= 64).

The runtime calculation must be fast, i.e. fewest cycles, so the division operator is out of the question. I have looked at recursive subtraction and bitwise long division, but I am trying to find another solution.

Any Help?

Was it helpful?

Solution

Here's one idea, which uses one multiplication and one shift, so it'll be faster than a division on most systems. Since your numerators top out at 768,000,000 ~= 30 bits, we don't have much room left in a 32-bit word, so we'll have to use 64-bit multiplication.

The main idea is to take advantage of the fact that:

x / y == (x * k) / (y * k)

and that dividing by a power of 2 is a simple, fast bit shift.

So to pick a particular example, assume x = 700,000,000 and y = 47 (so the correct quotient is 14,893,617). To avoid rounding errors, our shift needs to be approximately the size of our largest possible numerator - 30 bits. Find the value of k that gives the closest approximation to y * k = 2^30, which is k = 22845571 in this case. Then x * k = 0x38D08C4CE6F500. Shifting this by 30 bits gives 0xE34231 = 14,893,617, our expected quotient. It's possible you may need to add 1-2 more bits for some combinations of numerator/denominator for rounding purposes, unless being off by 1 in your quotient is acceptable.

The exercise then becomes creating a lookup table with the right multipliers for each of the possible denominators.

EDIT: as pointed out in a comment below, choosing k = (2^30 + y - 1) / y should give better and more consistent results than simply k = round(2^30 / y).

OTHER TIPS

Big @ss table would work for small numbers:

unsigned int divTable[kMaxNumerator][64] = {...}

Where you put the values of each possible divide in there. Not very practical above a certain sizes, but it does work for contained cases and was a common solution for texture mapping way back in the day :) Then I read your comments and see that you are in the 768,000,000 range and this because completely impractical unless you can handle quite a bit of precision loss.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top