Question

this is sort of a follow up on some previous questions on bit manipulation. I modified the code from this site to enumerate strings with K of N bits set (x is the current int64_t with K bits set, and at the end of this code it is the lexicographically next integer with K bits set):

int64_t b, t, c, m, r,z;
b = x & -x;
t = x + b;
c = x^t;
// was m = (c >> 2)/b per link
z = __builtin_ctz(x);
m = c >> 2+z;
x = t|m;

The modification using __builtin_ctz() works fine as long as the least significant one bit is in the lower DWORD of x, but if is not, it totally breaks. This can be seen with the following code:

for(int i=0; i<64; i++) printf("i=%i, ctz=%i\n", i, __builtin_ctz(1UL << i));

which prints for GCC version 4.4.7:

i=0, ctz=0
i=1, ctz=1
i=2, ctz=2

...

i=30, ctz=30
i=31, ctz=31
i=32, ctz=0

or for icc version 14.0.0 something similar (except i>32 gives random results, not zero). Using division instead of shifting by 2+z works in both cases, but it's about 5x slower on my Sandy Bridge Xeon. Are there other intrinsics I should be using for 64-bit, or will I have to do some inline assembler?

Thanks!

Was it helpful?

Solution

__builtin_ctz takes arguments of type unsigned int, which is 32-bits on most platforms.

If long is 64 bits, you can use __builtin_ctzl which takes unsigned long. Or you can use __builtin_ctzll which takes unsigned long long - In this case you should use 1ULL << i instead of 1UL << i.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top