Question

I do know that ~0 evaluates to a word-sized value with all bits set to 1 (and thus takes care of portability), but I still don't get why ((1 << N) - 1) is discouraged.

Please share if you have used the second form and run into any trouble.


Solution

Look at these lines:

1. printf("%X", ~(~0 << 31) );
2. printf("%X", (1 << 31) - 1 );

Line 1 compiles and behaves as expected.

Line 2 gives the warning "integer overflow in expression".

This is because 1 << 31 is treated by default as a signed int; assuming 32-bit int (and the typical two's complement representation), 1 << 31 is -2147483648, the smallest representable integer.

As a result, subtracting 1 causes an overflow.

OTHER TIPS

The first form is definitely not preferred, and I would go so far as to say it should never be used. On a ones' complement system that does not support negative zero, ~0 may very well be a trap representation, and thus using it invokes undefined behavior.

On the other hand, 1<<31 is also UB, assuming int is 32-bit, since it overflows.

If you really mean 31 as a constant, 0x7fffffff is the simplest and most correct way to write your mask. If you want all but the sign bit of an int, INT_MAX is the simplest and most correct way to write your mask.

As long as you know the bitshift will not overflow, (1<<n)-1 is the correct way to make a mask with the lowest n bits set. It may be preferable to use (1ULL<<n)-1 followed by a cast or implicit conversion in order not to have to worry about signedness issues and overflow in the shift.

But whatever you do, don't use the ~ operator with signed integers. Ever.

I would discourage both: shift and complement operations on signed values are simply a bad idea. Bit patterns should always be produced on unsigned types and (if necessary at all) then converted to their signed counterparts. Using the primitive types is also not such a good idea, because with bit patterns you should usually control the exact number of bits you are handling.

So I'd always do something like

-UINT32_C(1)
~UINT32_C(0)

which are completely equivalent; in the end this just comes down to using UINT32_MAX and friends.

A shift is only necessary in cases where you don't shift the full width, something like

(UINT32_C(1) << N) - UINT32_C(1)

I would not prefer one over the other, but I've seen many bugs with (1<<N) where the value had to be 64-bit but "1" was 32-bit (ints were 32-bit), and the result was wrong for N>=31. 1ULL instead of 1 would fix it. That's one danger of such shifts.

Also, shifting an int by CHAR_BIT*sizeof(int) or more positions (and similarly, shifting a long long, which is often 64-bit, by CHAR_BIT*sizeof(long long) or more positions) is undefined. Because of that it may be safer to shift right, like this: ~0u>>(CHAR_BIT*sizeof(int)-N), but in this case N can't be 0.

EDIT: corrected a stupid error; and noted possible overflow problems.

I have never heard that one form is preferred over the other. Both forms are evaluated at compile time. I always use the second form, and I've never gotten into any trouble. Both forms are perfectly clear to the reader.

Other answers noted the possibility of overflow in the second form.

I see little to choose between them.

Why Discouraged
~0 is a single-cycle operation and hence fast; ((1 << N) - 1) first does a shift and then a subtraction, which is an arithmetic operation, so the subtraction consumes extra cycles and adds unnecessary overhead.

More
Moreover, ((1 << N) - 1) and ((M << N) - 1) give the same result when N is the width of the type in bits, because the shift flushes out all the bits. Here 1 is an int, typically 32 bits on today's 32/64-bit platforms, so N would be 32 (though note that shifting by the full width of the type is itself undefined behavior in C).

The result will, however, not be the same if you cast 1 to long and do (((long)1 << 32) - 1): here you would need 64 in place of 32, 64 being the width of long in bits.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow