Need help understanding “getbits()” method in Chapter 2 of K&R C

https://stackoverflow.com/questions/197614

10-07-2019

Question

In chapter 2, the section on bitwise operators (section 2.9), I'm having trouble understanding how one of the sample methods works.

Here's the method provided:

unsigned int getbits(unsigned int x, int p, int n) {
    return (x >> (p + 1 - n)) & ~(~0 << n);
}

The idea is that, for the given number x, it will return the n bits starting at position p, counting from the right (with the farthest right bit being position 0). Given the following main() method:

int main(void) {
    int x = 0xF994, p = 4, n = 3;
    int z = getbits(x, p, n);
    printf("getbits(%u (%x), %d, %d) = %u (%X)\n", x, x, p, n, z, z);

    return 0;
}

The output is:

getbits(63892 (f994), 4, 3) = 5 (5)

I get portions of this, but am having trouble with the "big picture," mostly because of the bits (no pun intended) that I don't understand.

The part I'm specifically having issues with is the complements piece: ~(~0 << n). I think I get the first part, dealing with x; it's this part (and then the mask) that I'm struggling with -- and how it all comes together to actually retrieve those bits. (Which I've verified it is doing, both with code and checking my results using calc.exe -- thank God it has a binary view!)

Any help?

Solution

Let's use 16 bits for our example. In that case, ~0 is equal to

1111111111111111

When we left-shift this by n bits (3 in your case), we get:

1111111111111000

because the 1s at the left are discarded and 0s are fed in at the right. Then re-complementing it gives:

0000000000000111

so it's just a clever way to get n 1-bits in the least significant part of the number.

The "x bit" you describe has shifted the given number (f994) right far enough so that the least significant 3 bits are the ones you want. In this example, the bits you're requesting are surrounded by '.' characters.

f994             11111001100.101.00  # original number
>> p+1-n     [2] 0011111001100.101.  # shift desired bits to right
& ~(~0 << n) [7] 0000000000000.101.  # clear all the other (left) bits

And there you have your bits. Ta da !!
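
To watch each stage, here is a small sketch that splits the expression into its two halves and prints the bits at every step. The helper names (print_bits, shifted_part, mask_part, trace_getbits) are made up for illustration, and it uses ~0u instead of ~0 so that the shift is done on an unsigned value; it also assumes a 16-bit view of the number, as in the walkthrough above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print the low `width` bits of v, most significant first. */
static void print_bits(unsigned int v, int width)
{
    assert(width > 0 && width <= 16);
    for (int i = width - 1; i >= 0; i--)
        putchar(((v >> i) & 1u) ? '1' : '0');
    putchar('\n');
}

/* First half of getbits(): move the wanted field down to the low n bits. */
static unsigned int shifted_part(unsigned int x, int p, int n)
{
    assert(n > 0 && p + 1 - n >= 0 && p < 16);
    return x >> (p + 1 - n);
}

/* Second half: n one-bits at the right. ~0u keeps the shift unsigned. */
static unsigned int mask_part(int n)
{
    assert(n >= 0 && n < 16);
    return ~(~0u << n);
}

/* Prints the original, the shifted value, the mask and the result. */
static unsigned int trace_getbits(unsigned int x, int p, int n)
{
    unsigned int s = shifted_part(x, p, n);
    unsigned int m = mask_part(n);
    print_bits(x, 16);
    print_bits(s, 16);
    print_bits(m, 16);
    print_bits(s & m, 16);
    return s & m;
}
```

Calling trace_getbits(0xF994u, 4, 3) should print 1111100110010100, 0011111001100101, 0000000000000111 and 0000000000000101, and return 5.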

OTHER TIPS

I would say the best thing to do is to work a problem out by hand; that way you'll understand how it works.

Here is what I did using an 8-bit unsigned int.

Our number is 75, and we want the 4 bits starting from position 6. The function call would be getbits(75, 6, 4);
75 in binary is 0100 1011.
So we create a mask that is 4 bits long, starting at the lowest-order bit, like this:

~0 = 1111 1111
<<4 = 1111 0000
~ = 0000 1111

Okay we got our mask.

Now we push the bits we want down into the lowest-order bits, so we shift binary 75 right by 6+1-4 = 3.

0100 1011 >> 3 = 0000 1001

Now we have a mask of the correct number of bits in the low order and the bits we want out of the original number in the low order.

So we & them:

  0000 1001
& 0000 1111
===========
  0000 1001

So the answer is decimal 9.

Note: the higher order nibble just happens to be all zeros, making the masking redundant in this case but it could have been anything depending on the value of the number we started with.
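
The hand calculation can be checked with a short sketch. getbits8 is a hypothetical 8-bit variant written only for this example, with unsigned char standing in for the "8-bit unsigned int" (which assumes 8-bit chars).

```c
#include <assert.h>

/* Hypothetical 8-bit variant of getbits(), for checking the hand calculation. */
static unsigned char getbits8(unsigned char x, int p, int n)
{
    assert(n > 0 && n <= 8 && p >= n - 1 && p < 8);
    unsigned int mask = ~(~0u << n);                    /* n low one-bits */
    return (unsigned char)((x >> (p + 1 - n)) & mask);  /* shift, then mask */
}
```

getbits8(75, 6, 4) gives 9, as worked out above. To see why the mask still matters, try 0xCB (1100 1011), which is 75 with bit 7 set: the shift alone leaves 0001 1001 (25), and the mask brings it back to 9.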

~(~0 << n) creates a mask that will have the n right-most bits turned on.

0
   0000000000000000
~0
   1111111111111111
~0 << 4
   1111111111110000
~(~0 << 4)
   0000000000001111

ANDing the result with something else will return what's in those n bits.

Edit: I wanted to point out this programmer's calculator I've been using forever: AnalogX PCalc.

Nobody mentioned it yet, but in ANSI C ~0 << n causes undefined behaviour.

This is because ~0 is a negative number and left-shifting negative numbers is undefined.

Reference: C11 6.5.7/4 (earlier versions had similar text)

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. [...] If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

In K&R C this code would have relied on the particular class of system that K&R developed on, naively shifting 1 bits off the left when performing left-shift of a signed number (and this code also relies on 2's complement representation), but some other systems don't share those properties so the C standardization process did not define this behaviour.

So this example is really only interesting as a historical curiosity; it should not have been used in any real code since 1989 (if not earlier).
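
One way to sidestep the issue is to build the mask from an unsigned all-ones value, since left shifts of unsigned values are well defined. The following is only a sketch: getbits_u and UINT_BITS are names invented here, and it assumes unsigned int has no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned int in bits (assumes no padding bits). */
#define UINT_BITS ((int)(sizeof(unsigned int) * CHAR_BIT))

/* Hypothetical name. Same contract as K&R's getbits(), but the mask is
   built from ~0u so no negative value is ever shifted. */
static unsigned int getbits_u(unsigned int x, int p, int n)
{
    assert(n > 0 && n <= UINT_BITS && p >= n - 1 && p < UINT_BITS);
    /* Shifting by the full width is undefined even for unsigned types,
       so the "all bits" case is handled separately. */
    unsigned int mask = (n == UINT_BITS) ? ~0u : ~(~0u << n);
    return (x >> (p + 1 - n)) & mask;
}
```

For the values in the question, getbits_u(0xF994u, 4, 3) returns 5, the same as the original.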

Using the example: int x = 0xF994, p = 4, n = 3; int z = getbits(x, p, n);

and focusing on this set of operations: ~(~0 << n)

For any bit pattern (10010011, etc.) you want to generate a "mask" that pulls out only the bits you want to see. Take 10010011 (0x93): I'm interested in xxxxx011. What mask will extract those bits? 00000111. I also want this to be independent of sizeof(int), so I'll let the machine do the work: start with 0. On a byte machine it's 0x00, on a word machine it's 0x0000, and a 64-bit machine would represent it with 64 bits, 0x0000000000000000.

Now apply "not" (~0) and get 11111111,
shift left (<<) by n and get 11111000,
and "not" that and get 00000111.

So 10010011 & 00000111 = 00000011.
You remember how boolean operations work?

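In ANSI C, the result of ~0 >> n is implementation-defined, because ~0 is a negative int. The post about left shifting describes what the standard says; in practice, on common two's-complement compilers, the surprise shows up with right shifts instead.

unsigned char m, l;

m = ~0 >> 4; produces 255, the same as ~0 (the int -1 is typically shifted arithmetically, then truncated), but

m = ~0; l = m >> 4; produces the correct value 15, the same as:

m = 255 >> 4;

On those compilers, left-shifting the negative ~0 gives the expected bit pattern, even though the standard does not guarantee it.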
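
The 255 versus 15 difference comes from where the narrowing to unsigned char happens, not from the shift direction. Here is a sketch of the same effect using ~0u, so that every step is fully defined by the standard. The function names are made up for this example, and the values 255 and 15 assume 8-bit chars.

```c
#include <assert.h>
#include <limits.h>

/* Shift the wide all-ones value first, then keep only the low 8 bits. */
static unsigned char shift_then_narrow(void)
{
    assert(CHAR_BIT == 8);                  /* assumption stated above */
    return (unsigned char)(~0u >> 4);       /* low 8 bits are still all ones */
}

/* Narrow to 8 bits first, then shift the small value. */
static unsigned char narrow_then_shift(void)
{
    assert(CHAR_BIT == 8);
    unsigned char m = (unsigned char)~0u;   /* 255 */
    return (unsigned char)(m >> 4);         /* 15 */
}
```

shift_then_narrow() returns 255 because the zeros shifted in at the top are cut off by the conversion, while narrow_then_shift() returns 15.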

Licensed under: CC-BY-SA with attribution
