Question

I am looking at solutions to the set-bit-count problem (given a binary number, how do you efficiently count how many bits are set?).

I have found some methods here: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive.

What about the lookup-table method? I don't understand what properties of the binary representation of a number make it work.

static const unsigned char BitsSetTable256[256] = 
{
#   define B2(n) n,     n+1,     n+1,     n+2
#   define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#   define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
   B6(0), B6(1), B6(1), B6(2)
};

unsigned int v; // count the number of bits set in 32-bit value v
unsigned int c; // c is the total bits set in v

// Option 1:
c = BitsSetTable256[v & 0xff] + 
   BitsSetTable256[(v >> 8) & 0xff] + 
   BitsSetTable256[(v >> 16) & 0xff] + 
   BitsSetTable256[v >> 24]; 

// Option 2:
unsigned char * p = (unsigned char *) &v;
c = BitsSetTable256[p[0]] + 
    BitsSetTable256[p[1]] + 
    BitsSetTable256[p[2]] + 
    BitsSetTable256[p[3]];


// To initially generate the table algorithmically:
BitsSetTable256[0] = 0;
for (int i = 0; i < 256; i++)
{
   BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}

In particular, I don't understand the BitsSetTable256 definition at first glance. Why define these quantities B2, B4, ...? It seems to me that they are not used afterwards.

Could you point me to further documentation on binary representation?

Thanks!

Solution

The definitions build the table by exploiting its pattern. They are nested macros: B6 uses B4, and B4 uses B2. B6(0) expands to:

B4(0), B4(1), B4(1), B4(2)

B4(0) in turn expands to B2(0), B2(1), B2(1), B2(2), and B2(0) expands to:

0, 1, 1, 2

The first few numbers of the sequence will be:

// 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11
   0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3

As you can see, each entry is the number of bits set in its index, so BitsSetTable256[b] is the bit count of the byte value b.
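
If you want to convince yourself, here is a minimal, self-contained sketch (the naive_count helper and the main function are my own additions, not part of the original code) that rebuilds the macro table and checks every entry against a naive per-bit count:

#include <assert.h>
#include <stdio.h>

#define B2(n) n,     n+1,     n+1,     n+2
#define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)

static const unsigned char BitsSetTable256[256] = { B6(0), B6(1), B6(1), B6(2) };

/* Naive reference: test each of the 8 bits of i individually. */
static unsigned naive_count(unsigned i)
{
    unsigned c = 0;
    for (int bit = 0; bit < 8; bit++)
        c += (i >> bit) & 1;
    return c;
}

int main(void)
{
    for (unsigned i = 0; i < 256; i++)
        assert(BitsSetTable256[i] == naive_count(i));
    printf("all 256 table entries match the naive bit count\n");
    return 0;
}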

The rest of the algorithm breaks the 32-bit number into four 8-bit chunks and sums the number of bits set in each chunk; that is what these lines do (you use either option 1 or option 2, not both; a complete runnable sketch follows the two options):

// Option 1:
c = BitsSetTable256[v & 0xff] + 
    BitsSetTable256[(v >> 8) & 0xff] + 
    BitsSetTable256[(v >> 16) & 0xff] + 
    BitsSetTable256[v >> 24]; 

// Option 2:
unsigned char * p = (unsigned char *) &v;
c = BitsSetTable256[p[0]] + 
    BitsSetTable256[p[1]] + 
    BitsSetTable256[p[2]] + 
    BitsSetTable256[p[3]];
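
For completeness, here is a minimal sketch of option 1 wrapped up as a function; the name popcount32 is my own, not from the original code:

#include <stdio.h>

#define B2(n) n,     n+1,     n+1,     n+2
#define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)

static const unsigned char BitsSetTable256[256] = { B6(0), B6(1), B6(1), B6(2) };

/* Count set bits by looking up each of the four bytes of v in the table. */
static unsigned popcount32(unsigned int v)
{
    return BitsSetTable256[v & 0xff]
         + BitsSetTable256[(v >> 8) & 0xff]
         + BitsSetTable256[(v >> 16) & 0xff]
         + BitsSetTable256[v >> 24];
}

int main(void)
{
    printf("%u\n", popcount32(0xF0F0F0F0u)); /* prints 16 */
    return 0;
}

Option 2 gives the same result regardless of byte order, because the four per-byte counts are summed either way; option 1 just avoids the type-punned pointer.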

The code at the bottom:

// To initially generate the table algorithmically:
BitsSetTable256[0] = 0;
for (int i = 0; i < 256; i++)
{
   BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}

is a different way of generating BitsSetTable256: it builds the table at runtime instead of at compile time (which is what the macro definition does). It relies on the same observation, namely that the bit count of i is the low bit of i plus the bit count of i / 2 (i shifted right by one).
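
A minimal sketch of the runtime approach, assuming the table drops its const qualifier and an init function is called once before the first lookup (init_table and popcount32 are my names, not the original's):

#include <stdio.h>

static unsigned char BitsSetTable256[256]; /* not const: filled in at runtime */

/* i has the same bits as i/2 shifted left by one, plus its own low bit,
   so bits(i) = (i & 1) + bits(i / 2). */
static void init_table(void)
{
    BitsSetTable256[0] = 0;
    for (int i = 1; i < 256; i++)
        BitsSetTable256[i] = (i & 1) + BitsSetTable256[i / 2];
}

static unsigned popcount32(unsigned int v)
{
    return BitsSetTable256[v & 0xff]
         + BitsSetTable256[(v >> 8) & 0xff]
         + BitsSetTable256[(v >> 16) & 0xff]
         + BitsSetTable256[v >> 24];
}

int main(void)
{
    init_table();
    printf("%u\n", popcount32(0xFFu)); /* prints 8 */
    return 0;
}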

P.S. If you're targeting a recent enough x86 (SSE4.2), you can use the POPCNT instruction instead.
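
With GCC or Clang, for instance, you don't even have to write the instruction by hand: the __builtin_popcount builtin compiles to a single POPCNT when the target supports it (e.g. when built with -mpopcnt or -msse4.2) and falls back to a software version otherwise. A minimal sketch:

#include <stdio.h>

int main(void)
{
    unsigned int v = 0xDEADBEEFu;
    /* GCC/Clang builtin; emits POPCNT when compiled with -mpopcnt (or -msse4.2). */
    printf("%d\n", __builtin_popcount(v)); /* prints 24 */
    return 0;
}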
