how to know the position of the first(left -> right) '1' bit in a char efficiently [duplicate]

StackOverflow https://stackoverflow.com/questions/23083504

  •  04-07-2023
  •  | 
  •  

Question

Just like the title said, I have a char (e.g. 00101011), how can I know the position of first '1' bit in it? (in this case, it is 6, or 5 if we consider the rightmost position is 0).
I know a for-loop shift would be helpful, but is there a more efficient way?

Was it helpful?

Solution

There are some bithacks (as mentioned by Codor) which can do it better than just shifting, but nothing in O(1). Binary search comes to mind.

The only more efficient way I am afraid is a specialised processor instruction like the bsr instruction on x86. Some compilers provide builtins for those instructions. gcc for example provides __builtin_clz for this purpose.

OTHER TIPS

Various approaches for getting the highest 1 bit are discussed here, where also the shifting method you described is mentioned. A detailed answer from there:

Find the log base 2 of an integer with a lookup table

static const char LogTable256[256] = 
{
#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
    -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
    LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
    LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
};

unsigned int v; // 32-bit word to find the log of
unsigned r;     // r will be lg(v)
register unsigned int t, tt; // temporaries

if (tt = v >> 16)
{
  r = (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
}
else 
{
  r = (t = v >> 8) ? 8 + LogTable256[t] : LogTable256[v];
}

The lookup table method takes only about 7 operations to find the log of a 32-bit value. If extended for 64-bit quantities, it would take roughly 9 operations. Another operation can be trimmed off by using four tables, with the possible additions incorporated into each. Using int table elements may be faster, depending on your architecture.

The code above is tuned to uniformly distributed output values. If your inputs are evenly distributed across all 32-bit values, then consider using the following:

if (tt = v >> 24) 
{
  r = 24 + LogTable256[tt];
} 
else if (tt = v >> 16) 
{
  r = 16 + LogTable256[tt];
} 
else if (tt = v >> 8) 
{
  r = 8 + LogTable256[tt];
} 
else 
{
  r = LogTable256[v];
}

To initially generate the log table algorithmically:

LogTable256[0] = LogTable256[1] = 0;
for (int i = 2; i < 256; i++) 
{
  LogTable256[i] = 1 + LogTable256[i / 2];
}
LogTable256[0] = -1; // if you want log(0) to return -1

Behdad Esfahbod and I shaved off a fraction of an operation (on average) on May 18, 2005. Yet another fraction of an operation was removed on November 14, 2006 by Emanuel Hoogeveen. The variation that is tuned to evenly distributed input values was suggested by David A. Butterfield on September 19, 2008. Venkat Reddy told me on January 5, 2009 that log(0) should return -1 to indicate an error, so I changed the first entry in the table to that.

First of all, bits in a byte are always enumerated from 0 to 7, where bit 0 is the least significant bit. This way of counting bits is universal.

As for what's most efficient, define "efficient". The most speed-efficient way is to generate a table of 256 bytes at compile time, where each item contains the bit position of the first bit:

#define ALL_ZEROES 0xFF // special case

const uint8_t LOOKUP_TABLE [256] =
{
  ALL_ZEROES, // no 1 found
  0,          // 0000 0001, first 1 at bit 0
  1,          // 0000 0010, first 1 at bit 1
  0,          // 0000 0011, first 1 at bit 0
  2,          // 0000 0100, first 1 at bit 2
  ...
};

And then you get the position by typing:

first_one = LOOKUP_TABLE[my_value];

However, this does consume 256 bytes of ROM memory, just to gain a few extra CPU ticks. So while it is very speed-efficient, it is horribly inefficient memory-wise.

The mentioned method with a for loop and shift is already fast, and it doesn't come with this memory overhead. So in most applications you would probably use the for loop, because it is most efficient in terms of speed + memory consumption together.

It all depends on the nature of the application.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top