Determine Position of Most Significant Set Bit in a Byte

Question 1

Your question is about an efficient way to compute log2 of a value. Because you seem to want a solution that is not limited to C, I have been slightly lazy and adapted some C# code I already had.

You want to compute log2(x) + 1, and for x = 0 (where log2 is undefined) you define the result as 0 (i.e. you create a special case where log2(0) = -1).

static readonly Byte[] multiplyDeBruijnBitPosition = new Byte[] {
  7, 2, 3, 4,
  6, 1, 5, 0
};

public static Byte Log2Plus1(Byte value) {
  if (value == 0)
    return 0;

  var roundedValue = value;
  roundedValue |= (Byte) (roundedValue >> 1);
  roundedValue |= (Byte) (roundedValue >> 2);
  roundedValue |= (Byte) (roundedValue >> 4);
  var log2 = multiplyDeBruijnBitPosition[((Byte) (roundedValue*0xE3)) >> 5];
  return (Byte) (log2 + 1);
}

This bit twiddling hack is taken from "Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup", where you can see the equivalent C source code for 32-bit values. The code here has been adapted to work on 8-bit values.
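If you want to convince yourself that the table and the multiplier 0xE3 belong together, the following Python sketch rebuilds the table. It is my own check, not something from the linked page, and the function names are made up for illustration. After the three shift-or steps the value is always one of 1, 3, 7, ..., 255, so the multiply only ever sees eight inputs, and each of them has to land on a different 3-bit index.

```python
MULTIPLIER = 0xE3  # same constant as in the C# code above


def build_table(multiplier: int = MULTIPLIER) -> list:
    """Rebuild the lookup table: index -> 0-based position of the highest set bit."""
    table = [None] * 8
    for log2 in range(8):
        smeared = (1 << (log2 + 1)) - 1               # 1, 3, 7, ..., 255
        index = ((smeared * multiplier) & 0xFF) >> 5  # top 3 bits of the low byte
        assert table[index] is None, "multiplier maps two inputs to one slot"
        table[index] = log2
    return table


def log2_plus_1(value: int) -> int:
    """Python transcription of Log2Plus1 for values 0..255."""
    if value == 0:
        return 0
    value |= value >> 1
    value |= value >> 2
    value |= value >> 4
    return build_table()[((value * MULTIPLIER) & 0xFF) >> 5] + 1


print(build_table())  # prints [7, 2, 3, 4, 6, 1, 5, 0]
```

The `& 0xFF` stands in for the `(Byte)` cast, since Python integers do not wrap around on their own.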

However, you may be able to use a very efficient built-in function that gives you the result directly (on many CPUs it compiles to a single instruction such as Bit Scan Reverse). An answer to the question "Bit twiddling: which bit is set?" has some information about this. A quote from that answer gives one possible reason why there is low-level support for solving this problem:

Things like this are the core of many O(1) algorithms such as kernel schedulers which need to find the first non-empty queue signified by an array of bits.
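As one example of such a built-in outside C, Python exposes this operation as `int.bit_length()`. It returns 0 for 0 and otherwise the 1-based position of the highest set bit, which matches the definition used in the question. Whether the interpreter turns it into a single CPU instruction is an implementation detail I have not checked, so treat this only as an illustration of the interface. The wrapper name `highest_set_bit` is mine.

```python
def highest_set_bit(value: int) -> int:
    """1-based position of the most significant set bit, 0 if no bit is set."""
    if value < 0:
        raise ValueError("expected a non-negative value")
    return value.bit_length()


for v in (0, 1, 0x10, 0x80, 0xFF):
    print(f"{v:#04x} -> {highest_set_bit(v)}")
```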

Question 2

That was a fun little challenge. I don't know if this one is completely portable since I only have VC++ to test with, and I certainly can't say for sure if it's more efficient than other approaches. This version was coded with a loop but it can be unrolled without too much effort.

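static unsigned char check(unsigned char b)
{
  unsigned char r = 8;
  unsigned char sub = 1;   // stays 1 until the first set bit is seen
  // s is decremented in the loop header so that it is never read and
  // modified within the same expression
  for (int s = 7; s >= 0; s--)
  {
      // ((b >> s) & 1) - 1 is all ones for a clear bit and 0 for a set bit
      sub = sub & (((b >> s) & 1) - 1);
      r -= sub;
  }
  return r;
}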

Question 3

I'm sure everyone else has long since moved on to other topics, but something in the back of my mind suggested that there had to be a more efficient branch-less solution to this than just unrolling the loop in my other posted solution. A quick trip to my copy of Warren (Hacker's Delight) put me on the right track: binary search.

Here's my solution based on that idea:

Pseudo-code:

  // see if there's a bit set in the upper half
  if ((b >> 4) != 0)
  {
      offset = 4;
      b >>= 4;
  }
  else
      offset = 0;

  // see if there's a bit set in the upper half of what's left
  if ((b & 0x0C) != 0)
  {
      offset += 2;
      b >>= 2;
  }

  // see if there's a bit set in the upper half of what's left
  if ((b & 0x02) != 0)
  {
      offset++;
      b >>= 1;
  }

  return b + offset;
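For anyone who wants to run the pseudo-code as it stands, here is a direct Python transcription. The function name `check_binary_search` is hypothetical, and the final line compares the result against Python's own `int.bit_length()` for every byte value.

```python
def check_binary_search(b: int) -> int:
    """Direct transcription of the pseudo-code for a byte value 0..255."""
    offset = 0
    if b >> 4:            # a bit is set in the upper nibble
        offset = 4
        b >>= 4
    if b & 0x0C:          # a bit is set in the upper half of the nibble
        offset += 2
        b >>= 2
    if b & 0x02:          # a bit is set in the upper half of the remaining pair
        offset += 1
        b >>= 1
    return b + offset     # b is now 0 or 1


# Exhaustive check over all 256 inputs.
assert all(check_binary_search(v) == v.bit_length() for v in range(256))
```

Three comparisons are enough because each step halves the range that can still contain the highest set bit: 8 bits, then 4, then 2.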

Branch-less C++ implementation:

static unsigned char check(unsigned char b)
{
  // Each mask expression is all ones when its operand is non-zero and 0 otherwise
  unsigned char adj = 4 & (((((unsigned char) - (b >> 4)) >> 7) ^ 1) - 1);
  unsigned char offset = adj;
  b >>= adj;
  adj = 2 & (((((unsigned char) - (b & 0x0C)) >> 7) ^ 1) - 1);
  offset += adj;
  b >>= adj;
  adj = 1 & (((((unsigned char) - (b & 0x02)) >> 7) ^ 1) - 1);
  return (b >> adj) + offset + adj;
}
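The repeated sub-expression is what replaces each `if`: it turns "is x non-zero?" into a mask of all ones or all zeros, which is then ANDed with the shift amount. Below is a small Python sketch of that single step, emulating 8-bit unsigned arithmetic with `& 0xFF`. The helper names `nonzero_mask` and `check_branchless` are mine. Note that the negate-and-test-top-bit trick only works for operands up to 128, which holds here because the operands are at most 15.

```python
def nonzero_mask(x: int) -> int:
    """Return 0xFF if x is non-zero, else 0x00. Only valid for 0 <= x <= 128."""
    negated = (-x) & 0xFF              # emulates (unsigned char)-x
    top_bit = negated >> 7             # 1 when x is 1..128, 0 when x is 0
    return ((top_bit ^ 1) - 1) & 0xFF  # 1 -> 0xFF, 0 -> 0x00


def check_branchless(b: int) -> int:
    """Python transcription of the branch-less version for values 0..255."""
    adj = 4 & nonzero_mask(b >> 4)
    offset = adj
    b >>= adj
    adj = 2 & nonzero_mask(b & 0x0C)
    offset += adj
    b >>= adj
    adj = 1 & nonzero_mask(b & 0x02)
    return (b >> adj) + offset + adj


assert all(check_branchless(v) == v.bit_length() for v in range(256))
```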

Yes, I know that this is all academic :)

Question 4

It is not possible in plain C. The best I can suggest is the following implementation of check. Although it is quite "ugly", I think it runs faster than the check version in the question.

int check(unsigned char b)
{
    if(b&128) return 8;
    if(b&64)  return 7;
    if(b&32)  return 6;
    if(b&16)  return 5;
    if(b&8)   return 4;
    if(b&4)   return 3;
    if(b&2)   return 2;
    if(b&1)   return 1;
              return 0;
}

Question 5

Edit: I found a link to the actual code: http://www.hackersdelight.org/hdcodetxt/nlz.c.txt The algorithm below is named nlz8 in that file. You can choose your favorite hack.

/*
From last comment of: http://stackoverflow.com/a/671826/315052
> Hacker's Delight explains how to correct for the error in 32-bit floats
> in 5-3 Counting Leading 0's. Here's their code, which uses an anonymous
> union to overlap asFloat and asInt: k = k & ~(k >> 1); asFloat =
> (float)k + 0.5f; n = 158 - (asInt >> 23); (and yes, this relies on
> implementation-defined behavior) - Derrick Coetzee Jan 3 '12 at 8:35
*/

unsigned char check (unsigned char b) {
    union {
        float    asFloat;
        int      asInt;
    } u;
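    /* Assumes a 32-bit int and an IEEE 754 single-precision float. */
    unsigned k = b & ~(b >> 1);   /* keep only bits with no set bit directly above; the top set bit survives */
    u.asFloat = (float)k + 0.5f;  /* + 0.5f keeps the value non-zero when b == 0 */
    return 32 - (158 - (u.asInt >> 23));  /* 158 - exponent field = leading zeros of a 32-bit value */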
}

Edit: I'm not exactly sure what the asker means by language independent, but below is the equivalent code in Python.

import ctypes

class Anon(ctypes.Union):
    _fields_ = [
        ("asFloat", ctypes.c_float),
        ("asInt", ctypes.c_int)
    ]

def check(b):
    k = int(b) & ~(int(b) >> 1)
    a = Anon(asFloat=float(k) + 0.5)
    return 32 - (158 - (a.asInt >> 23))
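The constants can look arbitrary, so here is a sketch of where they come from. It uses the standard `struct` module to read the bits of a 32-bit float instead of a union, and the function names are mine. An IEEE 754 single stores its exponent with a bias of 127 in the bits above the 23-bit mantissa. For a value in the range [2^n, 2^(n+1)) the stored field is 127 + n, so 158 minus that field is 31 - n, the leading-zero count of a 32-bit integer, and 32 minus that count is the bit position.

```python
import struct


def float_exponent_field(x: float) -> int:
    """Biased exponent of x stored as an IEEE 754 single (x must be positive)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 23                  # sign bit is 0, so no masking is needed


def check_via_float(b: int) -> int:
    k = b & ~(b >> 1)                  # same preprocessing as in the C version above
    leading_zeros = 158 - float_exponent_field(k + 0.5)
    return 32 - leading_zeros


print(float_exponent_field(0.5), float_exponent_field(1.5), float_exponent_field(128.5))
# prints 126 127 134
```

Adding 0.5 is what makes the zero case work: 0.5 has an exponent field of 126, which gives 32 leading zeros and therefore a result of 0.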