Question

I'm trying to write a function that returns the number of bits needed to represent a positive integer below the JavaScript limit of (2^53)-1. However, I'm being hit by precision problems and want to avoid big integer libraries.

Method 1:

function bitSize(num)
{
    return Math.floor( Math.log(num) / Math.log(2) ) + 1;
}

Pass: bitSize( Math.pow(2, 16) -1 ) = 16
Pass: bitSize( Math.pow(2, 16) ) = 17
Fail (Should be 48): bitSize( Math.pow(2, 48) -1 ) = 49 
Pass: bitSize( Math.pow(2, 48) ) = 49

Method 2:

function bitSize(num)
{
    var count = 0;
    while(num > 0)
    {
        num = num >> 1;
        count++;
    }
    return count;
}

Pass: bitSize( Math.pow(2, 16) -1 ) = 16
Pass: bitSize( Math.pow(2, 16) ) = 17
Fail (Should be 48): bitSize( Math.pow(2, 48) -1 ) = 1
Fail (Should be 49): bitSize( Math.pow(2, 48) ) = 1

Both methods fail due to precision issues, I think.

Can anyone suggest an alternative method that will work for numbers between 0 and 2^53-1?

Thanks.

Solution

You can do:

function bitSize(num) {
    return num.toString(2).length;
}

The toString() method of Number takes the radix as an optional argument.
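
For instance (standard behaviour of Number#toString):

(10).toString(2);   // "1010"
(255).toString(16); // "ff"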

Here are some tests (reproduced below). Works on Chrome, Safari, Opera, and Firefox. No access to IE, sorry.
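
A minimal sketch re-running the question's test cases against this approach:

function bitSize(num) {
    return num.toString(2).length;
}

console.log(bitSize(Math.pow(2, 16) - 1)); // 16
console.log(bitSize(Math.pow(2, 16)));     // 17
console.log(bitSize(Math.pow(2, 48) - 1)); // 48
console.log(bitSize(Math.pow(2, 48)));     // 49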

OTHER TIPS

Bitwise operations will only work reliably in JavaScript for "integers" up to 32 bits. To quote from The Complete JavaScript Number Reference:

Bitwise operations are a bit of a hack in Javascript. Since all numbers in Javascript are floating point, and bitwise operators only work on integers, Javascript does a little behind the scenes magic to make it appear bitwise operations are being applied to a 32bit signed integer.

Specifically, Javascript takes the number you are working on and takes the integer portion of the number. It then converts the integer to the most number of bits that number represents, up to 31 bits (1 bit for the sign). So 0 would create a two bit number (1 for the sign, and 1 bit for 0), likewise 1 would create two bits. 2 would create a 3 bit number, 4 would create a 4 bit number, etc…

It's important to realize that you're not guaranteed a 32-bit number. For instance, running NOT on zero should, in theory, convert 0 to 4,294,967,295; instead it will return -1, for two reasons: first, all numbers are signed in JavaScript, so "not" always reverses the sign; and second, JavaScript couldn't make more than one bit from the number zero, and not-zero becomes one. Therefore ~0 = -1.

So bitwise operations in JavaScript are only reliable up to 32 bits.
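
This is exactly why the question's method 2 breaks; a quick illustration (the values follow from ToInt32, which reduces the operand modulo 2^32):

console.log(~0);                          // -1
console.log((Math.pow(2, 48) - 1) >> 1);  // -1 : ToInt32 wraps 2^48-1 to -1
console.log(Math.pow(2, 48) >> 1);        // 0  : 2^48 is a multiple of 2^32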

As Anurag notes, in this situation you should simply use the built-in num.toString(2), which outputs a minimal-length string of ASCII '1's and '0's whose length you can then take.

The ES6 standard brings Math.clz32(), so for numbers in the range of 32 bits you can write:

num_bits = 32 - Math.clz32(0b1000000);   // 7, since 0b1000000 (64) needs 7 bits

Test it in this snippet:

var input = document.querySelector('input');
var bits = document.querySelector('#bits');

input.oninput = function() {
    var num = parseInt(input.value);
    bits.textContent = 32 - Math.clz32(num);
};

Number (Decimal): <input type="text"><br>
Number of bits: <span id="bits"></span>

On MDN's documentation of Math.clz32 a polyfill is provided:

Math.imul = Math.imul || function(a, b) {
  var ah = (a >>> 16) & 0xffff;
  var al = a & 0xffff;
  var bh = (b >>> 16) & 0xffff;
  var bl = b & 0xffff;
  // the shift by 0 fixes the sign on the high part
  // the final |0 converts the unsigned value into a signed value
  return ((al * bl) + (((ah * bl + al * bh) << 16) >>> 0)|0);
};

Math.clz32 = Math.clz32 || (function () {
  'use strict';

  var table = [
    32, 31,  0, 16,  0, 30,  3,  0, 15,  0,  0,  0, 29, 10,  2,  0,
     0,  0, 12, 14, 21,  0, 19,  0,  0, 28,  0, 25,  0,  9,  1,  0,
    17,  0,  4,   ,  0,  0, 11,  0, 13, 22, 20,  0, 26,  0,  0, 18,
     5,  0,  0, 23,  0, 27,  0,  6,  0, 24,  7,  0,  8,  0,  0,  0]

  // Adapted from an algorithm in Hacker's Delight, page 103.
  return function (x) {
    // Note that the variables may not necessarily be the same.

    // 1. Let n = ToUint32(x).
    var v = Number(x) >>> 0

    // 2. Let p be the number of leading zero bits in the 32-bit binary representation of n.
    v |= v >>> 1
    v |= v >>> 2
    v |= v >>> 4
    v |= v >>> 8
    v |= v >>> 16
    v = table[Math.imul(v, 0x06EB14F9) >>> 26]

    // Return p.
    return v
  }
})();

document.body.textContent = 32 - Math.clz32(0b1000000);

Build a lookup table with the respective boundaries where the bit count changes (a sketch follows below). You could do this for larger values only and still handle smaller ones through the logarithm. The problem seems to be generally floating-point related, as I can reproduce it in PowerShell here as well.
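
A minimal sketch of that idea, assuming a precomputed table of power-of-two boundaries (the names here are illustrative, not from the original answer):

// Precompute the boundaries 2^0 .. 2^53 once; all are exactly representable.
var boundaries = [];
for (var i = 0, p = 1; i <= 53; i++, p *= 2) {
    boundaries.push(p);
}

function bitSizeLookup(num) {
    // The index of the first power of two strictly greater than num
    // is the number of bits required to represent num.
    for (var b = 0; b < boundaries.length; b++) {
        if (num < boundaries[b]) return b;
    }
    return boundaries.length;
}

bitSizeLookup(Math.pow(2, 48) - 1); // 48
bitSizeLookup(Math.pow(2, 48));     // 49

(A binary search over the table would be faster for hot paths.)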

Late to the party, but I want to complement trincot's 32-bit answer with a faster, simpler and better-supported full 53-bit method.

The following two examples simply read/parse and return the float's exponent value.

For modern browsers that support ES6 ArrayBuffer and DataView (this doesn't care about the platform's endianness, but offers less legacy compatibility):

reqBits4Int = (function(d){ 'use strict';
  return function(n){
    return n && (                         // return 0 if n===0 (instead of 1)
      d.setFloat64(0, n),                 // OR set float to buffer and
      d.getUint16(0) - 16352 >> 4 & 2047  // return float's parsed exponent
    );                                    // Offset 1022<<4=16352; 0b1111111=2047
  };                                      // DataView methods default to big-endian
})( new DataView(new ArrayBuffer(8)) );   // Pass a Buffer's DataView to IIFE
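
A quick usage check (my own examples; the values follow from the exponent arithmetic explained further below):

reqBits4Int(0);                    // 0
reqBits4Int(Math.pow(2, 48) - 1);  // 48
reqBits4Int(Math.pow(2, 48));      // 49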


Example for slightly older browsers that support Float64Array and Uint16Array (but not DataView, so endianness is platform-dependent and this snippet assumes a 'standard' little-endian platform):

reqBits4Int = (function(){ 'use strict';
  var f = new Float64Array(1),            // one 8-byte element (64bits)
      w = new Uint16Array(f.buffer, 6);   // one 2-byte element (start at byte 6)
  return function(n){ 
    return ( f[0] = n                     // update float array buffer element
                                          // and return 0 if n===0 (instead of 1)
           ) && w[0] - 16352 >> 4 & 2047; // or return float's parsed exponent
  };  //NOTE : assumes little-endian platform
})(); //end IIFE


Both versions above return a positive integer Number representing the maximum bits required to hold the integer Number that was passed as an argument.
They work without error over the full range of [-2^53, 2^53], and beyond this they cover the full range of positive floating-point exponents, except where rounding already occurred on the input Number (for example, 2^55-1 is stored as 2^55, which, obviously, does equal 56 bits).

Explaining the IEEE 754 floating point format is really outside the scope of this answer, but for those with a basic understanding I have included the collapsed snippet below, which contains a calculation in table form from which one can see/explain the logic. Effectively, we just grab the float's first word (the 16 MSBs, containing the sign and the full exponent), subtract the 4-bit-shifted offset and zeroing offset (saving us 2 operations), then shift and mask the result as output. 0 is taken care of in the function.
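
As a worked example (my own arithmetic, matching the table below), take n = 5:

// first 16-bit word of the float 5.0 : 0b0100000000010100 = 16404
//   (sign 0, biased exponent 1025 = 2 + 1023, top 4 mantissa bits 0100)
// 16404 - 16352 = 52
// 52 >> 4       = 3
// 3 & 2047      = 3   --> and indeed 5 (0b101) requires 3 bits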

<xmp> PREVIEW of data to be generated: 

Float value :  S_exponent__MMMM :  # -(1022<<4)#### :  #   >> 4     :    & 2047    : Result integer
         -9 :  1100000000100010 :  1000000001000010 :  100000000100 :          100 :     4
         -8 :  1100000000100000 :  1000000001000000 :  100000000100 :          100 :     4
         -7 :  1100000000011100 :  1000000000111100 :  100000000011 :           11 :     3
         -6 :  1100000000011000 :  1000000000111000 :  100000000011 :           11 :     3
         -5 :  1100000000010100 :  1000000000110100 :  100000000011 :           11 :     3
         -4 :  1100000000010000 :  1000000000110000 :  100000000011 :           11 :     3
         -3 :  1100000000001000 :  1000000000101000 :  100000000010 :           10 :     2
         -2 :  1100000000000000 :  1000000000100000 :  100000000010 :           10 :     2
         -1 :  1011111111110000 :  1000000000010000 :  100000000001 :            1 :     1
          0 :                 0 :   -11111111100000 :   -1111111110 :  10000000010 :  1026
          1 :    11111111110000 :             10000 :             1 :            1 :     1
          2 :   100000000000000 :            100000 :            10 :           10 :     2
          3 :   100000000001000 :            101000 :            10 :           10 :     2
          4 :   100000000010000 :            110000 :            11 :           11 :     3
          5 :   100000000010100 :            110100 :            11 :           11 :     3
          6 :   100000000011000 :            111000 :            11 :           11 :     3
          7 :   100000000011100 :            111100 :            11 :           11 :     3
          8 :   100000000100000 :           1000000 :           100 :          100 :     4
          9 :   100000000100010 :           1000010 :           100 :          100 :     4

after 18 the generated list will only show 3 values before and after the exponent change
</xmp>

<script> // requires DataView; if not available, see the post for how to rewrite, or just examine the example above
firstFloatWord = (function(d){ 
  return function(n){
    return d.setFloat64(0, n), d.getUint16(0);
  };
})( new DataView(new ArrayBuffer(8)) );

function pad(v, p){
  return ('                    '+v).slice(-p);
}

for( var r= '',   i=-18, c=0, t
   ; i < 18014398509481984
   ; i= i>17 && c>=5
      ? (r+='\n', c=0, (i-2)*2-3)
      : (++c, i+1) 
   ){
  r+= pad(i, 19) + ' : '   
    + pad((t=firstFloatWord(i)).toString(2), 17) + ' : '
    + pad((t-=16352).toString(2), 17) + ' : '
    + pad((t>>=4).toString(2), 13) + ' : '
    + pad((t&=2047).toString(2), 12) + ' : '
    + pad(t, 5) + '\n';
}

document.body.innerHTML='<xmp>        Float value :  S_exponent__MMMM :  # -(1022<<4)#### : '
                       + ' #   >> 4     :    & 2047    : Result integer\n' + r + '</xmp>';
</script>


Fallback options:

ECMAScript (JavaScript) leaves implementers free to choose how to implement the language. So in the wild cross-browser world one not only has to deal with rounding differences, but also with different algorithms for, for example, Math.log and Math.log2.
As you already noticed (your method 1), a common example where a log2 (polyfill) might not work is 2^48-1 (giving 49, which is one too much when floored), but that's not the only example.
Certain versions of Chrome, for example, even screw up significantly smaller numbers, like: Math.log2(8) = 2.9999999999999996 (one too little when floored).
Read more about that in this Stack Overflow Q/A: Math.log2 precision has changed in Chrome.
This means that we cannot know when to floor or ceil our logarithmic result (or easily predict when we are already one off before rounding).

So you could count how often you can divide the input number by 2 before it is smaller than 1, in a loop (much like your counted 32-bit-shift method 2):

function reqBits4Int(n){ for(var c=0; n>=1; ++c, n/=2); return c }

But this is rather brute force (and might also get you into rounding problems). You could improve it with some divide and conquer and, while you're at it, unroll the loop:

function reqBits4Int(n){ 'use strict';
  var r= 4294967295 < n  ? ( n= n/4294967296 >>>   0,      32 ) : 0 ;
              65535 < n && (               n >>>= 16,  r+= 16 );
                255 < n && (               n  >>=  8,  r+=  8 );
                 15 < n && (               n  >>=  4,  r+=  4 );
  //              3 < n && (               n  >>=  2,  r+=  2 ); 
  //              1 < n && (               n  >>=  1,  r+=  1 ); 
  // return r + n;

  // OR using a lookup number instead of the lines comented out above
  // position: 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0  = 16 values
  //   binary: 11 11 11 11 11 11 11 11 10 10 10 10 01 01 00 00  = 4294945360 dec

  return (n && (4294945360 >>> n * 2 & 3) + 1) + r;
}

The formatting is intended to help you understand the algorithm. It will minify quite nicely, though!
This has a positive integer range of [0, 2^53] without error (and up to 2^64 with the same predictable rounding 'error').
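
A quick usage check (my own examples):

reqBits4Int(0);                    // 0
reqBits4Int(Math.pow(2, 48) - 1);  // 48
reqBits4Int(Math.pow(2, 53));      // 54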

Alternatively you could try some other (repeated, for input values larger than 32 bits) bit hacks.

The simplest and shortest (but potentially slower, compared to the calculation snippets above) method is to stringify the number and count the resulting string length, as in Anurag's answer, essentially: return n && n.toString(2).length; (assuming browsers can give a result of up to (at least) 53 bits).

Licensed under: CC-BY-SA with attribution