Question

Before I start, a disclaimer: while I can find my way around C/C++ code, I am no wizard, nor have I ever done enough programming in it to call myself a capable programmer.

I'm trying to use CRC32C to validate data coming into our servers from the browser. Currently both sides use the same code (nodeJS on the server), but we would like to switch to a hardware implementation (blog post, github repo) when available, and for that I need a correctly functioning version in the browser.

I tried to go with this implementation (and another, internally developed, but also not working), substituting what I believed was the correct polynomial (0x82F63B78 instead of 0xEDB88320, and also 0x1EDC6F41 and 0x8F6E37A0), but no polynomial that I used produced the correct output.

Continuing my research, I found a post from Mark Adler which includes a software implementation and decided to try converting it to Javascript (to the best of my understanding of C).

The result:

function crc32c_table_intel() {
    var POLY = 0x82f63b78;
    var n, crc, k;
    var crc32c_table = gen2darr(8, 256, 0);

    for (n = 0; n < 256; n++) {
        crc = n;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;
        crc32c_table[0][n] = crc;
    }
    for (n = 0; n < 256; n++) {
        crc = crc32c_table[0][n];
        for (k = 1; k < 8; k++) {
            crc = crc32c_table[0][crc & 0xff] ^ (crc >> 8);
            crc32c_table[k][n] = crc;
            _crc_tmptable.push(crc32c_table[k][n]);
        }
    }
    return crc32c_table;
}


function crc32c_sw(crci, str) {
    var len = str.length;
    var crc;
    var crc32c_table = crc32c_table_intel();

    crc = crci ^ 0xffffffff;
    for (var next = 0; next < 7; next++) { // was: while (len && ((uintptr_t)next & 7) != 0) {
        crc = crc32c_table[0][(crc ^ str.charCodeAt(next++)) & 0xff] ^ (crc >> 8);
        len--;
    }
    while (len >= 8) {
        // was: crc ^= *(uint64_t *)next;
        crc ^= str.charCodeAt(next);
        crc = crc32c_table[7][crc & 0xff] ^
              crc32c_table[6][(crc >> 8) & 0xff] ^
              crc32c_table[5][(crc >> 16) & 0xff] ^
              crc32c_table[4][(crc >> 24) & 0xff] ^
              crc32c_table[3][(crc >> 32) & 0xff] ^
              crc32c_table[2][(crc >> 40) & 0xff] ^
              crc32c_table[1][(crc >> 48) & 0xff] ^
              crc32c_table[0][crc >> 56];
        next += 1;
        len -= 1;
    }
    while (len) {
        // was: crc = crc32c_table[0][(crc ^ *next++) & 0xff] ^ (crc >> 8);
        crc = crc32c_table[0][(crc ^ str.charCodeAt(next++)) & 0xff] ^ (crc >> 8);
        len--;
    }
    return crc ^ 0xffffffff;
}


// a helper function
function gen2darr(rows, cols, defaultValue) {
    var arr = [];
    for (var i = 0; i < rows; i++) {
        arr.push([]);
        arr[i].push(new Array(cols));
        for (var j = 0; j < cols; j++) {
            arr[i][j] = defaultValue;
        }
    }
    return arr;
}

Still, no luck. No matter what function, what table, or what polynomial I use, the results do not match:

SSE4.2: 606105071
JS (example): 1249991249 

Then I started thinking that it must be something to do with the conversion from Javascript strings to C/C++ data, and I see that the nodeJS implementation uses UTF-8 (https://github.com/Voxer/sse4_crc32/blob/master/src/sse4_crc32.cpp#L56) while Javascript strings use UCS-2 encoding.
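For reference, here is a minimal sketch (which I have not actually verified) of how I imagine the same UTF-8 byte sequence could be produced on both sides, assuming the browser supports TextEncoder:

// Browser: encode the string to UTF-8 bytes before hashing
var browserBytes = new TextEncoder().encode("héllo"); // Uint8Array of UTF-8 byte values

// Node.js: Buffer.from with 'utf8' produces the same byte values
var nodeBytes = Buffer.from("héllo", "utf8");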

Now, the questions that I have are these:

  1. Is either of these functions valid? The first seems to be; for the one I posted, I am not sure whether I translated all the bitwise operations correctly.
  2. How do I get around the encoding issues? Is this even an encoding issue, as I suspect? Does anyone have any other ideas on how to ensure that the nodeJS HW implementation and the client-side implementation return the same output?

Thanks for any ideas!


Solution

See this answer for a compatible software implementation and a fast implementation using the hardware instruction.

You have a few problems there. One is that in Javascript you need to use a logical right shift >>> instead of an arithmetic right shift >>. Second is that you are using charCodeAt, which returns the Unicode value of a character, which may be more than one byte. The CRC algorithm operates on a sequence of bytes, not a sequence of Unicode characters. Third is that you're computing the same table every time -- the table should only be computed once. Last is that you're jumping straight to a complicated implementation.

As a simple example, this will compute a CRC-32C in Javascript on an array of values expected to be integers in the range 0..255, i.e. bytes:

function crc32c(crc, bytes) {
  var POLY = 0x82f63b78;
  var n;

  crc ^= 0xffffffff;
  for (n = 0; n < bytes.length; n++) {
    crc ^= bytes[n];
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
    crc = crc & 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
  }
  return crc ^ 0xffffffff;
}
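For example, a minimal usage sketch that hashes a string's UTF-8 bytes rather than its UTF-16 code units, assuming TextEncoder is available in the browser:

// Encode the string to UTF-8 bytes so both sides hash the same byte sequence
var bytes = new TextEncoder().encode("123456789");
var crc = crc32c(0, bytes) >>> 0;   // >>> 0 forces an unsigned 32-bit result
console.log(crc.toString(16));      // standard CRC-32C check value: "e3069283"

Hashing the same UTF-8 bytes on both sides should then produce matching results.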