crc16 algorithm from C++ to bash
Question
I am trying to implement a CRC16 checksum in bash. I'm porting from an existing piece of C++ code. I'm almost there, but I am getting different answers.
I don't quite see why the checksums between the C++ code and the bash script are different.
Another set of eyes would be a big help.
Here is the C++ code:
uint16_t Encoder::checksum(std::string thestring)
{
uint8_t d, e, f;
uint16_t c, r, crccalc;
c = 0xffff;
for (unsigned int i = 0; i < thestring.length(); i++)
{
d = thestring[i];
e = c ^ d;
f = e ^ (e << 4);
r = (c >> 8) ^ (f << 8) ^ (f << 3) ^ (f >> 4);
c = r;
}
c ^= 0xffff;
crccalc = c;
return crccalc;
}
And here is my bash code:
function calc_crc16()
{
string=$1
while read -d "" -n 1 ; do astring+=( "$reply" ) ; done <<< "$string"
cnt=${#astring[@]}
c=0xffff
for ((x=0;x<$cnt;x++)); do
char=${astring[$x]}
e=$(($c ^ $char))
s=$(($e << 4))
f=$(($e ^ $s))
t1=$(($c >> 8))
t2=$(($f << 8))
t3=$(($f << 3))
t4=$(($f >> 4))
r1=$(($t1 ^ $t2 ^ $t3 ^ $t4))
c=$r1
done
c=$c ^ 0xffff
echo "checksum = $c"
}
Is it going to have something to do with the size of the ints? I'm guessing there's not much I can do about that in bash.
I'm getting an actual number, but it doesn't match the C++, which I know works correctly. does anyone see anything where I may be screwing things up?
Solution 3
ok. with Sorpigal's help, i've got a working version.
i suspect this could all be done within an awk script, which may run a lot faster. i may try that next.
thank you all for the help. i don't mean to steal the solution here, but i worked on it, and figure it is worth putting up.
anyway, here is a working version:
function calc_crc16()
{
while read -r -d "" -n 1 ; do astring+=( "$REPLY" ) ; done <<< "$1"
cnt=${#1}
c=65535
for ((x=0;x<$cnt;x++)); do
char=$(printf '%d' \'"${1:$x:1}")
e=$(((c ^ char) & 0x00FF))
s=$(((e << 4) & 0x00FF))
f=$(((e ^ s) & 0x00FF))
r1=$(((c >> 8) ^ (f << 8) ^ (f << 3) ^ (f >> 4)))
c=$r1
done
c=$((c ^ 0xffff))
echo "checksum = $c"
}
OTHER TIPS
First problem is near the top
while read -d "" -n 1 ; do astring+=( "$reply" ) ; done <<< "$string"
$reply
is wrong, since you didn't specify a variable name for read the name is $REPLY
.
Next error is at the end
c=$c ^ 0xffff
This should be
c=$(($c ^ 0xffff))
At least this way it will run without errors, correctness and appropriateness is something else.
Correctness issues: What if the input string has a space? This will break horribly. Always quote variable exapnsions
Change
char=${astring[$x]}
to
char="${astring[$x]}"
Strangely this rule is different inside $(())
constructs. Your bit operations should reference variables without any $
in these cases
e=$(( c ^ char ))
s=$(( e << 4 ))
f=$(( e ^ s ))
t1=$(( c >> 8 ))
t2=$(( f << 8 ))
t3=$(( f << 3 ))
t4=$(( f >> 4 ))
r1=$(( t1 ^ t2 ^ t3 ^ t4))
And later
c=$(( c ^ 0xffff ))
This will cause variables to be expanded and whitespace not to blow things up.
In general you should also pass -r
to read
, see help read
for what it does.
Why make an extra copy of $1
before processing it into an array? Using
while read -d "" -n 1 ; do astring+=( "$REPLY" ) ; done <<< "$1"
is sufficient.
It's probably not necessary to turn your input into an array before processing. Instead you can slice characters out of the string in your loop, which is closer to what the C++ version is doing. Replace
char="${astring[$x]}"
with
char="${1:$x:1}"
This is operating directly on the function paramter; since we're no longer making a copy of that we also need to get $cnt
a different way
cnt=${#1}
But you really have even bigger problems than this, like the fact that a character isn't an integer in bash. To convert you must use the following syntax:
printf '%d' \'a
where a
is the character to convert. Inserting this into the context of the script it would be
char=$(printf '%d' \'"${1:$x:1}")
Now we're getting somewhere, but I really must ask you to consider whether all of this is really worth it. Even if you can make it work, what do you gain?
just for future reference, here is the awk script that i came up with.
this works just as fast as the C++ code i have, which is basically instantaneous. the bash takes about 10 seconds to run for the same string. the awk is much faster.
function awk_calc_crc16()
{
output=$(echo $1 | awk 'function ord(c){return chmap[c];}
BEGIN{c=65535; for (i=0; i < 256; i++){ chmap[sprintf("%c", i)] = i;}}
{
split($0, chars, "");
for(i = 1; i <= length(chars); i++)
{
cval=ord(chars[i])
e=and(xor(c, ord(chars[i])), 0x00FF);
s=and(lshift(e, 4), 0x00FF);
f=and(xor(e, s), 0x00FF);
r=xor(xor(xor(rshift(c, 8), lshift(f, 8)), lshift(f, 3)), rshift(f, 4));
c=r;
}
}
END{c=xor(c, 0xFFFF); printf("%hu", c);}')
echo $output;
}
shorter (and faster) version
changes:
- the while loop at the beginning is not needed
- r1 is not needed
- cnt is not needed
- use capital letters for bash variables
- one less "\" (backslash) in char=$(printf …
- removed leading 0s for 0xFF
function calc_crc16() {
CRC=0xFFFF
for ((X=0; X<${#1}; X++)); do
CHAR=$(printf '%d' "'${1:$X:1}")
E=$(((CRC ^ CHAR) & 0xFF))
S=$(((E << 4) & 0xFF))
F=$(((E ^ S) & 0xFF))
CRC=$(((CRC >> 8) ^ (F << 8) ^ (F << 3) ^ (F >> 4)))
done
let CRC^=0xFFFF
printf "0x%X\n" $CRC
}