You can work this out step by step. Let's skip the handling of whitespace, equals signs and invalid characters, as well as the padding code at the end, and focus on the loop and the default clause:
    size_t buf = 1;
    while (in < end) {
        unsigned char c;
        /* read next byte */
        c = d[*in++];
        /* append byte to number */
        buf = buf << 6 | c;
        /* If the buffer is full, split it into bytes */
        if (buf & 0x1000000) {
            *out++ = buf >> 16;
            *out++ = buf >> 8;
            *out++ = buf;
            buf = 1;
        }
    }
The input is read byte by byte, and each looked-up value c is appended to buf. The input comes in 6-bit chunks and the output should be 8-bit chunks, i.e. bytes. (Illegal input characters are those whose value c has any of the top two bits set.)
The idea is to use buf as an auxiliary buffer that stores four six-bit values until it is full. Then, write the contents of that buffer out as three eight-bit values.
We start with buf == 1:
.... .... .... .... .... .... .... ...1
Empty bits are represented as dots here, which is easier to read than zeros. The 1 is the sentinel value. Okay, read the next byte, denoted by a. Shift the buffer by six places:
.... .... .... .... .... .... .1.. .... // buf = buf << 6
and do a bitwise or with the data:
.... .... .... .... .... .... .1aa aaaa // buf = buf | 'a'
Okay, next byte, 'b':
.... .... .... .... ...1 aaaa aa.. .... // buf = buf << 6
.... .... .... .... ...1 aaaa aabb bbbb // buf = buf | 'b'
Next byte, 'c':
.... .... .... .1aa aaaa bbbb bb.. .... // buf = buf << 6
.... .... .... .1aa aaaa bbbb bbcc cccc // buf = buf | 'c'
And 'd':
.... ...1 aaaa aabb bbbb cccc cc.. .... // buf = buf << 6
.... ...1 aaaa aabb bbbb cccc ccdd dddd // buf = buf | 'd'
Now check whether the buffer is full. (This check is done after every byte read, but I've left it out for clarity.) It is done by bitwise anding buf with 0x1000000:
.... ...1 aaaa aabb bbbb cccc ccdd dddd // buf
.... ...1 .... .... .... .... .... .... // 0x1000000
.... ...1 .... .... .... .... .... .... // buf & 0x1000000
This value is now nonzero, i.e. true, for the first time, which means we've read four six-bit chunks and need to write the data out as three eight-bit chunks:
.... .... .... .... .... ...1 aaaa aabb // buf >> 16
.... .... .... ...1 aaaa aabb bbbb cccc // buf >> 8
.... ...1 aaaa aabb bbbb cccc ccdd dddd // buf
These values are written to bytes, i.e. unsigned chars, which will truncate them to the lowest eight bits:
---- ---- ---- ---- ---- ---- aaaa aabb // (uchar) (buf >> 16)
---- ---- ---- ---- ---- ---- bbbb cccc // (uchar) (buf >> 8)
---- ---- ---- ---- ---- ---- ccdd dddd // (uchar) buf
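To make the truncation step concrete, here is a minimal sketch. The helper name group_byte is made up for illustration and is not part of the original code. It assumes 8-bit chars, so converting to unsigned char keeps only the lowest eight bits and drops everything above them, including the sentinel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in the original code): returns output byte
   `index` (0, 1 or 2) of a full buffer that still contains the sentinel. */
unsigned char group_byte(uint32_t buf, int index)
{
    assert(index >= 0 && index < 3);

    /* Shift the wanted byte down into the lowest eight bits. The
       conversion to unsigned char then discards all higher bits,
       which is the same as masking with 0xFF when chars are 8 bits. */
    return (unsigned char)(buf >> (16 - 8 * index));
}
```

For a full buffer of 0x14D616E (sentinel at bit 24, payload 0x4D616E), the three calls give 0x4D, 0x61 and 0x6E.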
Now, reset buf to 1 and read the next bytes.
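If you want to try the walkthrough yourself, the following is a stripped-down sketch of the same loop. It is not the original function: pack_sextets is a made-up name, the input is assumed to be six-bit values that have already been looked up (so there is no table d), and an incomplete trailing group is simply ignored because padding is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: packs six-bit values (0..63) into bytes using the
   sentinel bit described above. Returns the number of bytes written.
   A trailing group of fewer than four values is not written out. */
size_t pack_sextets(const unsigned char *in, size_t n, unsigned char *out)
{
    unsigned char *start = out;
    uint32_t buf = 1;                     /* only the sentinel bit is set */

    for (size_t i = 0; i < n; i++) {
        assert(in[i] < 64);               /* top two bits must be clear */
        buf = buf << 6 | in[i];           /* append six bits */

        if (buf & 0x1000000) {            /* sentinel has reached bit 24 */
            *out++ = (unsigned char)(buf >> 16);
            *out++ = (unsigned char)(buf >> 8);
            *out++ = (unsigned char)buf;
            buf = 1;                      /* start the next group */
        }
    }
    return (size_t)(out - start);
}
```

The six-bit values 19, 22, 5 and 46 (the Base64 characters "TWFu") come out as the three bytes 'M', 'a', 'n'.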