I think the question is essentially 'how does a binary protocol work?' or 'how do I read a (pseudo-)Backus-Naur form?'
You can think of it like this: your protocol consists of format information that is used to structure the data, and the data itself. What you see in JSON as an opening brace `{`, for example, means something like "start a new (sub-)document".
By definition, this 'command' is implicit in BSON: a document simply consists of its total length in bytes (counting the four length bytes themselves and the terminator), then the content (an `e_list`), then a `\x00` terminator byte. So, since the document is 22 bytes long (that is 0x16 in hex), the 'command' is `\x16\x00\x00\x00`. Why the three `\x00`? Because the length is an int32, i.e. a 32-bit integer, so it must be padded to a full four bytes. Why `\x16\x00\x00\x00` and not `\x00\x00\x00\x16`? This is called endianness, and BSON uses little-endian, i.e. the least significant byte comes first.
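To see the padding and byte order for yourself, here is a minimal sketch using Python's standard `struct` module. The variable names are just for illustration; the format codes `<i` and `>i` mean little-endian and big-endian signed 32-bit integers.

```python
import struct

# Pack the document length 22 as a little-endian int32, as BSON does.
length_prefix = struct.pack("<i", 22)

# The same value in big-endian order, for comparison only (not what BSON uses).
big_endian = struct.pack(">i", 22)

# Reading the prefix back yields the original integer.
(decoded_length,) = struct.unpack("<i", length_prefix)

print(length_prefix, big_endian, decoded_length)
```

The little-endian form puts the `\x16` byte first, which is exactly the four bytes at the start of the example document.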
Then comes the definition of the content, the `e_list`. An `e_list` is defined as an `element` followed by another `e_list`, or as nothing at all, which is what ends the recursion. An `element` is defined as the type of the value first, then the `e_name`, followed by the actual data. So, since the value of "hello" is "world", which is a string, and strings are identified by a `\x02` according to the spec, the `\x02` comes next, followed by the `e_name` "hello" and a null terminator (`hello\x00`).
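As a rough sketch of that part of the element (type byte plus `e_name`), assuming only string values: `encode_cstring` and `STRING_TYPE` are hypothetical names made up for this illustration, not part of any BSON library.

```python
# Type byte that the BSON spec assigns to UTF-8 strings.
STRING_TYPE = b"\x02"


def encode_cstring(name):
    """Hypothetical helper: UTF-8 bytes of the name plus a null terminator."""
    raw = name.encode("utf-8")
    if b"\x00" in raw:
        # A cstring is terminated by the first null byte, so it cannot contain one.
        raise ValueError("e_name must not contain null bytes")
    return raw + b"\x00"


# Type byte first, then the null-terminated key.
element_header = STRING_TYPE + encode_cstring("hello")
print(element_header)
```

Note that the key has no length prefix; a reader finds its end by scanning for the `\x00` byte.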
Now comes the actual value, which is a string. A string is defined as `int32 (byte*) "\x00"`, i.e. the length of the string, the actual data, and a null terminator (with the length including the null terminator). "world" is five bytes plus the terminator, so the length becomes `\x06\x00\x00\x00`, followed by the actual data `world\x00`. Finally comes the `\x00` terminator for the top-level BSON document.
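Putting the pieces together, the following is a minimal sketch that encodes and decodes a flat document whose values are all strings. It is not a full BSON implementation: the function names are hypothetical, and every other element type is deliberately left out.

```python
import struct


def encode_string_document(pairs):
    """Encode a flat dict of str -> str as a BSON document (string values only)."""
    e_list = b""
    for name, value in pairs.items():
        data = value.encode("utf-8") + b"\x00"        # string bytes + terminator
        e_list += b"\x02" + name.encode("utf-8") + b"\x00"  # type byte + e_name
        e_list += struct.pack("<i", len(data)) + data  # length includes the terminator
    total = 4 + len(e_list) + 1                        # length prefix + e_list + final \x00
    return struct.pack("<i", total) + e_list + b"\x00"


def decode_string_document(buf):
    """Decode a document produced by encode_string_document."""
    (total,) = struct.unpack_from("<i", buf, 0)
    if total != len(buf) or buf[-1] != 0:
        raise ValueError("malformed document")
    pos = 4
    result = {}
    while pos < total - 1:                             # stop before the document terminator
        if buf[pos] != 0x02:
            raise ValueError("only string elements are handled in this sketch")
        pos += 1
        end = buf.index(b"\x00", pos)                  # e_name runs up to the next null byte
        name = buf[pos:end].decode("utf-8")
        pos = end + 1
        (strlen,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        result[name] = buf[pos:pos + strlen - 1].decode("utf-8")  # drop the terminator
        pos += strlen
    return result


doc = encode_string_document({"hello": "world"})
assert doc == b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
assert decode_string_document(doc) == {"hello": "world"}
```

The byte string in the first assertion is the 22-byte document walked through above: 4 bytes of length, 1 type byte, 6 bytes of key, 4 bytes of string length, 6 bytes of value, and 1 terminator.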