BSON how does a binary protocol work'? how do I read a (pseudo-)Backus-Naur-Form? [closed]

StackOverflow https://stackoverflow.com/questions/19722170

  •  02-07-2022
  •  | 
  •  

문제

I'm reading MongoDB specs and it uses data format BSON

Looking at the doc, I'd like to understand how the example BSON at the bottom of their page is encoded

{"hello": "world"}  →   "\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"

{"BSON": ["awesome", 5.05, 1986]}   →   "\x31\x00\x00\x00\x04BSON\x00\x26\x00 
 \x00\x00\x020\x00\x08\x00\x00 
 \x00awesome\x00\x011\x00\x33\x33\x33\x33\x33\x33
 \x14\x40\x102\x00\xc2\x07\x00\x00 
 \x00\x00"
도움이 되었습니까?

해결책

I think the question is essentially 'how does a binary protocol work'? Or `how do I read a (pseudo-)Backus-Naur-Form?

You can think of it like this: Your protocol consists of format information that is used to structure the data, and the data itself. What you see in JSON as an opening bracket {, for example, means something like "start a new (sub-)document".

Per definition, this 'command' is implicit and simply consists of the length of everything that is to follow, then the content (an e_list), then a \x00 terminator byte. So, since the document is 22 bytes long (that is 0x16 in hex), the 'command' is \x16\x00\x00\x00. Why the three \x00? Because we need an int32, i.e. a 32-bit integer so it must be padded to a full four bytes. Why \x16\x00\x00\x00 and not \x00\x00\x00\x16? This is called endianess and BSON uses little-endian.

Then comes the defintion of the content, the e_list. An e_list is defined as an element followed by another e_list, which can be empty and then terminates. An element is defined as the type of the value first, then the e_name, followed by the actual data. So, since the value of "hello" is "world", which is a string and strings are identified by a \x02 according to the spec, the \x02 comes next, followed by the e_name "hello" and a null terminator (hello\x00).

Now comes the actual value which is a string, which is defined as int32 (byte*) "\x00", i.e. the length of the string, the actual data and a null terminator (with the length including the null terminator), so the length becomes \x06\x00\x00\x00, followed by the actual data world\x00 and the \x00 terminator for the top-level BSON document.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top