Question

message Data {
   optional uint64 userid = 1;
   repeated string usernames = 2;
   optional uint32 status = 3;
}

Case 1 - 'status' contains a non-zero value

If status is non-zero and I use the data__pack(const Data *message, uint8_t *out) function to pack the data, the length of 'out' is 100 and the function returns 100 (the value 100 is just an example).

Case 2 - 'status' contains value 0

If status = 0 (with the same values of userid and usernames as in Case 1) and I use the data__pack() function to pack the data, the length of 'out' is 99, but the function still returns 100.

As the two cases above show, when 'status' is 0 the length of the output buffer is always 1 less than it would be if 'status' held a non-zero value. I have set the has_status variable to 1, so that doesn't seem to be the issue.

Is this a protobuf bug or am I doing something wrong?

Additional details: Protobuf version I'm using - protobuf-c-0.15

Additional details on the issue

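Data msg = DATA__INIT;
uint8_t *buf = NULL;
size_t buf_len = 0;

/* Ask protobuf-c how many bytes the packed message needs. */
buf_len = data__get_packed_size(&msg);
buf = malloc(buf_len + 1);

/* Serialize into buf; the return value is the number of bytes written. */
data__pack(&msg, buf);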

I use data__get_packed_size() to obtain the size. While sending over an HTTP request to the intended receiver, I send a Content-Length of 'buf_len', and 'buf' is used for the HTTP body.

Content-Length: 100
Content-Type: application/protobuf

<Body containing the protobuf encoded data - seems to have 99 bytes>

The receiver now continues to wait until it receives the complete request (i.e. 100 bytes), and the connection eventually times out. In this case, how would the receiver know how to handle this, since it only waits for the Content-Length of 100 bytes to be received?

Solution

Encoding a uint32 field requires two varints, each of which is at least one byte long. The first varint contains the encoding ("wire") type and the tag number, and is one byte long if the tag is at most 15. The second varint is the integer itself, which is one byte long if the integer is at most 127. An omitted ("defaulted") optional field does not occupy any bytes at all: omitted fields are not encoded on the wire in any way, and their absence must be deduced by the decoder.
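The size rule above can be sketched in a few lines of C. This is only an illustration of the wire-format arithmetic, not code taken from protobuf-c; the helper names varint_size and uint32_field_size are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed by a base-128 varint: 7 payload bits per byte. */
static size_t varint_size(uint64_t value)
{
    size_t n = 1;
    while (value >= 0x80) {
        value >>= 7;
        n++;
    }
    return n;
}

/* Wire size of a uint32 field that is present in the message:
 * one varint for the key plus one varint for the value.
 * The key is (tag << 3) | wire_type, and wire type 0 means "varint". */
static size_t uint32_field_size(uint32_t tag, uint32_t value)
{
    uint64_t key = ((uint64_t)tag << 3) | 0u;
    return varint_size(key) + varint_size(value);
}
```

With these helpers, uint32_field_size(3, 0) and uint32_field_size(3, 42) both come out as 2 bytes, while a tag of 16 or a value of 128 pushes the total to 3.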

Consequently, if an optional uint32 field is omitted, the wire encoding will be at least two bytes shorter than the same protobuf with the optional field included.

You don't indicate how you are computing "the length of 'out'". The only correct way is to call data__get_packed_size prior to packing the data object. I'm guessing, however, that you are using strlen(out), which will not produce the correct result. strlen can only be used on strings; more specifically, it can only be used on NUL-terminated strings, because it stops counting when it hits a NUL (0) byte, and produces undefined behaviour if it does not see a NUL byte. You must not use strlen on arbitrary binary data.

In the protobuf encoding, an explicit uint32 with a tag of 3 and a value of 0 will be encoded as 0x18 0x00, while an explicit uint32 with a tag of 3 and a value of 42 will be encoded as 0x18 0x2A. If there were no NULs earlier in the data encoding, then the second byte of the encoding of 0 will terminate strlen's count. Under certain ideal circumstances (say, the buffer is more than long enough, and is cleared to all NULs prior to packing the message into it), strlen will report the length of the protobuf with 0 as one "character" shorter than the protobuf with 42, because the NUL in the encoding of the 0 occurs at the end of the message, causing strlen to stop counting exactly one byte early.
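The effect can be reproduced without protobuf-c at all. The sketch below writes the wire bytes by hand for a made-up message (tag 1 set to 5, followed by tag 3 as the status) into a zero-filled buffer and then misuses strlen on it on purpose. The helper names fill_message and strlen_of_message are hypothetical, and the byte layout is an assumption chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hand-written wire bytes: field 1 (varint) = 5, then field 3 (varint) = status.
 * Returns the real number of bytes written. status must be below 128
 * so that it fits in a single varint byte. */
static size_t fill_message(uint8_t *buf, uint8_t status)
{
    size_t n = 0;
    buf[n++] = 0x08;   /* key: tag 1, wire type 0 */
    buf[n++] = 0x05;   /* value 5 */
    buf[n++] = 0x18;   /* key: tag 3, wire type 0 */
    buf[n++] = status; /* value of the status field */
    return n;
}

/* Deliberately wrong: measures the message with strlen.
 * The buffer is zero-filled first, which is the "ideal circumstance"
 * under which strlen happens to stop instead of running off the end. */
static size_t strlen_of_message(uint8_t status)
{
    uint8_t buf[16] = {0};
    fill_message(buf, status);
    return strlen((const char *)buf);
}
```

Here fill_message reports 4 bytes for either status value, but strlen_of_message(42) gives 4 and strlen_of_message(0) gives 3, which matches the one-byte discrepancy described in the question.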

That's an explanation, not a workaround. DO NOT use strlen on things which are not NUL-terminated strings.
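For the HTTP case in the question, one possible approach is to carry the length returned by data__get_packed_size alongside the buffer and copy the body with memcpy rather than any string function. The following is a minimal sketch under that assumption; build_request, the /data path, and the example.invalid host are hypothetical and not part of protobuf-c or the original code.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes the request headers followed by body_len raw bytes into out.
 * Returns the total request length, or 0 if out is too small.
 * The body is treated as binary data, so embedded NUL bytes are kept. */
static size_t build_request(char *out, size_t out_cap,
                            const uint8_t *body, size_t body_len)
{
    int hdr = snprintf(out, out_cap,
                       "POST /data HTTP/1.1\r\n"           /* hypothetical path */
                       "Host: example.invalid\r\n"         /* hypothetical host */
                       "Content-Type: application/protobuf\r\n"
                       "Content-Length: %zu\r\n"
                       "\r\n",
                       body_len);
    if (hdr < 0 || (size_t)hdr >= out_cap || out_cap - (size_t)hdr < body_len)
        return 0;
    memcpy(out + hdr, body, body_len); /* copies NUL bytes too */
    return (size_t)hdr + body_len;
}
```

The returned total, not strlen(out), is the number of bytes that would then be handed to the socket write call, so the body sent matches the advertised Content-Length even when it ends in a 0 byte.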

Licensed under: CC-BY-SA with attribution