Question

Protobuf's proto3 syntax made all fields optional, dropping the required and optional keywords from the earlier proto2 syntax. According to some comments from developers, this was done to improve forward/backward binary compatibility.

But it seems to me that compatibility could be enforced just by versioning the package names, say com.example.messages.v1, and letting clients implement the deserializers they understand. Meanwhile, dropping required removes contracts, stated in the type, that are useful from a software engineering point of view. For instance, if I have

syntax = "proto3";

message Location {
   double latitude = 1;
   double longitude = 2;
}

In proto3 it is possible to create a half-baked but perfectly valid Location by not providing one of the required fields.

Isn't that a big drawback when creating a schema-based serialization format for exchanging data between clients? Isn't it worse to move extra validation code into each client to check that all required fields have valid values?


Solution

proto3 makes a number of changes aimed (as I understand it) at making it far more usable in cross-platform scenarios. Explicit tracking of "assigned" vs "not assigned but reporting the default value" can be very hard to implement on some of the target platforms, and can also be confusing to use. As such, proto3 adopts a much simpler approach:

  • the implicit default value is the natural zero value (numbers / enums), false (booleans) or the empty string (strings)
  • only the implicit default is allowed; no other default values are permitted
  • if a field has that default value, it is not serialized; it doesn't matter whether it was assigned explicitly to zero / false / empty-string vs never assigned
  • because of this, there is no concept of "required", since "explicitly assigned a zero value" and "never assigned a value" look identical
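
To make the last two points concrete, here is a minimal sketch of the wire behaviour for the Location message from the question. It is not the protobuf library: encode_location and decode_location are hypothetical hand-rolled helpers that assume only the documented wire format for double fields (one tag byte, then 8 little-endian bytes) and the proto3 rule that a field holding its default is not written.

```python
import struct

ZERO_DOUBLE = b"\x00" * 8  # bit pattern of the default value 0.0


def encode_location(latitude=0.0, longitude=0.0):
    """Hypothetical encoder mimicking proto3 rules for Location."""
    out = bytearray()
    for field_number, value in ((1, latitude), (2, longitude)):
        payload = struct.pack("<d", value)  # double: 8 bytes, little-endian
        if payload == ZERO_DOUBLE:
            continue  # default value: nothing is written for this field
        out.append((field_number << 3) | 1)  # tag byte, wire type 1 (64-bit)
        out += payload
    return bytes(out)


def decode_location(data):
    """Hypothetical decoder: absent fields simply keep the default 0.0."""
    latitude = longitude = 0.0
    pos = 0
    while pos < len(data):
        field_number = data[pos] >> 3
        (value,) = struct.unpack_from("<d", data, pos + 1)
        pos += 9  # 1 tag byte + 8 payload bytes
        if field_number == 1:
            latitude = value
        elif field_number == 2:
            longitude = value
    return latitude, longitude


never_assigned = encode_location(latitude=51.5)
explicit_zero = encode_location(latitude=51.5, longitude=0.0)
print(never_assigned == explicit_zero)  # the two cases are byte-for-byte identical
print(decode_location(never_assigned))
```

Because both calls produce the same bytes, a receiver has no way to tell "longitude was never set" from "longitude was set to zero", which is why a required check has nothing to hook into.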

"In proto3 it is possible to create a half-baked but perfectly valid Location by not providing one of the required fields."

The other value is: zero. The fact that you didn't explicitly assign it to zero is moot. Whether or not this is desirable is up to you, but it makes sense to me and is how a lot of "initialize a new object / struct" works on a wide range of platforms.

"Isn't it worse to move extra validation code into each client to check that all required fields have valid values?"

There's nothing to validate! The layout is exactly what it would be if the value zero was explicitly assigned. If that is legal, it is legal. If it is illegal (because zero doesn't make sense to you), it is illegal; but it would be illegal whether it was explicit or implicit. The amount of validation involved doesn't change.
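
As a rough illustration of that point, the sketch below validates a decoded Location by value only. The Location dataclass, the validate function and the zero_is_legal flag are all hypothetical names invented for this example, standing in for whatever a generated class and an application rule would look like.

```python
from dataclasses import dataclass


@dataclass
class Location:
    # Hypothetical stand-in for a generated proto3 class:
    # fields that were never assigned report 0.0.
    latitude: float = 0.0
    longitude: float = 0.0


def validate(loc, zero_is_legal=True):
    """Return a list of problems; the rules look only at the values."""
    errors = []
    if not -90.0 <= loc.latitude <= 90.0:
        errors.append("latitude out of range")
    if not -180.0 <= loc.longitude <= 180.0:
        errors.append("longitude out of range")
    # Assumed application rule: some systems may decide zero is meaningless.
    if not zero_is_legal and 0.0 in (loc.latitude, loc.longitude):
        errors.append("zero coordinate not accepted")
    return errors


implicit = Location(latitude=51.5)                 # longitude never assigned
explicit = Location(latitude=51.5, longitude=0.0)  # longitude assigned zero

# Same verdict either way, whichever rule the application picks.
print(validate(implicit) == validate(explicit))
print(validate(implicit, zero_is_legal=False) == validate(explicit, zero_is_legal=False))
```

The validator never asks how the zero got there, so the amount of checking is the same as it would be with an explicitly assigned value.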

"Isn't that a big drawback when creating a schema-based serialization format for exchanging data between clients?"

Not usually, no... especially since the schema version is explicit. If you want to use proto2: use proto2. Nothing changes automatically.

Licensed under: CC-BY-SA with attribution