
Protocol buffers - unique numbered tag - clarification?

I'm using protocol buffers and everything is working fine, except that I don't understand why I need the numbered tags in the proto file:

    message SearchRequest {
      required string query = 1;
      optional int32 page_number = 2;
      optional int32 result_per_page = 3;
    }

Sure, I've read the docs:

As you can see, each field in the message definition has a unique numbered tag. These tags are used to identify your fields in the message binary format, and should not be changed once your message type is in use.

I didn't understand what difference it makes if I change it. (I will create a new proto and compile it, so why does it care?)

Another article states that:

Numbered fields in proto definitions obviate the need for version checks which is one of the explicitly stated motivations for the design and implementation of Protocol Buffers. As the developer documentation states, the protocol was designed in part to avoid “ugly code” like this for checking protocol versions:

    if (version == 3) {
      ...
    } else if (version > 4) {
      if (version == 5) {
        ...
      }
      ...
    }

Question

Is it just me, or is this completely unclear?

Let me ask it in a different way:

If I have a proto file like the one above, and then I change it to:

    message SearchRequest {
      required string query = 3; // reversed order
      optional int32 page_number = 2;
      optional int32 result_per_page = 1;
    }

Why does it care? I recompile and add the file (I've done it multiple times in the last week).

What am I missing? Can you please supply a human-to-human explanation of these numbered tags?

Royi Namir asked Nov 09 '14 08:11




1 Answer

The numbered tags are used to match fields when serializing and deserializing the data.
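
To make that concrete, here is a minimal Python sketch of how a field is laid out in the binary format, assuming the standard protobuf wire encoding: a key of (field_number << 3) | wire_type, followed by the value. The helper names are made up for illustration and are not part of any protobuf library. The thing to notice is that the field name never appears in the bytes, only the number.

```python
# Toy encoder for two protobuf wire types (hypothetical helpers, not a real API).
VARINT = 0            # wire type used for int32
LENGTH_DELIMITED = 2  # wire type used for string


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_int_field(number: int, value: int) -> bytes:
    """Key byte(s) carrying the field number, then the varint value."""
    return encode_varint((number << 3) | VARINT) + encode_varint(value)


def encode_string_field(number: int, text: str) -> bytes:
    """Key byte(s) carrying the field number, then length, then UTF-8 data."""
    data = text.encode("utf-8")
    key = encode_varint((number << 3) | LENGTH_DELIMITED)
    return key + encode_varint(len(data)) + data


# The same query value written as field 1 and as field 3.
as_field_1 = encode_string_field(1, "hi")
as_field_3 = encode_string_field(3, "hi")
print(as_field_1.hex(), as_field_3.hex())
```

The two outputs differ only in the leading key byte, which is where the field number lives. A reader has nothing else to go on when deciding which field a value belongs to.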

Obviously, if you change the numbering scheme, and apply this change to both serializer and deserializer, there is no issue.

Consider, though, what happens if you save data with the first numbering scheme and load it with the second: the deserializer would try to load query into result_per_page, and deserialization would likely fail.
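
A rough Python sketch of that mismatch, using a toy decoder rather than the real protobuf library. The bytes are what the first schema would produce for query = "cats", page_number = 2, result_per_page = 10 under the standard wire encoding; the decode helper and the schema dictionaries are hypothetical. Skipping a field whose wire type does not match the schema is this sketch's own choice, and a real parser may handle that case differently.

```python
def read_varint(buf: bytes, pos: int):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def decode(buf: bytes, schema: dict) -> dict:
    """schema maps field number -> (field name, expected wire type)."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:            # varint (int32)
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:          # length-delimited (string)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        entry = schema.get(number)
        if entry is not None and entry[1] == wire_type:
            fields[entry[0]] = value
        # Unknown number or mismatched wire type: skipped (sketch's choice).
    return fields


ORIGINAL = {1: ("query", 2), 2: ("page_number", 0), 3: ("result_per_page", 0)}
REVERSED = {3: ("query", 2), 2: ("page_number", 0), 1: ("result_per_page", 0)}
# Hypothetical third variant: only the two int32 fields trade numbers.
INTS_SWAPPED = {1: ("query", 2), 3: ("page_number", 0), 2: ("result_per_page", 0)}

# field 1 = "cats", field 2 = 2, field 3 = 10, as written by ORIGINAL.
saved = bytes.fromhex("0a0463617473" "1002" "180a")

print(decode(saved, ORIGINAL))      # all three fields come back
print(decode(saved, REVERSED))      # query is lost, so a required field is missing
print(decode(saved, INTS_SWAPPED))  # no error, but the two ints trade places
```

The last case is arguably the nastier one: when the renumbered fields happen to share a type, nothing fails and the values quietly end up in the wrong fields.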

Now, why is this useful? Let's say you need to add another field to your data, long after the schema is already in use:

    message SearchRequest {
      required string query = 1;
      optional int32 page_number = 2;
      optional int32 result_per_page = 3;
      optional int32 new_data = 4;
    }

Because each field carries an explicit number, the deserializer can still load data that was serialized with the old schema: fields 1 to 3 are matched by number as before, and new_data, which is absent from the old data, is simply left unset.
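
The same idea as a small Python sketch, again with a hand-rolled toy parser instead of the protobuf library. The parse and view helpers and the V1/V2 dictionaries are assumptions made for illustration; the bytes follow the standard wire encoding for the two schemas. It also shows the opposite direction, an old reader skipping a field number it does not know.

```python
def parse(buf: bytes) -> dict:
    """Return {field_number: value} for varint and length-delimited fields."""
    def varint(pos: int):
        value = shift = 0
        while True:
            byte = buf[pos]
            pos += 1
            value |= (byte & 0x7F) << shift
            shift += 7
            if byte < 0x80:  # high bit clear: last byte of the varint
                return value, pos

    fields, pos = {}, 0
    while pos < len(buf):
        key, pos = varint(pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:        # int32
            value, pos = varint(pos)
        elif wire_type == 2:      # string
            length, pos = varint(pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields[number] = value
    return fields


def view(buf: bytes, names: dict) -> dict:
    """Keep only the field numbers this schema version knows about."""
    return {names[n]: v for n, v in parse(buf).items() if n in names}


V1 = {1: "query", 2: "page_number", 3: "result_per_page"}
V2 = {**V1, 4: "new_data"}  # same numbers as before, plus field 4

old_bytes = bytes.fromhex("0a0463617473" "1002" "180a")  # written with V1
new_bytes = old_bytes + bytes.fromhex("2007")            # V2 adds new_data = 7

print(view(old_bytes, V2))  # new reader, old data: new_data is just absent
print(view(new_bytes, V1))  # old reader, new data: field 4 is skipped
```

Neither side needs a version check: each reader picks out the numbers it knows and leaves the rest alone, which only works as long as numbers 1 to 3 keep their original meaning.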

Rotem answered Sep 24 '22 16:09