Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Protocol Buffers - Reading header (nested message) common across all messages

I am currently evaluating Protocol Buffers for use in a project (no code written as of yet). One of the things I'm unclear on is how you would read part of an encoded message, for example say I have a common header:

message Header {
  required uint16 msg_type = 1;
  required uint16 length = 2;
}

And say I deliver multiple different messages to a queue. How would the consumer work out how much data to read per message and what message type is should be constructed as?

like image 548
Graeme Avatar asked Nov 13 '12 14:11

Graeme


People also ask

What are protocol buffers used for?

Protocol buffers, usually referred as Protobuf, is a protocol developed by Google to allow serialization and deserialization of structured data. Google developed it with the goal to provide a better way, compared to XML, to make systems communicate.

How does Protobuf encoding work?

A protocol buffer message is a series of key-value pairs. The binary version of a message just uses the field's number as the key -- the name and declared type for each field can only be determined on the decoding end by referencing the message type's definition (that is, the . proto file).

What is protocol buffers gRPC?

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.

How does Protobuf serialize?

The Protobuf serialization mechanism is given through the protoc application, this compiler will parse the . proto file and will generate as output, source files according to the configured language by its arguments, in this case, C++. You can also obtain more information about, reading the section compiler invocation.


1 Answers

There should be no need for a Header message here; the most common approach is to follow the "streaming" advice from here. Within that, you could either treat it as a sequence of identical union type messages, or (my preference) when writing, instead of just writing a length-prefix before each, include a varint that indicates the message type then the length (as a varint). The number that indicates the message type is some arbitrary map you invent, so 1 = Foo, 2 = Bar, 3 = Blap, etc). If you left-shift the message-type by 3 bits then "or" 2, then it will also be a well-formed protobuf stream itself, 100% identical to a repeated YourUnionType.

Basically, this is exactly the same as this answer, but instead of being field 1 each time, the number varies per message-type. Most implementations have a reader/writer API that make it possible to read and write raw varints, and to length-restrict the reader API. Some implementations have helper mechanisms to support streams of heterogeneous messages directly (basically, doing all the above for you).

like image 56
Marc Gravell Avatar answered Sep 18 '22 01:09

Marc Gravell