Is there a way to get the maximum size of a given protobuf message once it is serialized?
I'm referring to messages that don't contain "repeated" elements.
Note that I'm not referring to the size of a protobuf message with a specific content, but to the maximum possible size that it can get to (in the worst case).
Protobuf serialization code is generated by the protoc compiler, which parses the .proto file and outputs source files for the language selected by its arguments, in this case C++. You can find more information in the compiler invocation section.
No it does not; there is no "compression" as such specified in the protobuf spec; however, it does (by default) use "varint encoding" - a variable-length encoding for integer data that means small values use less space; so 0-127 take 1 byte plus the header.
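To make the varint behaviour concrete, here is a minimal sketch of base-128 varint encoding as described in the protobuf encoding documentation. The function names encode_varint and encode_int32_payload are made up for this illustration and are not part of any protobuf library; the sketch covers only the integer payload, not the field tag ("header").

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style base-128 varint."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        low_bits = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low_bits)         # last byte: continuation bit clear
            return bytes(out)


def encode_int32_payload(value: int) -> bytes:
    """Encode an int32/int64 payload; negatives are sign-extended to 64 bits."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


if __name__ == "__main__":
    for sample in (0, 127, 128, 300, -1):
        print(sample, len(encode_int32_payload(sample)))
```

Values 0 through 127 fit in a single byte, while a negative int32 is sign-extended to 64 bits and therefore needs the full 10 bytes.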
Benchmark: telemetry data
We copied the proto files and data to the benchmark, and got the following results:
These were the results we expected: for this data, protobuf is actually slower than JSON.
When using Protobuf on a non-compressed environment, the requests took 78% less time than the JSON requests. This shows that the binary format performed almost 5 times faster than the text format. And, when issuing these requests on a compressed environment, the difference was even bigger.
In general, any Protobuf message can be any length due to the possibility of unknown fields.
If you are receiving a message, you cannot make any assumptions about the length.
If you are sending a message that you built yourself, then you can perhaps assume that it only contains fields you know about -- but then again, you can also easily compute the exact message size in this case.
Thus it's usually not useful to ask what the maximum size is.
With that said, you could write code that uses the Descriptor interfaces to iterate over the FieldDescriptors for a message type (MyMessageType::descriptor()).
See: https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor
Similar interfaces exist in Java, Python, and probably others.
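Here are the rules to implement:
Each field is composed of a tag followed by some data.
For the tag:
The tag is a varint holding the field number and wire type, so field numbers 1 through 15 have a 1-byte tag and field numbers 16 through 2047 have a 2-byte tag; larger field numbers need more.
For the data:
bool is always one byte.
int32, int64, uint64, and sint64 have a maximum data length of 10 bytes (yes, int32 can be 10 bytes if it is negative, unfortunately).
sint32 and uint32 have a maximum data length of 5 bytes.
fixed32, sfixed32, and float are always exactly 4 bytes.
fixed64, sfixed64, and double are always exactly 8 bytes.
A sub-message field has a maximum data length equal to the sub-message's own maximum length plus a varint length prefix.
If your message contains any of the following, then its maximum length is unbounded:
string or bytes. (Unless you know their max length, in which case, it's that max length plus a length prefix, like with sub-messages.)
repeated fields. (Unless you know the maximum number of elements, and unless the field is declared with [packed=true], in which case you'll have to look up the details.)
As far as I know, there is no feature to calculate the maximum size in Google's own protobuf.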
The Nanopb generator computes the maximum size when possible and exports it as a #define in the generated file.
It is also quite simple to calculate manually for small messages, based on the protobuf encoding documentation.
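As an illustration of the manual calculation, the per-field rules from the answer above can be turned into a small calculator. This is only a sketch under stated assumptions: the message has no unknown fields, no repeated fields, no enums or sub-messages, and each field appears at most once. The names MAX_DATA_BYTES, tag_size, and max_message_size are hypothetical helpers, not part of protobuf or Nanopb, and the field list is written by hand rather than read from a Descriptor.

```python
# Worst-case payload size in bytes for each scalar type, per the rules above.
MAX_DATA_BYTES = {
    "bool": 1,
    "int32": 10, "int64": 10, "uint64": 10, "sint64": 10,
    "sint32": 5, "uint32": 5,
    "fixed32": 4, "sfixed32": 4, "float": 4,
    "fixed64": 8, "sfixed64": 8, "double": 8,
}

# Types whose length is unbounded unless a maximum is known from elsewhere.
UNBOUNDED_TYPES = {"string", "bytes"}


def varint_size(value):
    """Number of bytes needed to encode a non-negative integer as a varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def tag_size(field_number):
    """Size of a field tag; the tag is the varint of (field_number << 3) | wire_type."""
    # The wire type only occupies the low 3 bits, so it never changes the length.
    return varint_size(field_number << 3)


def max_message_size(fields):
    """Return the worst-case serialized size, or None if it is unbounded.

    `fields` is an iterable of (field_number, type_name) pairs.
    """
    total = 0
    for field_number, type_name in fields:
        if type_name in UNBOUNDED_TYPES:
            return None
        if type_name not in MAX_DATA_BYTES:
            raise ValueError(f"type not handled by this sketch: {type_name}")
        total += tag_size(field_number) + MAX_DATA_BYTES[type_name]
    return total


if __name__ == "__main__":
    # Hypothetical message: int32 id = 1; bool active = 2; double score = 16;
    example = [(1, "int32"), (2, "bool"), (16, "double")]
    print(max_message_size(example))  # (1 + 10) + (1 + 1) + (2 + 8) = 23
```

A message received from an untrusted sender can still exceed this bound, because unknown fields are not accounted for.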