google protobuf maximum size

I have some repeated elements in my protobuf message, so at runtime the message could be any length. I have seen related questions already, such as "Maximum serialized Protobuf message size".

  1. My question is slightly different. If my JMS (Java Message Service) provider (in this case a WebLogic or TIBCO JMS server) does not impose any limit on the maximum message size, will the protocol buffer compiler complain about the message size at all?
  2. Does encoding/decoding performance suffer badly at large sizes (around 10MB)?

Robin Bajaj asked Dec 07 '15 08:12


2 Answers

10MB is pushing it but you'll probably be OK.

Protobuf has a hard limit of 2GB, because many implementations use 32-bit signed arithmetic. For security reasons, many implementations (especially the Google-provided ones) impose a size limit of 64MB by default, although you can increase this limit manually if you need to.
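
To make the two limits concrete, here is a minimal Python sketch of the arithmetic. The constant and function names are hypothetical and are not part of any protobuf API, and the 64MB default is simply the figure quoted above, which may differ between implementations and versions.

```python
# Largest length a signed 32-bit integer can hold: one byte short of 2 GiB.
INT32_MAX = 2**31 - 1

# Default parse limit quoted in this answer (assumption: varies by implementation).
DEFAULT_LIMIT = 64 * 1024 * 1024


def within_limit(size_bytes, limit=DEFAULT_LIMIT):
    """Return True if a serialized message of size_bytes fits under both limits."""
    # Raising the configurable limit never lifts the signed 32-bit ceiling.
    return 0 <= size_bytes <= min(limit, INT32_MAX)
```

With these numbers a 10MB message passes the default check, while a 100MB message passes only after the limit is raised.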

The implementation will not "slow down" with large messages per se, but the problem is that you must always parse an entire message at once before you can start using any of the content. This means the entire message must fit into RAM (keeping in mind that, after parsing, the in-memory message objects are much larger than the original serialized message), and even if you only care about one field you have to wait for the whole thing to parse.

Generally I recommend trying to limit yourself to 1MB as a rule of thumb. Beyond that, think about splitting the message up into multiple chunks that can be parsed independently. However, every application is different: for some, 10MB is no big deal, for others 1MB is already way too large. You'll have to profile your own app to find out.
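
One common way to split a large payload into independently parseable chunks is to prefix each chunk with its length as a varint, so a reader can pull one chunk at a time off a stream. The sketch below uses only the Python standard library and treats each chunk as opaque bytes standing in for a serialized message. The function names are hypothetical and this is not the protobuf library's own API, although the Java library's `writeDelimitedTo` and `parseDelimitedFrom` methods work along similar lines.

```python
import io


def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("varint length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Read one varint; return None on a clean end of stream."""
    shift = 0
    result = 0
    while True:
        b = stream.read(1)
        if not b:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        result |= (b[0] & 0x7F) << shift
        if not b[0] & 0x80:
            return result
        shift += 7


def write_chunks(stream, chunks):
    """Write each chunk as <varint length><payload>."""
    for chunk in chunks:
        stream.write(encode_varint(len(chunk)))
        stream.write(chunk)


def read_chunks(stream):
    """Yield payloads one at a time, so only one chunk is held in memory."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("truncated chunk")
        yield payload
```

Each payload yielded by `read_chunks` could then be handed to a message parser on its own, which keeps peak memory proportional to the largest chunk instead of the whole stream.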

I've actually seen cases where people were happy sending messages larger than 1GB, so... it "works".

On a side note, Cap'n Proto has a very similar design to Protobuf but can support messages up to 2^64 bytes (2^32 segments of 4GB each), and it actually does allow you to read one field from the message without parsing the whole message (if it's in a file on disk, use mmap() to avoid reading the whole thing in).
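
The general idea of reading one field through mmap() can be illustrated with a minimal Python sketch. It does not use Cap'n Proto or its wire format: the fixed two-field record layout and the function names are assumptions made up for this example, chosen only to show that a value at a known offset can be read without loading the whole file.

```python
import mmap
import struct

# Hypothetical fixed layout: two little-endian uint64 fields per record.
RECORD = struct.Struct("<QQ")


def write_records(path, records):
    """Write (first, second) pairs back to back in the fixed layout."""
    with open(path, "wb") as f:
        for first, second in records:
            f.write(RECORD.pack(first, second))


def read_second_field(path, index):
    """Read the second field of record `index` via a read-only memory map."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Skip to the record, then past its first 8-byte field.
            offset = index * RECORD.size + 8
            (value,) = struct.unpack_from("<Q", mm, offset)
            return value
```

Because the file is memory-mapped, the operating system only pages in the region that is actually touched, which is the property the answer is pointing at.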

(Disclosure: I'm the author of Cap'n Proto as well as most of Google's open source Protobuf code.)

Kenton Varda answered Oct 07 '22 16:10


  1. I don't think the protobuf compiler will ever complain about message sizes. At least not until you get to the 18 exabyte maximum of uint64_t.

  2. For most implementations, performance starts to suffer at the point where the message cannot fit into RAM at once. So 10 MB should be fine, but 10 GB would not be. Another possible issue arises if you don't need all of the data: protobuf does not support random access, so you need to decode the whole message even if you only need a part of it.

jpa answered Oct 07 '22 16:10