How much memory is used by "missing optional"/"empty repeated" fields in ProtoBuf?

I'm trying to design my first file format in ProtoBuf, and I'm not sure what is the best choice in some cases, because the memory/stream layout is not totally clear to me.

So I have in fact several questions, but all closely related:

1) What does an optional field cost, when it is omitted?

I think it should only cost one bit, since a bit-field can be used to flag present/absent fields, but I don't know for sure. They might instead use a whole byte per optional field.

2) What does a repeated field cost when it is empty? Is it also one bit, like the optional field, or is it "field header" + one (varint) byte to say it is size 0?

3) Since "bytes" implicitly has a size, is there actually a size difference between a missing optional bytes field, and an empty required bytes field?

[EDIT] By "memory" I meant space used on the file-system or network bandwidth; I did not mean RAM, since this would be programming-language-dependent.

Sebastien Diot asked Dec 27 '11 22:12


1 Answer

1: nothing whatsoever - it is omitted completely on the wire

2: nothing whatsoever - only actual contents are included; an empty list is essentially omitted (possible exception: empty "packed" arrays; although even that could legitimately be omitted)

3: omitted costs nothing; present and zero-length costs at least 2 bytes - one field header (a varint whose length depends on the field number; field numbers 1 through 15 take 1 byte, since the low 3 bits of the header hold the wire type), and one length of zero (one byte)

Additional note: protobuf never uses sub-byte packing, so any field always uses an entire number of bytes.
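To make the byte counts above concrete, here is a minimal hand-rolled sketch of the wire encoding for a length-delimited ("bytes") field. It does not use any protobuf library; the function names (encode_varint, encode_bytes_field, encode_optional_bytes, encode_repeated_bytes) are made up for illustration, and the sketch only covers non-packed bytes fields.

```python
from typing import Iterable, Optional

WIRE_TYPE_LEN = 2  # wire type for length-delimited fields (bytes, string, ...)


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_bytes_field(field_number: int, payload: bytes) -> bytes:
    """Field header (field number + wire type), then length, then payload."""
    header = encode_varint((field_number << 3) | WIRE_TYPE_LEN)
    return header + encode_varint(len(payload)) + payload


def encode_optional_bytes(field_number: int, payload: Optional[bytes]) -> bytes:
    """An absent optional field contributes no bytes at all."""
    if payload is None:
        return b""
    return encode_bytes_field(field_number, payload)


def encode_repeated_bytes(field_number: int, items: Iterable[bytes]) -> bytes:
    """Each element is written as its own field; an empty list writes nothing."""
    return b"".join(encode_bytes_field(field_number, item) for item in items)


if __name__ == "__main__":
    print(len(encode_optional_bytes(1, None)))   # absent optional field
    print(len(encode_repeated_bytes(1, [])))     # empty repeated field
    print(encode_bytes_field(1, b"").hex())      # present but zero-length
    print(len(encode_bytes_field(16, b"")))      # field number above 15
```

Under these assumptions, the absent optional field and the empty repeated field both produce zero bytes, a present zero-length bytes field with number 1 produces the two bytes 0a 00, and the same field with number 16 needs a two-byte header, so three bytes in total.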

(context: I've written a protobuf implementation from first principles, so the encoding details are fairly familiar to me)

Marc Gravell answered Oct 23 '22 13:10