
What's the preferred way to encode a "nullable" field in protobuf 2?

I am defining a ProtoBuf message where I want to have a "nullable" field -- i.e., I want to distinguish between the field having a value and not having a value. As a concrete example, let's say I have "x" and "y" fields to record the coordinates of some object. But in some cases, the coordinates are not known. The following definition will not work, because if x or y are unspecified, then they default to zero (which is a valid value):

message MyObject {
    optional float x = 1;
    optional float y = 2;
}

One option would be to add a boolean field recording whether the corresponding field's value is known or not. I.e.:

message MyObject {
    optional bool has_x = 1; // if false, then x is unknown.
    optional bool has_y = 2; // if false, then y is unknown.
    optional float x = 3; // should only be set if has_x==true.
    optional float y = 4; // should only be set if has_y==true.
}

But this imposes some extra book-keeping -- e.g., when I set the x field's value, I must always remember to also set has_x. Another option would be to use a list value, with the convention that the list always has either length 0 or length 1:

message MyObject {
    repeated float x = 1; // should be empty or have exactly 1 element.
    repeated float y = 2; // should be empty or have exactly 1 element.
}

But in this case, the definition seems a bit misleading, and the interface isn't much better.

Is there a third option that I haven't thought of that's better than these two? How have you dealt with storing nullable fields in protobuf?

Edward Loper asked Feb 07 '12 21:02



2 Answers

Protobuf 2 messages have a built-in notion of "nullable fields". The C++ interface contains methods has_xxx and clear_xxx to check if the field has been set and to unset the field, respectively.

This feature comes "for free" due to the way fields are encoded in message using "tags". An unset field is simply "not present" in the encoded message.
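To make the "not present" point concrete, here is a minimal hand-rolled sketch of how the two `float` fields of `MyObject` would look on the wire. It does not use the protobuf library; `encode_float_field`, `encode_my_object` and `decode_my_object` are hypothetical helper names, and the sketch assumes field numbers below 16 (so the tag fits in one byte) and only handles 32-bit fixed fields.

```python
import struct

FIXED32 = 5  # protobuf wire type used for `float` fields


def encode_float_field(field_number, value):
    """Encode one float field as a 1-byte tag plus 4 little-endian bytes.

    Assumption: field_number < 16, so the tag varint is a single byte.
    """
    tag = (field_number << 3) | FIXED32
    return bytes([tag]) + struct.pack("<f", value)


def encode_my_object(x=None, y=None):
    """Serialize MyObject; a field that was never set contributes no bytes."""
    out = b""
    if x is not None:
        out += encode_float_field(1, x)
    if y is not None:
        out += encode_float_field(2, y)
    return out


def decode_my_object(data):
    """Return {field_number: value} for the fields present in the bytes."""
    fields = {}
    pos = 0
    while pos < len(data):
        field_number, wire_type = data[pos] >> 3, data[pos] & 7
        if wire_type != FIXED32:
            raise ValueError("this sketch only understands float fields")
        (value,) = struct.unpack_from("<f", data, pos + 1)
        fields[field_number] = value
        pos += 5  # 1 tag byte + 4 payload bytes
    return fields


unknown = decode_my_object(encode_my_object())       # nothing set
at_zero = decode_my_object(encode_my_object(x=0.0))  # x explicitly set to 0
print(1 in unknown, 1 in at_zero)
```

An object with unknown coordinates serializes to zero bytes, while an object with `x` explicitly set to zero carries a tag for field 1. That difference is what a generated `has_x()` reports after parsing.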

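Proto 3 originally dropped this feature for scalar fields: a missing field simply reads back as its default value, so "unset" and "set to the default" look the same. Newer releases (protobuf 3.15 and later) let you mark a proto3 field as optional to get presence tracking back.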

JesperE answered Sep 30 '22 17:09


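Pick a sentinel value for each type (a "not a value", in the spirit of NaN) and declare it as the field's default, as shown below. The getter returns that default whenever nothing was specified for the field. This only works if the sentinel can never occur as a legitimate value.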

optional float x = 1 [default = -1];
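A rough sketch of how reading such a field behaves, again without the protobuf library. It assumes the decoded message is a plain dict of field number to value, and it swaps in NaN as the sentinel rather than the -1 above, since -1 could be a real coordinate; `UNKNOWN`, `read_x` and `is_known` are hypothetical names.

```python
import math

UNKNOWN = math.nan  # assumed sentinel; -1 would collide with a real coordinate


def read_x(decoded_fields, default=UNKNOWN):
    """Mimic a proto2 getter: the stored value, or the declared default if unset."""
    return decoded_fields.get(1, default)


def is_known(value):
    """NaN never equals itself, so test for it with math.isnan."""
    return not math.isnan(value)


print(is_known(read_x({})))         # field 1 absent: falls back to the sentinel
print(is_known(read_x({1: -1.0})))  # -1 stays usable as a real coordinate
```

The trade-off compared with has_xxx is that every reader has to know the sentinel convention, and integer types have no NaN to borrow.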

Aravind Yarram answered Sep 30 '22 18:09