
Protocol Buffers - Best practice for repeated boolean values

I need to transfer some data over a relatively slow connection (down to only 1 Kb/s). I have read that the encoding of Google's Protocol Buffers is efficient. That's true for most of my data, but not for boolean values, especially in a repeated field. The problem is that I have to transfer, besides other data, a fixed number (15) of boolean values every 50 milliseconds. Protobuf encodes each boolean value as one byte for the field ID and one byte for the value (0x00 or 0x01), which results in 30 bytes of data for 15 boolean values.

So I am looking for a better way to encode this. Has anybody else had this problem? What would be the best practice for encoding this efficiently?

My idea was to use an integer type (uint32) and encode the data manually, using one bit of the integer per bool. Any feedback on this idea?

Stefan asked Jan 14 '15 10:01


1 Answer

In Protobuf, your best bet is to use an integer bitfield. If you have more than 64 bits, use a bytes field (and pack the bits manually).
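As a rough illustration of the bitfield idea, here is a minimal Python sketch. The helper names (`pack_bools`, `unpack_bools`, `pack_bools_to_bytes`, `varint_size`) are made up for this example and are not part of any Protobuf API; the bit order (flag i stored in bit i) is an assumption that both ends would have to agree on. `varint_size` only estimates how many bytes a Protobuf varint would need, based on its 7 payload bits per byte.

```python
def pack_bools(flags):
    """Pack a sequence of booleans into one integer, flag i going into bit i."""
    value = 0
    for i, flag in enumerate(flags):
        if flag:
            value |= 1 << i
    return value


def unpack_bools(value, count):
    """Recover `count` booleans from an integer produced by pack_bools."""
    return [bool((value >> i) & 1) for i in range(count)]


def pack_bools_to_bytes(flags):
    """Variant for more than 64 flags: the payload for a `bytes` field."""
    return pack_bools(flags).to_bytes((len(flags) + 7) // 8, "little")


def varint_size(value):
    """Number of bytes a varint needs for a non-negative integer."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


flags = [True, False, True] + [False] * 11 + [True]  # 15 values
packed = pack_bools(flags)
assert unpack_bools(packed, len(flags)) == flags
```

With 15 flags the integer needs at most 3 varint bytes plus the field tag, instead of 30 bytes for 15 separately tagged booleans.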

Note that Cap'n Proto will pack boolean values (in both structs and lists) as individual bits, and so may be worth looking at.

However, if you are extremely bandwidth-constrained, it may be best to develop your own custom protocol. Most of these serialization frameworks trade off a little bit of space for ease of use (especially when it comes to dealing with version skew), but in your case it may be more important to focus solely on size. A custom message format that just contains some bits should be easy enough to maintain and can be packed as tightly as you want.
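To give a sense of what such a custom format could look like, here is a small sketch using Python's `struct` module. The layout is purely hypothetical: a one-byte message counter followed by a 16-bit little-endian word holding the 15 flags. The counter, the field order, and the names `encode_frame` and `decode_frame` are assumptions for illustration, not something the question specifies.

```python
import struct

FLAG_COUNT = 15
# Hypothetical layout: 1-byte counter, then a 16-bit little-endian flag word.
# "<" disables padding, so the frame is exactly 3 bytes.
FRAME = struct.Struct("<BH")


def encode_frame(counter, flags):
    """Build one fixed-size frame from a counter and exactly 15 booleans."""
    if len(flags) != FLAG_COUNT:
        raise ValueError(f"expected {FLAG_COUNT} flags, got {len(flags)}")
    bits = 0
    for i, flag in enumerate(flags):
        if flag:
            bits |= 1 << i
    return FRAME.pack(counter & 0xFF, bits)


def decode_frame(data):
    """Inverse of encode_frame; raises struct.error on a wrong-sized frame."""
    counter, bits = FRAME.unpack(data)
    return counter, [bool((bits >> i) & 1) for i in range(FLAG_COUNT)]


frame = encode_frame(7, [True] + [False] * 14)
assert len(frame) == FRAME.size
```

At one message every 50 ms (20 messages per second), a 3-byte frame like this costs 60 bytes per second for the flags, compared with 600 bytes per second for 30 bytes per message. The price is that any future change to the layout has to be versioned by hand.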

(Disclosure: I am the author of Cap'n Proto, as well as most of Google's open source Protobuf code.)

Kenton Varda answered Sep 21 '22 15:09