
Are there C++ equivalents for the Protocol Buffers delimited I/O functions in Java?

I'm trying to read / write multiple Protocol Buffers messages from files, in both C++ and Java. Google suggests writing length prefixes before the messages, but there's no way to do that by default (that I could see).

However, the Java API in version 2.1.0 received a set of "Delimited" I/O functions which apparently do that job:

    parseDelimitedFrom
    mergeDelimitedFrom
    writeDelimitedTo

Are there C++ equivalents? And if not, what's the wire format for the size prefixes the Java API attaches, so I can parse those messages in C++?


Update:

These now exist in google/protobuf/util/delimited_message_util.h as of v3.3.0.

asked Feb 26 '10 at 09:02 by tzaman



2 Answers

I'm a bit late to the party here, but the implementations below include some optimizations missing from the other answers and will not fail after 64MB of input (though they still enforce the 64MB limit on each individual message, just not on the whole stream).

(I am the author of the C++ and Java protobuf libraries, but I no longer work for Google. Sorry that this code never made it into the official lib. This is what it would look like if it had.)

    bool writeDelimitedTo(
        const google::protobuf::MessageLite& message,
        google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
      // We create a new coded stream for each message.  Don't worry, this is fast.
      google::protobuf::io::CodedOutputStream output(rawOutput);

      // Write the size.
      const int size = message.ByteSize();
      output.WriteVarint32(size);

      uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
      if (buffer != NULL) {
        // Optimization:  The message fits in one buffer, so use the faster
        // direct-to-array serialization path.
        message.SerializeWithCachedSizesToArray(buffer);
      } else {
        // Slightly-slower path when the message is multiple buffers.
        message.SerializeWithCachedSizes(&output);
        if (output.HadError()) return false;
      }

      return true;
    }

    bool readDelimitedFrom(
        google::protobuf::io::ZeroCopyInputStream* rawInput,
        google::protobuf::MessageLite* message) {
      // We create a new coded stream for each message.  Don't worry, this is fast,
      // and it makes sure the 64MB total size limit is imposed per-message rather
      // than on the whole stream.  (See the CodedInputStream interface for more
      // info on this limit.)
      google::protobuf::io::CodedInputStream input(rawInput);

      // Read the size.
      uint32_t size;
      if (!input.ReadVarint32(&size)) return false;

      // Tell the stream not to read beyond that size.
      google::protobuf::io::CodedInputStream::Limit limit =
          input.PushLimit(size);

      // Parse the message.
      if (!message->MergeFromCodedStream(&input)) return false;
      if (!input.ConsumedEntireMessage()) return false;

      // Release the limit.
      input.PopLimit(limit);

      return true;
    }
answered Sep 19 '22 at 06:09 by Kenton Varda


Okay, so I haven't been able to find top-level C++ functions implementing what I need, but some spelunking through the Java API reference turned up the following, inside the MessageLite interface:

    void writeDelimitedTo(OutputStream output)
    /*  Like writeTo(OutputStream), but writes the size of
        the message as a varint before writing the data.   */

So the Java size prefix is a (Protocol Buffers) varint!
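For anyone wondering what that prefix looks like on the wire, here is a minimal sketch of the varint encoding in plain C++, with no protobuf dependency. The function names `appendVarint32` and `readVarint32` are made up for illustration and are not part of any library; the layout (7 payload bits per byte, least-significant group first, high bit set on all but the last byte) is the standard Protocol Buffers varint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append `value` to `out` as a base-128 varint.
// Each byte carries 7 payload bits, least-significant group first; the
// high bit is set on every byte except the last one.
void appendVarint32(uint32_t value, std::vector<uint8_t>& out) {
  const std::size_t start = out.size();
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>(value | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
  assert(out.size() - start <= 5);  // a 32-bit value needs at most 5 bytes
}

// Hypothetical helper: decode a varint32 starting at data[pos].
// On success, stores the value, advances pos past the prefix, and returns
// true. Returns false (leaving pos untouched) if the input is truncated
// or the varint runs longer than 5 bytes.
bool readVarint32(const std::vector<uint8_t>& data, std::size_t& pos,
                  uint32_t& value) {
  uint32_t result = 0;
  std::size_t p = pos;
  for (int shift = 0; shift < 35; shift += 7) {
    if (p >= data.size()) return false;  // truncated input
    const uint8_t byte = data[p++];
    result |= static_cast<uint32_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      value = result;
      pos = p;
      return true;
    }
  }
  return false;  // more than 5 bytes: not a valid varint32
}
```

So a 300-byte message would be preceded by the two bytes 0xAC 0x02, and any message shorter than 128 bytes gets a single-byte prefix.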

Armed with that information, I went digging through the C++ API and found the CodedStream header, which has these:

    bool CodedInputStream::ReadVarint32(uint32 * value)
    void CodedOutputStream::WriteVarint32(uint32 value)

Using those, I should be able to roll my own C++ functions that do the job.
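As a rough illustration of the framing itself, here is a sketch that works on plain `std::iostream` with opaque byte strings standing in for serialized messages, so it runs without the protobuf library. Everything in it (`writeFrame`, `readFrame`, `ReadResult`, `kMaxMessageBytes`) is a hypothetical name of my own; with real messages you would serialize to a string first, or use `CodedInputStream` / `CodedOutputStream` instead. The size cap is an assumption modeled on the 64MB per-message limit mentioned in the other answer.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Assumed cap, modeled on the 64MB per-message limit (not a library constant).
constexpr uint32_t kMaxMessageBytes = 64u << 20;

// Write one frame: a varint32 length prefix followed by the payload bytes.
bool writeFrame(std::ostream& out, const std::string& payload) {
  if (payload.size() > kMaxMessageBytes) return false;
  uint32_t size = static_cast<uint32_t>(payload.size());
  while (size >= 0x80) {
    out.put(static_cast<char>(size | 0x80));
    size >>= 7;
  }
  out.put(static_cast<char>(size));
  out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
  return out.good();
}

enum class ReadResult { kOk, kCleanEof, kError };

// Read one frame into `payload`. kCleanEof means the stream ended exactly
// on a frame boundary; kError covers a truncated prefix, a truncated
// payload, an over-long prefix, or a size above the cap.
ReadResult readFrame(std::istream& in, std::string& payload) {
  uint32_t size = 0;
  bool done = false;
  for (int shift = 0; shift < 35 && !done; shift += 7) {
    const int c = in.get();
    if (c == std::istream::traits_type::eof()) {
      // EOF before the first prefix byte is a normal end of stream.
      return shift == 0 ? ReadResult::kCleanEof : ReadResult::kError;
    }
    size |= static_cast<uint32_t>(c & 0x7F) << shift;
    done = (c & 0x80) == 0;
  }
  if (!done || size > kMaxMessageBytes) return ReadResult::kError;

  payload.resize(size);
  if (size > 0 && !in.read(&payload[0], static_cast<std::streamsize>(size))) {
    return ReadResult::kError;  // stream ended in the middle of the payload
  }
  return ReadResult::kOk;
}
```

A reader would call `readFrame` in a loop until it returns `kCleanEof`. Telling a clean end of stream apart from a truncated frame is the main reason for returning a three-way result rather than a plain `bool`.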

They should really add this to the main Message API though; it's missing functionality considering Java has it, and so does Marc Gravell's excellent protobuf-net C# port (via SerializeWithLengthPrefix and DeserializeWithLengthPrefix).

answered Sep 19 '22 at 06:09 by tzaman