Suppose I have a proto with a bytes field:
    message MyProto {
      optional bytes data = 1;
    }
An API that I do not control gives me a pointer to source data and its size. I want to make a MyProto out of this data without deep copying. I thought this would be easy to do, but it appears to be impossible. Deep copying is easy with set_data. Protobuf provides a set_allocated_data function, but it takes a pointer to a std::string, which does not help me, since (unless I'm mistaken) there is no way to make a std::string without deep copying into it.
    void populateProto(void* data, size_t size, MyProto* message) {
      // Deep copy is fine, I guess.
      message->set_data(data, size);
      // Shallow copy would be better...
      // message->set_allocated_data( ??? );
    }
Is there any way to properly populate this proto (such that it can be serialized later) without deep copying the source data into the bytes field?
I'm aware that I could manually do the serializing right away, but I'd rather not, if possible.
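For context, serializing by hand here would mean writing the field's tag, a varint length, and then the payload. Below is a minimal sketch of that, assuming data is the only field set on the message. The names appendVarint and serializeMyProto are hypothetical helpers, not part of the protobuf library. Note that it still copies the payload once, into the output buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Appends `value` to `out` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
inline void appendVarint(std::uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Hypothetical helper: produces the wire bytes of a MyProto whose only
// set field is `data` (field number 1, type bytes).
inline std::string serializeMyProto(const void* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  // Tag = (field number << 3) | wire type; wire type 2 is length-delimited.
  const std::uint64_t kDataTag = (1u << 3) | 2u;
  std::string out;
  out.reserve(1 + 10 + size);  // tag + worst-case varint length + payload
  appendVarint(kDataTag, &out);
  appendVarint(static_cast<std::uint64_t>(size), &out);
  if (size != 0) {
    // The single copy: source bytes go straight into the output buffer.
    out.append(static_cast<const char*>(data), size);
  }
  return out;
}
```

Calling serializeMyProto(ptr, size) returns a string that a MyProto parser should accept, under the assumption above that no other fields are present.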
Great question. The options are:
1. UPDATE: StringPiece is obsolete according to an online developer discussion, which may render this option moot. If you can alter your .proto file, consider implementing the ctype field option for StringPiece, Google's equivalent of C++17 string_view. This is how Google would handle such a case internally. The FieldOptions message already has semantics for StringPiece, but Google has not yet open-sourced the implementation.
    message MyProto {
      bytes data = 1 [ctype = STRING_PIECE];
    }
2. Use a different protocol buffer implementation, perhaps only for this particular message type. protobuf-c and protobluff are C-language implementations that look promising.
3. Feed a buffer to your 3rd party API. I see from the comments that you can't, but I'm including it for completeness.
    ::std::string* buf = myProto->mutable_data();
    buf->resize(size);
    /* Storage is contiguous per the C++11 standard. Use &(*buf)[0] for a
     * writable pointer: data() returns const char* before C++17. */
    api(&(*buf)[0], size);
4. NON STANDARD: Break encapsulation by overwriting the data in a string instance. C++ has some gnarly features that give you enough rope to hang yourself. This option is not safe and depends on your std::string implementation and other factors.
    // NEVER USE THIS IN PRODUCTION
    #include <cassert>
    #include <cstdlib>
    #include <string>

    void string_jam(::std::string* target, void* buffer, size_t len) {
      /* On my system, the std::string layout is:
       *   0: size_t capacity
       *   8: size_t size
       *  16: char* data (iff the string is longer than 22 chars) */
      assert(target->size() > 22);  // string must already hold a heap buffer
      size_t* size_ptr = (size_t*)target;
      size_ptr[0] = len;  // Overwrite capacity
      size_ptr[1] = len;  // Overwrite length
      char** buf_ptr = (char**)(size_ptr + 2);
      free(*buf_ptr);            // Free the existing buffer
      *buf_ptr = (char*)buffer;  // Jam in our new buffer
      // From here on the string's destructor will try to release `buffer`.
    }
Note: Don't do this in production. This is useful for testing to measure the performance impact if you did go the zero-copy route.
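The buffer-feeding approach above (letting the API write directly into the string returned by mutable_data()) can be sketched end to end without protobuf. This is only an illustration: fillBuffer is a hypothetical stand-in for the third-party API, populateInPlace is a hypothetical helper, and a plain std::string plays the role of the bytes field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the third-party API: writes `size` bytes
// into a caller-supplied buffer.
inline void fillBuffer(void* dest, std::size_t size) {
  std::memset(dest, 0xAB, size);
}

// Lets the API write directly into `field`. In real code `field` would be
// the std::string* returned by message->mutable_data().
inline void populateInPlace(std::string* field, std::size_t size) {
  assert(field != nullptr);
  field->resize(size);  // one allocation; the new bytes are zero-filled
  if (size != 0) {
    // &(*field)[0] is a writable pointer to contiguous storage in C++11.
    fillBuffer(&(*field)[0], size);
  }
}
```

The only extra cost compared with a true zero-copy path is the zero-fill done by resize; the payload itself is written once.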
If you go with option #1, it would be great if you could release the source code, as many others would benefit from this capability. Best of luck.
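As a footnote to the StringPiece option: the appeal of a string_view-style field is that it only records a pointer and a length, whereas a std::string has to own a copy. Here is a small sketch of that difference, assuming a C++17 compiler. The helper names viewOf and copyOf are made up for illustration and are not protobuf APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Wraps an external buffer in a non-owning view: no allocation, no copy.
// The caller must keep the buffer alive for as long as the view is used.
inline std::string_view viewOf(const void* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  return std::string_view(static_cast<const char*>(data), size);
}

// Builds an owning std::string, which copies the bytes into its own storage.
inline std::string copyOf(const void* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  return std::string(static_cast<const char*>(data), size);
}
```

The view returned by viewOf points at the caller's original bytes, while the string returned by copyOf points at separate storage. That gap is what set_allocated_data cannot close, since it needs an owning std::string.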