The Python version of Google protobuf gives us only:
SerializeToString()
Whereas the C++ version gives us both:
SerializeToArray(...)
SerializeAsString()
Our C++ code writes the messages to a file in binary format, and we'd like to keep it that way. That said, is there a way of reading the binary data into Python and parsing it as if it were a string?
Is this the correct way of doing it?
binary = get_binary_data()
binary_size = get_binary_size()
# Copy the buffer into a string one byte at a time (Python 2, where binary is a str)
string = ''
for i in range(binary_size):
    string += binary[i]
message = MyMessage()
message.ParseFromString(string)
Here's a new example, and a problem:
message_length = 512
file = open('foobars.bin', 'rb')
eof = False
while not eof:
    data = file.read(message_length)
    eof = not data
    if not eof:
        foo_bar = FooBar()
        foo_bar.ParseFromString(data)
When we get to the foo_bar.ParseFromString(data) line, I get this error:
Exception Type: DecodeError
Exception Value: Too many bytes when decoding varint.
It turns out that the padding in the binary data was throwing protobuf off: each fixed-size record contained more bytes than the message itself, and the parser tried to decode the padding as if it were part of the message, which is what the error refers to.
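To illustrate why a run of 0xCC padding bytes produces this particular error, here is a minimal sketch of base-128 varint decoding. It is a hand-written stand-in, not protobuf's actual decoder; the function name decode_varint is hypothetical. Every 0xCC byte has its high (continuation) bit set, so the decoder keeps reading until it exceeds the 10 bytes a 64-bit varint can occupy.

```python
def decode_varint(buf, pos=0):
    """Decode one base-128 varint from buf, starting at index pos.

    Returns (value, next_pos). Raises ValueError if the varint is
    truncated or runs past 10 bytes, the most a 64-bit value needs.
    """
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("Truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not byte & 0x80:                # high bit clear: last byte
            return result, pos
        shift += 7
        if shift >= 70:                    # 10 bytes read, still no end
            raise ValueError("Too many bytes when decoding varint.")


# A well-formed two-byte varint decodes normally.
value, next_pos = decode_varint(bytes([0x96, 0x01]))

# Padding of 0xCC bytes never terminates, because 0xCC = 0b11001100.
padding = b'\xcc' * 16
try:
    decode_varint(padding)
    padding_error = None
except ValueError as exc:
    padding_error = str(exc)
```

Under these assumptions, once the parser has consumed the real message and moves on to the padding, it would presumably hit the same condition when reading the next field tag.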
This padding comes from using the C++ protobuf function SerializeToArray on a fixed-length buffer. To eliminate this, I have used this temporary code:
message_length = 512
file = open('foobars.bin', 'rb')
eof = False
while not eof:
    data = file.read(message_length)
    eof = not data
    string = ''
    for i in range(0, len(data)):
        byte = data[i]
        if byte != '\xcc': # yuck!
            string += data[i]
    if not eof:
        foo_bar = FooBar()
        foo_bar.ParseFromString(string)
I think there is a design flaw here. I will re-implement my C++ code so that it writes variable-length arrays to the binary file. As advised by the protobuf documentation, I will prefix each message with its binary size so that I know how much to read when I open the file with Python.
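The Python side of that length-prefixed layout could look roughly like the sketch below. It assumes each record is a 4-byte little-endian unsigned length followed by exactly that many message bytes; that prefix format is my assumption and must match whatever the C++ writer emits. The helper name read_length_prefixed is hypothetical, and the demo uses an in-memory stream in place of foobars.bin.

```python
import io
import struct


def read_length_prefixed(stream):
    """Yield each payload from a stream of [4-byte length][payload] records."""
    while True:
        header = stream.read(4)
        if not header:
            return                      # clean end of file
        if len(header) < 4:
            raise ValueError("Truncated length prefix")
        (size,) = struct.unpack('<I', header)   # assumed: little-endian uint32
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("Truncated message")
        yield payload


# Stand-in records for demonstration only. With real data you would
# open the file in 'rb' mode and, for each payload, call
# FooBar().ParseFromString(payload).
records = [b'\x08\x96\x01', b'', b'\x08\x01']
stream = io.BytesIO(b''.join(struct.pack('<I', len(r)) + r for r in records))
payloads = list(read_length_prefixed(stream))
```

Because each read consumes exactly the number of bytes the writer recorded, no padding reaches the parser and there is nothing to strip afterwards.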
I'm not an expert with Python, but you can pass the result of a file.read() operation into message.ParseFromString(...) without having to build a new string type or anything.
Python strings can contain any character, i.e. they are capable of holding "binary" data directly. There should be no need to convert from string to "binary".