
How to decode binary/raw google protobuf data

I have a core dump containing encoded protobuf data, and I want to decode this data and see its contents. I have the .proto file that defines the message. My proto file looks like this:

    $ cat my.proto
    message header {
      required uint32 u1 = 1;
      required uint32 u2 = 2;
      optional uint32 u3 = 3 [default=0];
      optional bool   b1 = 4 [default=true];
      optional string s1 = 5;
      optional uint32 u4 = 6;
      optional uint32 u5 = 7;
      optional string s2 = 9;
      optional string s3 = 10;
      optional uint32 u6 = 8;
    }

And protoc version:

    $ protoc --version
    libprotoc 2.3.0

I have tried the following:

  1. Dump the raw data from the core

    (gdb) dump memory b.bin 0x7fd70db7e964 0x7fd70db7e96d

  2. Pass it to protoc

    //proto file (my.proto) is in the current dir
    $ protoc --decode --proto_path=$pwd my.proto < b.bin
    Missing value for flag: --decode
    To decode an unknown message, use --decode_raw.

    $ protoc --decode_raw < /tmp/b.bin
    Failed to parse input.

Any thoughts on how to decode it? The documentation doesn’t explain much on how to go about it.

Edit: Data in binary format (10 bytes)

    (gdb) x/10xb 0x7fd70db7e964
    0x7fd70db7e964: 0x08    0xff    0xff    0x01    0x10    0x08    0x40    0xf7
    0x7fd70db7e96c: 0xd4    0x38

brokenfoot asked Jan 27 '16 22:01




1 Answer

You used --decode_raw correctly, but your input does not seem to be a protobuf.

For --decode, you need to specify the type name, like:

    protoc --decode header my.proto < b.bin

However, if --decode_raw reports a parse error, then --decode will too.

It would seem that the bytes you extracted via gdb are not a valid protobuf. Perhaps your addresses aren't exactly right: if you added or removed a byte at either end, it probably won't parse.

I note that according to the addresses you specified, the protobuf is only 9 bytes long, which is only enough space for three or four of the fields to be set. Is that what you are expecting? Perhaps you could post the bytes here.
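As a quick sanity check on the byte count, here is a minimal Python sketch. It assumes that gdb's `dump memory` writes `end - start` bytes (the end address is not included), which is what the 9-byte figure above implies, and it uses the ten bytes from the question's edit. The helper name `ends_mid_varint` is made up for illustration, and the check is only a heuristic: it works here because every field present is a varint.

```python
START = 0x7fd70db7e964
END = 0x7fd70db7e96d

# The ten bytes shown by `x/10xb` in the question.
data = bytes.fromhex("08ffff01100840f7d438")

# Assumption: `dump memory FILE START END` writes END - START bytes.
dumped = data[:END - START]


def ends_mid_varint(buf: bytes) -> bool:
    """Return True if the last byte has its continuation bit (0x80) set.

    The final byte of a protobuf varint always has the high bit clear,
    so a varint-only message that ends on a byte with the high bit set
    was cut off in the middle of a value.
    """
    return bool(buf) and bool(buf[-1] & 0x80)


print(END - START)              # how many bytes the dump contains
print(ends_mid_varint(dumped))  # the 9-byte dump
print(ends_mid_varint(data))    # the full 10 bytes
```

This prints `9`, `True`, `False`: the 9-byte dump stops on `0xd4`, one byte short of the end of the last value, which is consistent with `--decode_raw` rejecting it. Extending the end address by one byte (`0x7fd70db7e96e`) would capture all ten bytes.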

EDIT:

The 10 bytes you added to your question appear to decode successfully using --decode_raw:

    $ echo 08ffff01100840f7d438 | xxd -r -p | protoc --decode_raw
    1: 32767
    2: 8
    8: 928375

Cross-referencing the field numbers, we get:

    u1: 32767
    u2: 8
    u6: 928375

Kenton Varda answered Sep 21 '22 12:09
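To see where those numbers come from, the same ten bytes can be decoded by hand. The following is a minimal Python sketch, not a replacement for `protoc`: it handles only wire type 0 (varint), which is all this particular message uses. The function names and the `FIELD_NAMES` table are illustrative; the table is copied from the field numbers in my.proto.

```python
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry data
        if not byte & 0x80:              # high bit clear: last byte
            return value, pos
        shift += 7


def decode_varint_fields(buf: bytes) -> List[Tuple[int, int]]:
    """Decode a message made up only of varint fields (wire type 0)."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        value, pos = read_varint(buf, pos)
        fields.append((field_number, value))
    return fields


# Field numbers and names taken from my.proto in the question.
FIELD_NAMES = {1: "u1", 2: "u2", 3: "u3", 4: "b1", 5: "s1",
               6: "u4", 7: "u5", 8: "u6", 9: "s2", 10: "s3"}

data = bytes.fromhex("08ffff01100840f7d438")
for number, value in decode_varint_fields(data):
    print(f"{FIELD_NAMES.get(number, number)}: {value}")
```

Each key byte splits into a field number (upper bits) and a wire type (lower three bits): `0x08` is field 1, `0x10` is field 2, and `0x40` is field 8. Passing only the first nine bytes (`data[:9]`) raises `ValueError("truncated varint")`, which mirrors the "Failed to parse input." message from the 9-byte dump.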