<p>I'd like to find a way to convert a binary protobuf message into a human-readable description of the contained data, without using the .proto files.</p>
<p>The background is that I have a protobuf message that is being rejected by the parser on Android, but it's not entirely clear why. I could go through the message by hand, but that's rather tedious.</p>
<p>I tried <code>protoc --decode_raw</code>, but it just gives the error "Failed to parse input.". I googled, hoping someone had written a nice web utility for this, but haven't found anything obvious.</p>
<p>I'm just hoping to get some output like:</p>
<pre class="prettyprint"><code>field 1: varint: 128
field 4: string: "foo"
</code></pre>
<p>Any pointers in the right direction would be most welcome!</p>

<p>For posterity: Google's protocol buffer tools have the ability to decode raw buffers.</p>
<p>Just send the unknown buffer to <code>protoc</code> on standard input and pass the <code>--decode_raw</code> flag:</p>
<pre class="prettyprint"><code>$ protoc --decode_raw < has_no_proto.buff
2 {
  2: "Error retrieving information from server. [RH-02]"
}
</code></pre>
<p>So here's a message with field 2 set to an embedded message, which in turn has its second field set to a string telling me I pissed off Google Play.</p>
<p>Type information isn't definite: it looks like it will try to display all binary data as strings. Still, your requirement for a varint/string/submessage distinction is met.</p>
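<p>To make the varint/string distinction above concrete, here is a minimal Python sketch of a schema-less decoder for the protobuf wire format. It is not how <code>protoc</code> is implemented. The function names <code>read_varint</code>, <code>decode_raw</code> and <code>describe</code> are made up for this illustration, and it handles only one level (it does not guess at nested messages or support the deprecated group wire types).</p>

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear: last byte of the varint
            return result, pos
        shift += 7
        if shift >= 70:              # more than 10 bytes cannot be a 64-bit varint
            raise ValueError("varint too long")


def take(buf, pos, count):
    """Return (chunk, next_pos) for count bytes at pos, or fail if truncated."""
    if pos + count > len(buf):
        raise ValueError("truncated field")
    return bytes(buf[pos:pos + count]), pos + count


def decode_raw(buf):
    """Return a list of (field_number, kind, value) for one message level."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
            fields.append((number, "varint", value))
        elif wire_type == 1:
            chunk, pos = take(buf, pos, 8)
            fields.append((number, "fixed64", int.from_bytes(chunk, "little")))
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            chunk, pos = take(buf, pos, length)
            fields.append((number, "length-delimited", chunk))
        elif wire_type == 5:
            chunk, pos = take(buf, pos, 4)
            fields.append((number, "fixed32", int.from_bytes(chunk, "little")))
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return fields


def describe(buf):
    """Render fields in the 'field N: type: value' style from the question."""
    lines = []
    for number, kind, value in decode_raw(buf):
        if kind == "length-delimited":
            try:
                lines.append(f'field {number}: string: "{value.decode("utf-8")}"')
            except UnicodeDecodeError:
                lines.append(f"field {number}: bytes: {value.hex()}")
        else:
            lines.append(f"field {number}: {kind}: {value}")
    return lines


if __name__ == "__main__":
    # Hand-built sample: field 1 = varint 128, field 4 = "foo".
    sample = b"\x08\x80\x01" + b"\x22\x03" + b"foo"
    print("\n".join(describe(sample)))
```

<p>A length-delimited field may hold a string, raw bytes, or an embedded message. Without the schema the decoder can only guess, which is why the sketch falls back to hex when the payload is not valid UTF-8. To look inside a suspected submessage, call <code>decode_raw</code> again on that field's bytes.</p>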

<p>As noted in Michel de Ruiter's answer, it's possible that your protobuf message has a length prefix. Assuming it does, this answer should help.</p>
<p>(NOTE: For most of the commands below, I'm assuming your protobuf message is stored in a file called <code>input</code>.)</p>
<h3> <code>protoc --decode_raw</code> + <code>dd</code> for a single message:</h3>
<p>If it's simply a single message, then you can indeed leverage <code>protoc --decode_raw</code>, but you need to strip off the length-prefix header first. Assuming the header is 4 bytes long, you can use <code>dd</code> to strip the header off of <code>input</code> and then feed the output into <code>protoc</code>.</p>
<pre class="prettyprint"><code>dd bs=1 skip=4 if=input 2>/dev/null | protoc --decode_raw
</code></pre>
<h3> <code>protoc-decode-lenprefix --decode_raw</code> for a single message:</h3>
<p>I also wrote a script that handles the header stripping automatically:</p>
<pre class="prettyprint"><code>protoc-decode-lenprefix --decode_raw < input
</code></pre>
<p>This script is simply a wrapper on top of <code>protoc --decode_raw</code>: it parses out the length-prefix header and then invokes <code>protoc</code>.</p>
<p>Now, this script isn't terribly useful in this case, because we can just use the <code>dd</code> trick above to strip the header off. However, say we have a data stream (e.g., a file or TCP stream) containing multiple messages that are framed with length-prefix headers...</p>
<h3> <code>protoc-decode-lenprefix --decode_raw</code> for a stream of messages:</h3>
<p>Instead of a single protobuf message in the input file, let's say <code>input</code> contains multiple protobuf messages, each framed by a length-prefix header. In this case it's not possible to <em>just</em> use the <code>dd</code> trick, because you need to actually read the contents of each length-prefix header to determine how long the subsequent message is, and thus how many bytes ahead the next header+message lies. So instead of worrying about all of that, you can simply use <code>protoc-decode-lenprefix</code> again:</p>
<pre class="prettyprint"><code>protoc-decode-lenprefix --decode_raw < input
</code></pre>
<h3> <code>protoc-decode-lenprefix --decode ... foo.proto</code> for a stream of messages</h3>
<p>This script can also be used to fully decode length-prefixed messages (instead of just "raw decoding" them). It assumes you have access to the <code>.proto</code> files that define the protobuf message, just like the wrapped <code>protoc</code> command. The invocation syntax is identical to <code>protoc --decode</code>. For example, using the <code>dd</code> trick with <code>protoc --decode</code>, with a Mesos <code>task.info</code> file as input, the syntax looks like this:</p>
<pre class="prettyprint"><code>dd bs=1 skip=4 if=task.info 2>/dev/null | \
    protoc --decode mesos.internal.Task \
           -I MESOS_CODE/src -I MESOS_CODE/include \
           MESOS_CODE/src/messages/messages.proto
</code></pre>
<p>And the parameters are identical when using <code>protoc-decode-lenprefix</code>:</p>
<pre class="prettyprint"><code>cat task.info | \
    protoc-decode-lenprefix --decode mesos.internal.Task \
           -I MESOS_CODE/src -I MESOS_CODE/include \
           MESOS_CODE/src/messages/messages.proto
</code></pre>
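<p>The framing logic described above (read a header, read that many bytes, repeat) can be sketched in Python as follows. This is only an illustration, not the source of <code>protoc-decode-lenprefix</code>. The names <code>split_length_prefixed</code> and <code>decode_stream_raw</code> are hypothetical, the 4-byte header size comes from the assumption stated earlier, and the little-endian byte order is a guess that you should check against whatever wrote your file.</p>

```python
import io
import subprocess


def split_length_prefixed(stream, header_size=4, byteorder="little"):
    """Yield each message payload from a stream of header+message frames.

    `stream` is assumed to be a buffered binary file object, so read(n)
    returns fewer than n bytes only at end of input.
    """
    while True:
        header = stream.read(header_size)
        if not header:
            return                       # clean end of stream
        if len(header) < header_size:
            raise ValueError("truncated length header")
        length = int.from_bytes(header, byteorder)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated message")
        yield payload


def decode_stream_raw(stream):
    """Pipe every framed message through `protoc --decode_raw` (must be on PATH)."""
    for payload in split_length_prefixed(stream):
        result = subprocess.run(
            ["protoc", "--decode_raw"],
            input=payload,
            capture_output=True,
            check=True,
        )
        yield result.stdout.decode("utf-8")


if __name__ == "__main__":
    # Two hand-built frames; the payloads here are placeholders, not protobufs.
    frames = (3).to_bytes(4, "little") + b"abc" + (2).to_bytes(4, "little") + b"hi"
    for message in split_length_prefixed(io.BytesIO(frames)):
        print(len(message), message)
```

<p>If the decoded lengths look absurdly large, the header is probably big-endian or a different size, so try <code>byteorder="big"</code> or another <code>header_size</code> before concluding the data is corrupt.</p>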

raw decoder for protobufs format

Tags:

protocol-buffers


824

asked Sep 08 '11 06:09

JosephH

2 Answers


146

answered Sep 27 '22 21:09

anq


answered Sep 27 '22 19:09

erik.weathers
