
Google Protocol Buffers C++ implementation stability and security in the face of malicious data

For those who have used the Google Protocol Buffers C++ implementation: how does it deal with malicious or malformed messages? Does it crash, or does it continue to operate? My app will certainly receive malicious data at some point, and I don't want it to crash every time a malformed message is received. This is the only answer I could find on the issue (from the Google mailing list):

There was a review specifically for security issues before the code was released. For at least the C++ and Java implementations, there are various safeguards to protect against corrupt or malicious data. There are limits on the overall message size provided by the protobuf library as well (CodedInputStream::SetTotalBytesLimit); it also provides a recursion limit to prevent deeply nested messages from blowing the stack. There are other internal implementation details to avoid things like memory exhaustion (most specifically from receiving messages that indicate a huge length-delimited value).

Charles asked Jan 12 '15 16:01



1 Answer

I use the C++ Google Protocol Buffers implementation in a very security-conscious, web-facing application.

In the generated code, all deserialisation work is done by each message's <Message-Type>::MergePartialFromCodedStream method. These methods are generated with comprehensive checks on data types and lengths, and we've had no problems so far.

One avenue of attack you might want to close yourself is the framing of protobuf data. Protocol buffers do not write the overall size of the serialised message to the stream in any kind of standardised header, so you may want to (as I do) wrap every protocol buffer message in a frame. For my purposes the frame header simply contains the message size, which lets me determine the memory requirements of a message before attempting to read it off the wire, let alone decode it.

A simple check could be made at this point to reject messages (or drop the connection) if the size is unfeasibly large.
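As a rough illustration of that size check, here is a minimal sketch that uses only the standard library. The frame layout (a 4-byte big-endian length followed by the payload), the 1 MiB cap, and the names inspect_frame, FrameStatus and kMaxMessageBytes are all assumptions made for this example; they are not part of protobuf or of my actual wire format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed upper bound on an acceptable payload; tune this for your protocol.
constexpr std::uint32_t kMaxMessageBytes = 1u << 20;  // 1 MiB

// Assumed frame layout: 4-byte big-endian payload length, then the payload.
constexpr std::size_t kHeaderBytes = 4;

enum class FrameStatus { kOk, kNeedMoreData, kTooLarge };

struct FrameResult {
    FrameStatus status;
    std::uint32_t payload_size;  // meaningful only when status == kOk
};

// Inspects the bytes received so far. Nothing is allocated based on the
// client-supplied length, so a hostile header cannot force a huge buffer.
FrameResult inspect_frame(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < kHeaderBytes) {
        return {FrameStatus::kNeedMoreData, 0};
    }
    const std::uint32_t size =
        (static_cast<std::uint32_t>(buf[0]) << 24) |
        (static_cast<std::uint32_t>(buf[1]) << 16) |
        (static_cast<std::uint32_t>(buf[2]) << 8) |
        static_cast<std::uint32_t>(buf[3]);
    if (size > kMaxMessageBytes) {
        // Reject the message (or drop the connection) before reading further.
        return {FrameStatus::kTooLarge, 0};
    }
    if (buf.size() - kHeaderBytes < size) {
        return {FrameStatus::kNeedMoreData, 0};
    }
    return {FrameStatus::kOk, size};
}
```

Once a frame comes back as kOk, the payload bytes after the header could be handed to the generated message's ParseFromArray, whose boolean return value should still be checked, since a correctly framed payload can be malformed protobuf.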

Further work can be done to wrap this frame in a public key enveloping scheme in order to prevent man-in-the-middle hijacking of your session if that is a concern.

Buffer overruns within a message (for example a string getting too long) cannot happen because bytes and string fields are internally represented by std::string, which automatically grows its memory footprint as data is appended to it.

However:

There is no guarantee that malicious clients will not encode valid messages that contain invalid data. For example, if your server application takes a method name from a data string, looks up its address and calls it, then this is an obvious vector for attack.

You should never allow client data to find server code without a comprehensive check that the operation is specifically allowed.

Some examples of this that one must never do:

  1. allow the client to send you SQL in a text field
  2. allow the client to send you command-lines which you subsequently pass to system(), exec(), spawn() etc...
  3. allow the client to send you the name of a shared library and a function name within it...

and so on.
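One way to apply the "specifically allowed" rule is to map client-supplied names onto a fixed table of handlers and refuse anything else. The sketch below is only an illustration: the operation names, handle_ping, handle_echo and dispatch are hypothetical and do not come from any real application.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

using Handler = std::function<std::string(const std::string&)>;

// Hypothetical handlers standing in for real server operations.
std::string handle_ping(const std::string&) { return "pong"; }
std::string handle_echo(const std::string& arg) { return arg; }

// The complete set of operations a client may request. The client's string
// is only ever used as a lookup key in this table; it is never resolved to
// a symbol, a library name or a command line.
const std::unordered_map<std::string, Handler>& allowed_operations() {
    static const std::unordered_map<std::string, Handler> table = {
        {"ping", handle_ping},
        {"echo", handle_echo},
    };
    return table;
}

// Returns false, leaving `out` untouched, when the operation is not listed.
bool dispatch(const std::string& op, const std::string& arg, std::string& out) {
    const auto& table = allowed_operations();
    const auto it = table.find(op);
    if (it == table.end()) {
        return false;
    }
    out = it->second(arg);
    return true;
}
```

With this shape, a request naming "system" or a shared library simply fails the lookup. The arguments still need their own validation before a handler uses them.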

Richard Hodges answered Nov 05 '22 12:11