Protobuf-net is incompatible with official google Protobuf for C++ (message encoding)

We have a lot of classes in .NET. We marked them up with protobuf-net and generated .proto files from them, which the C++ side consumes via Google's official library.

Here is a message, printed with C++ DebugString() on an EventBase instance. (In .NET, EventCharacterMoved inherits from EventBase; in C++ I just set the corresponding optional field.)

UserId: -2792
EventCharacterMoved {
  Coordinates {
    Position {
      X: 196.41913
      Y: 130
      Z: 213
    }
    Rotation {
      X: 207
      Y: 130
      Z: 213
    }
  }
  OldCoordinates {
    Position {
      X: 196.41913
      Y: 130
      Z: 213
    }
    Rotation {
      X: 207
      Y: 130
      Z: 213
    }
  }
}

(It comes from this .proto file.)

message Coordinates {
   optional TreeFloat Position = 1;
   optional TreeFloat Rotation = 2;
}
message EventBase {
   optional int32 UserId = 10 [default = 0];
   // the following represent sub-types; at most 1 should have a value
   optional EventCharacterMoved EventCharacterMoved = 15;
}
message EventCharacterMoved {
   optional Coordinates Coordinates = 100;
   optional Coordinates OldCoordinates = 101;
}
message TreeFloat {
   optional float X = 1 [default = 0];
   optional float Y = 2 [default = 0];
   optional float Z = 3 [default = 0];
}

I send this message from C++, and we send the same message contents from .NET.

The C++ code can parse both the C++-encoded message and the .NET-encoded one. The .NET code can only parse the .NET-encoded message.

Over the wire we get 87 bytes in both cases (same size from the .NET file and the C++ file), yet the contents differ:

[image: the two 87-byte payloads compared]

As you can see, they are similar but not the same. Because of this difference, the C++ code can read the .NET messages, while .NET cannot read the C++ messages.

In code on deserialization we get:

An unhandled exception of type 'System.InvalidCastException' occurred in TestProto.exe

Additional information: Unable to cast object of type 'TestProto.EventBase' to type 'TestProto.EventCharacterMoved'.

in code like:

using (var inputStream = File.Open(@"./cpp_in.bin", FileMode.Open, FileAccess.Read)) {
    var ecm = Serializer.Deserialize<EventCharacterMoved>(inputStream);
}

Let's look at the output of protoc's --decode_raw option (as suggested by jpa in his comment):

[image: protoc --decode_raw output]

This may be related to the fact that my C++ wrapper uses the latest Google protobuf version, while protobuf-net perhaps uses some older encoding format.

So I wonder how to make protobuf-net read the C++ messages (so that both sides can decode the same data)?

Or at least, how can I make the original Google protobuf encode the same way protobuf-net does?

For those who want to dig into it, there is a zipped bundle with a simplified example (VS 2010 solutions for the C++ and C# code included).

myWallJSON asked Dec 24 '12 10:12

2 Answers

Edit: this should be fixed in r616 and above.


I've finally had a chance to look at this (apologies for the delay, but seasonal holiday demands intervened). I understand what is happening now.

Basically, the data is theoretically identical; what this actually comes down to is field ordering. Technically, fields are usually written in ascending order, but can be expected in any order. With regard to protobuf-net: for types that don't involve inheritance it will work fine regardless of order. The protobuf specification does not define inheritance, so protobuf-net adds support for that (due to constant demand) on top of the specification. As an implementation detail, it writes the sub-class information first (i.e. field 15, the sub-type, is written ahead of field 10). At the current time, during deserialization it also expects the sub-type information first. This has rarely impacted anyone: since protobuf-net is the only implementation that uses inheritance like this, the inheritance feature is mostly only seen in protobuf-net to protobuf-net usage.

In your case, you're using .proto to interop with CPP, which means the CPP code will be able to consume the protobuf-net data, but protobuf-net may throw a type-cast exception going the other way (basically, it starts constructing the concrete type at the time it gets the first data field).
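To make the ordering point concrete, here is a minimal C++ sketch that hand-encodes the question's message without any protobuf library. The helper names (put_varint, event_base, top_level_fields and so on) are made up for illustration, and the values are copied from the DebugString() output in the question. It builds the same EventBase twice, once with field 10 first and once with field 15 first, and can list the top-level field numbers in wire order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using Bytes = std::string;  // raw wire bytes

// Append an unsigned base-128 varint.
void put_varint(Bytes& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// Append a field key: (field_number << 3) | wire_type.
void put_key(Bytes& out, std::uint32_t field, std::uint32_t wire_type) {
    put_varint(out, (static_cast<std::uint64_t>(field) << 3) | wire_type);
}

// int32 field (wire type 0); negative values are sign-extended to 64 bits.
void put_int32(Bytes& out, std::uint32_t field, std::int32_t v) {
    put_key(out, field, 0);
    put_varint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(v)));
}

// float field (wire type 5): 4 bytes, little-endian.
void put_float(Bytes& out, std::uint32_t field, float f) {
    static_assert(sizeof(float) == 4, "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    put_key(out, field, 5);
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<char>((bits >> (8 * i)) & 0xFF));
}

// Embedded message field (wire type 2): key, length, body.
void put_message(Bytes& out, std::uint32_t field, const Bytes& body) {
    put_key(out, field, 2);
    put_varint(out, body.size());
    out += body;
}

Bytes tree_float(float x, float y, float z) {
    Bytes out;
    put_float(out, 1, x);
    put_float(out, 2, y);
    put_float(out, 3, z);
    return out;
}

Bytes coordinates() {
    Bytes out;
    put_message(out, 1, tree_float(196.41913f, 130.0f, 213.0f));  // Position
    put_message(out, 2, tree_float(207.0f, 130.0f, 213.0f));      // Rotation
    return out;
}

Bytes event_character_moved() {
    Bytes out;
    put_message(out, 100, coordinates());  // Coordinates
    put_message(out, 101, coordinates());  // OldCoordinates
    return out;
}

// EventBase with either the sub-type (15) or UserId (10) written first.
Bytes event_base(bool subtype_first) {
    Bytes user_id, subtype;
    put_int32(user_id, 10, -2792);
    put_message(subtype, 15, event_character_moved());
    return subtype_first ? subtype + user_id : user_id + subtype;
}

// Read one varint starting at pos, advancing pos past it.
std::uint64_t get_varint(const Bytes& in, std::size_t& pos) {
    std::uint64_t v = 0;
    for (int shift = 0;; shift += 7) {
        assert(pos < in.size());
        const std::uint8_t b = static_cast<std::uint8_t>(in[pos++]);
        v |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if (!(b & 0x80)) return v;
    }
}

// Field numbers of the top-level fields, in the order they appear on the wire.
std::vector<std::uint32_t> top_level_fields(const Bytes& msg) {
    std::vector<std::uint32_t> fields;
    std::size_t pos = 0;
    while (pos < msg.size()) {
        const std::uint64_t key = get_varint(msg, pos);
        fields.push_back(static_cast<std::uint32_t>(key >> 3));
        switch (key & 7) {
            case 0: get_varint(msg, pos); break;
            case 1: pos += 8; break;
            case 2: { const std::uint64_t len = get_varint(msg, pos); pos += len; break; }
            case 5: pos += 4; break;
            default: assert(false && "unsupported wire type");
        }
        assert(pos <= msg.size());
    }
    return fields;
}
```

Both event_base(true) and event_base(false) come out at 87 bytes, the size reported in the question; top_level_fields gives 10, 15 for one and 15, 10 for the other. Same data, different order, which is exactly the case protobuf-net tripped over.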

Despite rarely being an issue, this is something that needs fixing. I can try to look at this later today or tomorrow.

Options:

  • make sure the sub-type fields are always lower than any data fields
  • if you know it is expecting the sub-type, use the Merge API and pass in an existing new object of the desired type - this will then populate the existing object correctly
  • wait a day or two (hopefully!) and use build r616 or above for a proper fix
  • avoid inheritance (and other implementation-specific features) when using interop
    • note you can model the same data without inheritance, via encapsulation - and it will work happily; it is specifically the creation of the concrete type that is the issue here
  • go to unreasonable lengths (meaning: I don't consider this an actual solution) when constructing the data on the CPP side, by writing it in two pieces:
    • write an EventBase with just the EventCharacterMoved data first, and serialize; now in a separate model write an EventBase with just the remaining base-class data (here UserId), and serialize; this will simulate writing them in the required order (protobuf streams are appendable) - not pretty
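The last option relies on protobuf streams being appendable. A rough C++ sketch of the idea, hand-encoding the bytes rather than using generated classes: the helper names are hypothetical, `body` stands in for an already-serialized EventCharacterMoved, and the second fragment is assumed to carry only the base-class field UserId (field 10 in the question's .proto).

```cpp
#include <cstdint>
#include <string>

// Append an unsigned base-128 varint to out.
void append_varint(std::string& out, std::uint64_t v) {
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// First fragment: an EventBase holding only the sub-type field
// (field 15, wire type 2). `body` is a placeholder for the serialized
// EventCharacterMoved bytes.
std::string subtype_fragment(const std::string& body) {
    std::string out;
    append_varint(out, (15u << 3) | 2u);
    append_varint(out, body.size());
    return out + body;
}

// Second fragment: an EventBase holding only UserId (field 10, wire type 0).
// Non-negative ids only, to keep the sketch short.
std::string user_id_fragment(std::uint32_t user_id) {
    std::string out;
    append_varint(out, (10u << 3) | 0u);
    append_varint(out, user_id);
    return out;
}

// Concatenating the fragments yields one EventBase stream in which the
// sub-type field precedes the base-class field.
std::string subtype_first_stream(const std::string& body, std::uint32_t user_id) {
    return subtype_fragment(body) + user_id_fragment(user_id);
}
```

With the generated C++ classes, the same effect should be achievable by serializing the first EventBase with SerializeToString and the second with AppendToString into the same buffer. That wiring is an assumption on my part, not something tested against the question's project.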
Marc Gravell answered Oct 28 '22 11:10



This looks pretty similar to the problems noted in http://code.google.com/p/protobuf-net/issues/detail?id=299 and http://code.google.com/p/protobuf-net/issues/detail?id=331, which were allegedly fixed by http://code.google.com/p/protobuf-net/source/detail?r=595

Is the version of protobuf-net you're using new enough to include that fix?

rici answered Oct 28 '22 11:10
