Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using Protobuf-net, I suddenly got an exception about an unknown wire-type

First thing to check:

IS THE INPUT DATA PROTOBUF DATA? If you try and parse another format (json, xml, csv, binary-formatter), or simply broken data (an "internal server error" html placeholder text page, for example), then it won't work.


What is a wire-type?

It is a 3-bit flag that tells it (in broad terms; it is only 3 bits after all) what the next data looks like.

Each field in protocol buffers is prefixed by a header that tells it which field (number) it represents, and what type of data is coming next; this "what type of data" is essential to support the case where unanticipated data is in the stream (for example, you've added fields to the data-type at one end), as it lets the serializer know how to read past that data (or store it for round-trip if required).

What are the different wire-type values and their description?

  • 0: variant-length integer (up to 64 bits) - base-128 encoded with the MSB indicating continuation (used as the default for integer types, including enums)
  • 1: 64-bit - 8 bytes of data (used for double, or electively for long/ulong)
  • 2: length-prefixed - first read an integer using variant-length encoding; this tells you how many bytes of data follow (used for strings, byte[], "packed" arrays, and as the default for child objects properties / lists)
  • 3: "start group" - an alternative mechanism for encoding child objects that uses start/end tags - largely deprecated by Google, it is more expensive to skip an entire child-object field since you can't just "seek" past an unexpected object
  • 4: "end group" - twinned with 3
  • 5: 32-bit - 4 bytes of data (used for float, or electively for int/uint and other small integer types)

I suspect a field is causing the problem, how to debug this?

Are you serializing to a file? The most likely cause (in my experience) is that you have overwritten an existing file, but have not truncated it; i.e. it was 200 bytes; you've re-written it, but with only 182 bytes. There are now 18 bytes of garbage on the end of your stream that is tripping it up. Files must be truncated when re-writing protocol buffers. You can do this with FileMode:

using(var file = new FileStream(path, FileMode.Truncate)) {
    // write
}

or alternatively by SetLength after writing your data:

file.SetLength(file.Position);

Other possible cause

You are (accidentally) deserializing a stream into a different type than what was serialized. It's worth double-checking both sides of the conversation to ensure this is not happening.


Since the stack trace references this StackOverflow question, I thought I'd point out that you can also receive this exception if you (accidentally) deserialize a stream into a different type than what was serialized. So it's worth double-checking both sides of the conversation to ensure this is not happening.


This can also be caused by an attempt to write more than one protobuf message to a single stream. The solution is to use SerializeWithLengthPrefix and DeserializeWithLengthPrefix.


Why this happens:

The protobuf specification supports a fairly small number of wire-types (the binary storage formats) and data-types (the .NET etc data-types). Additionally, this is not 1:1, nor is is 1:many or many:1 - a single wire-type can be used for multiple data-types, and a single data-type can be encoded via any of multiple wire-types. As a consequence, you cannot fully understand a protobuf fragment unless you already know the scema, so you know how to interpret each value. When you are, say, reading an Int32 data-type, the supported wire-types might be "varint", "fixed32" and "fixed64", where-as when reading a String data-type, the only supported wire-type is "string".

If there is no compatible map between the data-type and wire-type, then the data cannot be read, and this error is raised.

Now let's look at why this occurs in the scenario here:

[ProtoContract]
public class Data1
{
    [ProtoMember(1, IsRequired=true)]
    public int A { get; set; }
}

[ProtoContract]
public class Data2
{
    [ProtoMember(1, IsRequired = true)]
    public string B { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var d1 = new Data1 { A = 1};
        var d2 = new Data2 { B = "Hello" };
        var ms = new MemoryStream();
        Serializer.Serialize(ms, d1); 
        Serializer.Serialize(ms, d2);
        ms.Position = 0;
        var d3 = Serializer.Deserialize<Data1>(ms); // This will fail
        var d4 = Serializer.Deserialize<Data2>(ms);
        Console.WriteLine("{0} {1}", d3, d4);
    }
}

In the above, two messages are written directly after each-other. The complication is: protobuf is an appendable format, with append meaning "merge". A protobuf message does not know its own length, so the default way of reading a message is: read until EOF. However, here we have appended two different types. If we read this back, it does not know when we have finished reading the first message, so it keeps reading. When it gets to data from the second message, we find ourselves reading a "string" wire-type, but we are still trying to populate a Data1 instance, for which member 1 is an Int32. There is no map between "string" and Int32, so it explodes.

The *WithLengthPrefix methods allow the serializer to know where each message finishes; so, if we serialize a Data1 and Data2 using the *WithLengthPrefix, then deserialize a Data1 and a Data2 using the *WithLengthPrefix methods, then it correctly splits the incoming data between the two instances, only reading the right value into the right object.

Additionally, when storing heterogeneous data like this, you might want to additionally assign (via *WithLengthPrefix) a different field-number to each class; this provides greater visibility of which type is being deserialized. There is also a method in Serializer.NonGeneric which can then be used to deserialize the data without needing to know in advance what we are deserializing:

// Data1 is "1", Data2 is "2"
Serializer.SerializeWithLengthPrefix(ms, d1, PrefixStyle.Base128, 1);
Serializer.SerializeWithLengthPrefix(ms, d2, PrefixStyle.Base128, 2);
ms.Position = 0;

var lookup = new Dictionary<int,Type> { {1, typeof(Data1)}, {2,typeof(Data2)}};
object obj;
while (Serializer.NonGeneric.TryDeserializeWithLengthPrefix(ms,
    PrefixStyle.Base128, fieldNum => lookup[fieldNum], out obj))
{
    Console.WriteLine(obj); // writes Data1 on the first iteration,
                            // and Data2 on the second iteration
}

Previous answers already explain the problem better than I can. I just want to add an even simpler way to reproduce the exception.

This error will also occur simply if the type of a serialized ProtoMember is different from the expected type during deserialization.

For instance if the client sends the following message:

public class DummyRequest
{
    [ProtoMember(1)]
    public int Foo{ get; set; }
}

But what the server deserializes the message into is the following class:

public class DummyRequest
{
    [ProtoMember(1)]
    public string Foo{ get; set; }
}

Then this will result in the for this case slightly misleading error message

ProtoBuf.ProtoException: Invalid wire-type; this usually means you have over-written a file without truncating or setting the length

It will even occur if the property name changed. Let's say the client sent the following instead:

public class DummyRequest
{
    [ProtoMember(1)]
    public int Bar{ get; set; }
}

This will still cause the server to deserialize the int Bar to string Foo which causes the same ProtoBuf.ProtoException.

I hope this helps somebody debugging their application.


Also check the obvious that all your subclasses have [ProtoContract] attribute. Sometimes you can miss it when you have rich DTO.


I've seen this issue when using the improper Encoding type to convert the bytes in and out of strings.

Need to use Encoding.Default and not Encoding.UTF8.

using (var ms = new MemoryStream())
{
    Serializer.Serialize(ms, obj);
    var bytes = ms.ToArray();
    str = Encoding.Default.GetString(bytes);
}