Protobuf-net object reference deserialization using Dictionary: A reference-tracked object changed reference during deserialization

I'm having some issues trying to serialize/deserialize a complex object graph using protobuf-net.

I'm working on a legacy application and we're using .NET Remoting to connect a GUI client to a C# service. We are seeing poor performance with overseas users due to the serialized size of our object graphs using the default BinaryFormatter, which is exacerbated by the limited bandwidth between the client and server (1 Mbit/s).

As a quick win, I thought I'd put together a proof of concept to see if there were any performance gains to be had by using protobuf-net instead, by implementing ISerializable. As I was testing I ran into an issue whereby object references weren't being maintained.

I've put together an example which repros the issue. I'm expecting the object in the Dictionary (Items[1]) and the object B.A to be the same instance, as I've specified AsReference=true in the ProtoMember attribute.

Using protobuf-net 2.0.0.619, I'm seeing an exception thrown when deserializing (A reference-tracked object changed reference during deserialization).

If this isn't a supported scenario then please let me know.

Test

[Test]
public void AreObjectReferencesSameAfterDeserialization()
{
    A a = new A();
    B b = new B();

    b.A = a;

    b.Items.Add(1, a);

    Assert.AreSame(a, b.A);
    Assert.AreSame(b.A, b.Items[1]);

    B deserializedB;

    using (var stream = new MemoryStream())
    {
        Serializer.Serialize(stream, b);
        stream.Seek(0, SeekOrigin.Begin);
        deserializedB = Serializer.Deserialize<B>(stream);
    }

    Assert.AreSame(deserializedB.A, deserializedB.Items[1]);
}

Class definitions

[Serializable]
[ProtoContract]
public class A
{
}

[Serializable]
[ProtoContract]
public class B
{
    [ProtoMember(1, AsReference = true)]
    public A A { get; set; }

    [ProtoMember(2, AsReference = true)]
    public Dictionary<int, A> Items { get; set; }

    public B()
    {
        Items = new Dictionary<int, A>();
    }
}
Lee F asked Jan 21 '13 10:01


1 Answer

Edit: this should work from the next build onwards simply by marking the type's AsReferenceDefault:

[ProtoContract(AsReferenceDefault=true)]
public class A
{
    // ...
}

At the current time this is sort of an unsupported scenario - at least, via the attributes it is unsupported; basically, the AsReference=true currently is referring to the KeyValuePair<int,A>, which doesn't really make sense since KeyValuePair<int,A> is a value-type (so this can never be treated as a reference; I've added a better message for that in my local copy).
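To illustrate the value-type point, here is a minimal sketch that uses only the base class library (no protobuf-net involved). The class name ValueTypeDemo is hypothetical and A is an empty stand-in for the class in the question. It shows that the pair itself has no object identity that could be tracked, while the A instance stored inside it does:

```csharp
using System;
using System.Collections.Generic;

public class A { }

public static class ValueTypeDemo
{
    public static void Main()
    {
        var a = new A();
        var pair = new KeyValuePair<int, A>(1, a);

        // KeyValuePair<,> is a struct, so the pair has no identity of its own.
        Console.WriteLine(typeof(KeyValuePair<int, A>).IsValueType); // True

        // Every boxing of the struct produces a new, distinct object...
        object box1 = pair;
        object box2 = pair;
        Console.WriteLine(ReferenceEquals(box1, box2)); // False

        // ...whereas the A held in the pair is still the single shared instance.
        Console.WriteLine(ReferenceEquals(pair.Value, a)); // True
    }
}
```

This is why reference tracking only makes sense on the Value member of the pair, not on the pair as a whole.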

Because KeyValuePair<int,A> acts (by default) as a tuple, there is currently nowhere to support the AsReference information, but that is a scenario I would like to support better, and I will be investigating this.

There was also a bug that meant that AsReference on tuples (even reference-type tuples) was getting out-of-order, but I've fixed that locally; this was where the "changed" message came from.

In theory, the work for me to do this isn't huge; the fundamentals already work, and oddly enough it came up separately on twitter last night too - I guess "dictionary pointing to an object" is a very common scenario. At a guess, I imagine I'll add some attribute to help describe this situation, but you can actually hack around it at the moment using a couple of different routes:

1: configure KeyValuePair<int,A> manually:

[Test]
public void ExecuteHackedViaFields()
{
    // I'm using separate models **only** to keep them clean between tests;
    // normally you would use RuntimeTypeModel.Default
    var model = TypeModel.Create();

    // configure using the fields of KeyValuePair<int,A>
    var type = model.Add(typeof(KeyValuePair<int, A>), false);
    type.Add(1, "key");
    type.AddField(2, "value").AsReference = true;

    // or just remove AsReference on Items
    model[typeof(B)][2].AsReference = false;

    Execute(model);
}

I don't like this much, because it exploits implementation details of KeyValuePair<,> (the private fields), and may not work between .NET versions. I would prefer to replace KeyValuePair<,> on the fly via a surrogate:

[Test]
public void ExecuteHackedViaSurrogate()
{
    // I'm using separate models **only** to keep them clean between tests;
    // normally you would use RuntimeTypeModel.Default
    var model = TypeModel.Create();

    // or just remove AsReference on Items
    model[typeof(B)][2].AsReference = false;

    // this is the evil bit: configure a surrogate for KeyValuePair<int,A>
    model[typeof(KeyValuePair<int, A>)].SetSurrogate(typeof(RefPair<int, A>));
    Execute(model);
}

[ProtoContract]
public struct RefPair<TKey,TValue> {
    [ProtoMember(1)]
    public TKey Key {get; private set;}
    [ProtoMember(2, AsReference = true)]
    public TValue Value {get; private set;}
    public RefPair(TKey key, TValue value) : this() {
        Key = key;
        Value = value;
    }
    public static implicit operator KeyValuePair<TKey,TValue>
        (RefPair<TKey,TValue> val)
    {
        return new KeyValuePair<TKey,TValue>(val.Key, val.Value);
    }
    public static implicit operator RefPair<TKey,TValue>
        (KeyValuePair<TKey,TValue> val)
    {
        return new RefPair<TKey,TValue>(val.Key, val.Value);
    }
}

This configures something to use instead of KeyValuePair<int,A> (converted via the operators).

In both of these, Execute is just:

private void Execute(TypeModel model)
{
    A a = new A();
    B b = new B();

    b.A = a;

    b.Items.Add(1, a);

    Assert.AreSame(a, b.A);
    Assert.AreSame(b.A, b.Items[1]);

    B deserializedB = (B)model.DeepClone(b);

    Assert.AreSame(deserializedB.A, deserializedB.Items[1]);
}

I do, however, want to add direct support. The good thing about both of the above is that when I get time to do that, you just have to remove the custom configuration code.

For completeness, if your code is using Serializer.* methods, then rather than create / configure a new model, you should configure the default model:

RuntimeTypeModel.Default.Add(...); // etc

Serializer.* is basically a short-cut to RuntimeTypeModel.Default.*.
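As a rough sketch of what that could look like for the surrogate route, assuming the A, B and RefPair<,> types shown earlier are in scope and that the build in use includes the tuple fix mentioned above: the names ProtoConfig and EnsureConfigured are hypothetical, and only calls already used in this answer are relied on.

```csharp
using System.Collections.Generic;
using ProtoBuf.Meta;

// Hypothetical helper: applies the surrogate configuration to the default
// model exactly once, so that plain Serializer.* calls pick it up.
public static class ProtoConfig
{
    private static readonly object Gate = new object();
    private static bool configured;

    // Assumption: call this at startup, before anything serializes a B,
    // since the model should be configured before it is first used.
    public static void EnsureConfigured()
    {
        lock (Gate)
        {
            if (configured) return;

            RuntimeTypeModel model = RuntimeTypeModel.Default;

            // Same two steps as ExecuteHackedViaSurrogate, on the default model.
            model[typeof(B)][2].AsReference = false;
            model[typeof(KeyValuePair<int, A>)].SetSurrogate(typeof(RefPair<int, A>));

            configured = true;
        }
    }
}
```

After ProtoConfig.EnsureConfigured() has run, the original test's Serializer.Serialize and Serializer.Deserialize<B> calls would go through the configured default model without further changes.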

Finally: you should not create a new TypeModel per call; that would hurt performance. You should create and configure one model instance, and re-use it lots. Or just use the default model.

Marc Gravell answered Sep 17 '22 12:09