Simple data file versioning with DataContractSerializer

Having read the "Data Contract Versioning" article, we concluded that it doesn't tell the whole story. For example, what happens if you used to have ValueA, and in the new version it's now called ValueB and is of a different type, and you need to convert ValueA to ValueB?

There are some callbacks I could use to help with this, but it doesn't look like a very maintainable solution if we expect the format to change frequently over a long period of time.

The solution we settled on is to keep a "saved by version" field and, upon loading the file, invoke conversion routines specific to older versions as required. These conversion routines know how to convert XML for older data to XML for newer data.
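A minimal sketch of that approach, written in Python rather than .NET for brevity. The element names (`SavedByVersion`, `ValueA`, `ValueB`, `Comment`), the version numbers, and the conversion rules are all hypothetical; only the overall shape (a chain of per-version upgrade routines driven by the stored version) reflects what is described above. Note that it simply appends new elements at the end of the parent, so it says nothing about element ordering.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = 3  # hypothetical current format version


def upgrade_1_to_2(root):
    # Hypothetical change: ValueA (text) was replaced by ValueB (an integer).
    old = root.find("ValueA")
    if old is not None:
        root.remove(old)
        # Illustrative conversion rule only: store the length of the old text.
        ET.SubElement(root, "ValueB").text = str(len(old.text or ""))


def upgrade_2_to_3(root):
    # Hypothetical change: a Comment element was added, defaulting to empty.
    if root.find("Comment") is None:
        ET.SubElement(root, "Comment").text = ""


# Each entry upgrades a document from version N to version N + 1.
UPGRADES = {1: upgrade_1_to_2, 2: upgrade_2_to_3}


def upgrade_to_current(xml_text):
    """Apply every upgrade step between the saved version and the current one."""
    root = ET.fromstring(xml_text)
    # Files without a version marker are assumed to be version 1.
    version = int(root.findtext("SavedByVersion", default="1"))
    if version > CURRENT_VERSION:
        raise ValueError(f"file was saved by a newer version: {version}")
    while version < CURRENT_VERSION:
        UPGRADES[version](root)
        version += 1
    # Record the version the XML now conforms to.
    marker = root.find("SavedByVersion")
    if marker is None:
        marker = ET.SubElement(root, "SavedByVersion")
    marker.text = str(version)
    return ET.tostring(root, encoding="unicode")
```

Because each routine only knows about one step, the 100th breaking change adds one more small function instead of touching the earlier ones.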

However, as it turns out, DataContractSerializer requires the order of the elements to be exactly what it expects. This means our conversion process must know to insert elements into exactly the correct location. This is a lot harder than simply adding an element with a known name, if you take inheritance into account. With inheritance, you can't reliably AddBeforeSelf or AddAfterSelf any field, simply because there isn't a single field that is always next to this new field.

Leaving aside the reasons why DataContractSerializer was made so strict, can you please suggest ways around this? Perhaps a great article on how to remain backwards-compatible with very old data contracts, that doesn't become unwieldy at the point where you made the 100th breaking change to the format.

There are some extra guidelines in this article, but this must have been written for a different purpose. There is for example no way we can leave old data members hanging around forever (point 9). It appears that most such articles are written from a communication protocol point of view, rather than storing data in a file.

Roman Starkov asked Oct 03 '09 12:10


2 Answers

1 year later I have to say that DataContractSerializer has really sucked for versioning. It's far too rigid. It's really meant for contracts that aren't very likely to change, and then only in specific ways. You have to do extra work to use it just to make it fast - like the KnownTypeAttribute for example. I would only recommend it if you require relatively fast serialization - which, arguably, is rather important for what it was designed for.

Another project I work on uses a more flexible serializer which, for example, doesn't skip calling the class constructor (something that has caused much inconvenience), and doesn't require items to be in a specific order. It deals gracefully with new fields (they are left at whatever the constructor set them to) and removed fields with zero programmer intervention.

Now if only I could post it here... It is however about 5x-10x slower than the DataContractSerializer.

Roman Starkov answered Sep 17 '22 10:09


I think you're expecting too much from the built-in versioning support. It's really intended to allow you to add new members while retaining all existing functionality and therefore members.

In the case of breaking changes to a contract, you'd probably be better off creating a new version of the contract (e.g. using a new namespace - a common convention is to use a suffix yyyy/mm, e.g. http://mycompany.com/myservices/2009/10).

You then need to be able to support as many old contracts as is appropriate, and need to be able to convert between each supported contract and whatever current internal representation you are using.
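As a rough illustration of that idea (in Python rather than .NET, purely as a sketch): the loader below picks a reader based on the root element's namespace and converts every supported contract into one internal representation. The `2009/05` namespace, the `Settings` class, the element names, and the conversion rule are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Namespaces following the yyyy/mm suffix convention; 2009/05 is made up.
NS_2009_05 = "http://mycompany.com/myservices/2009/05"
NS_2009_10 = "http://mycompany.com/myservices/2009/10"


@dataclass
class Settings:
    """Hypothetical current internal representation."""
    value_b: int


def read_2009_05(root):
    # Old contract stored ValueA as text; the length rule is illustrative only.
    text = root.findtext(f"{{{NS_2009_05}}}ValueA", default="")
    return Settings(value_b=len(text))


def read_2009_10(root):
    # Current contract stores ValueB as an integer.
    text = root.findtext(f"{{{NS_2009_10}}}ValueB", default="0")
    return Settings(value_b=int(text))


# One reader per supported contract version.
READERS = {NS_2009_05: read_2009_05, NS_2009_10: read_2009_10}


def load(xml_text):
    """Dispatch on the root namespace and return the internal representation."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes a namespaced tag as "{namespace}localname".
    namespace = root.tag[1:].partition("}")[0] if root.tag.startswith("{") else ""
    try:
        reader = READERS[namespace]
    except KeyError:
        raise ValueError(f"unsupported contract namespace: {namespace!r}") from None
    return reader(root)
```

Dropping support for an old contract then amounts to removing its entry from `READERS`.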

Joe answered Sep 20 '22 10:09