
Determining the serialized size of a .NET type and unmanaged memory efficiency

My question is whether it is possible to determine the serialized size (in bytes) of a reference type.

Here's the situation:

I am using the BinaryFormatter class to serialize basic .NET types, for instance:

[Serializable]
public class Foo
{
    public string Foo1 { get; set; }
    public string Foo2 { get; set; } 
}

I serialize each item to a byte[], append that segment to the end of an existing byte[], and then add a carriage return after each segment to delimit the objects.

In order to deserialize I use Marshal.ReadByte() as follows:

List<byte> buffer = new List<byte>();

for (int i = 0; i < MapSize; i++)
{
    byte b = Marshal.ReadByte(readPtr , i); 

    if (b != delim)  // read until encounter a carriage return 
        buffer.Add(b);
    else
        break;
}

readPtr = readPtr + buffer.Count + 1; // incrementing the pointer for the next object

return buffer.ToArray(); 

I believe that using Marshal.Copy() would be more efficient, but I need to know the length of the serialized byte segment in advance. Is there a way I can reliably compute this from the type that's being serialized, or is there an overall more efficient method I can use?

Also, a carriage return won't ultimately be a reliable delimiter. So I am wondering whether there is a more standard way to delimit the objects, either by customizing my BinaryFormatter or by following some other standardized best practice. For instance, is there a specific way that the BinaryFormatter delimits objects if it's serializing, say, a generic List<>?

Sean Thoman asked Nov 15 '25 12:11


2 Answers

There isn't a terribly good way to determine the serialized length beforehand. The specification for the BinaryFormatter protocol is available here: http://msdn.microsoft.com/en-us/library/cc236844(v=prot.10).aspx

I'll save you the trouble of reading it for your purposes:

  1. It's built to be an extensible format. This allows you to add fields later and still maintain some compatibility with earlier implementations. For your purposes, this means that the length of the serialized form is not fixed over time.
  2. It's extremely fragile. The binary format actually encodes the names of the fields in it. If you ever rename a field, the length of the serialized form will change.
  3. The binary format allows a many-to-one relationship between serialized encodings and object data. The same object could potentially be encoded in a number of different ways, each with a different byte count for the output (I won't get into why it's written that way).

If you want an easy way to do things, just create an array that contains all the objects and serialize that single array. This solves most of your problems. All the issues of delimiting the different objects are handled by the BinaryFormatter. You won't have excessive memory copying. The final output will be more compact because the BinaryFormatter only has to specify the field names once per invocation.

Finally, I can tell you that the extra memory copy is not the main source of inefficiency in your current implementation. You're getting far more inefficiency from the BinaryFormatter's use of reflection, and the fact that it encodes the field names in the serialized output.

If efficiency is paramount, then I would suggest writing some custom code that encodes the contents of your structures in "plain old data" format. Then you'll have control over how much gets written and how.
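As a rough illustration of the "plain old data" idea, here is a minimal sketch that uses BinaryWriter and BinaryReader instead of BinaryFormatter. The FooCodec class and its Encode/Decode methods are hypothetical names made up for this example, and it assumes that treating a null string as an empty string is acceptable. BinaryWriter.Write(string) stores a length prefix followed by the encoded bytes, so no field names or type metadata end up in the output.

```csharp
using System;
using System.IO;
using System.Text;

public class Foo
{
    public string Foo1 { get; set; }
    public string Foo2 { get; set; }
}

// Hypothetical helper for this sketch; not part of any library.
public static class FooCodec
{
    // Writes only the two string values, in a fixed order.
    public static byte[] Encode(Foo foo)
    {
        using (var ms = new MemoryStream())
        {
            using (var writer = new BinaryWriter(ms, Encoding.UTF8))
            {
                // Each string is written as a length prefix plus its UTF-8 bytes.
                // Assumption: null is stored as an empty string.
                writer.Write(foo.Foo1 ?? string.Empty);
                writer.Write(foo.Foo2 ?? string.Empty);
            }
            // ToArray still works after the stream has been closed.
            return ms.ToArray();
        }
    }

    // Reads the two strings back in the same order they were written.
    public static Foo Decode(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data), Encoding.UTF8))
        {
            var foo = new Foo();
            foo.Foo1 = reader.ReadString();
            foo.Foo2 = reader.ReadString();
            return foo;
        }
    }
}
```

With this layout the encoded size is just the UTF-8 byte counts of the two strings plus their length prefixes, so it can be computed before writing if you need it.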

Kennet Belenky answered Nov 18 '25 20:11


Using a byte as a delimiter for binary serialized data is an awful idea: 13 (carriage return) is a perfectly valid value that can appear inside the serialized data, not just as your "delimiter".

Instead, prefix each block with its size in bytes and read the data block by block.
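A minimal sketch of that length-prefix approach against unmanaged memory, which also lets the reader use Marshal.Copy as the question wanted. The LengthPrefixedBlocks class and its method names are made up for illustration, and the 4-byte Int32 prefix is an assumption; any fixed-width size field would work.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for this sketch; not part of any library.
public static class LengthPrefixedBlocks
{
    // Writes [Int32 length][payload] at ptr and returns the pointer just past the block.
    public static IntPtr WriteBlock(IntPtr ptr, byte[] payload)
    {
        Marshal.WriteInt32(ptr, payload.Length);
        Marshal.Copy(payload, 0, IntPtr.Add(ptr, sizeof(int)), payload.Length);
        return IntPtr.Add(ptr, sizeof(int) + payload.Length);
    }

    // Reads the block at ptr in one copy and advances ptr to the next block.
    public static byte[] ReadBlock(ref IntPtr ptr)
    {
        int length = Marshal.ReadInt32(ptr);
        byte[] payload = new byte[length];
        Marshal.Copy(IntPtr.Add(ptr, sizeof(int)), payload, 0, length);
        ptr = IntPtr.Add(ptr, sizeof(int) + length);
        return payload;
    }
}
```

Because the reader learns the length before touching the payload, the payload may contain any byte value, including 13, and there is no byte-by-byte scan for a delimiter.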

Alexei Levenkov answered Nov 18 '25 19:11


