
Why is copying a stream and then deserializing using a BinaryFormatter faster than just deserializing?

This code takes about 8 seconds with a stream of about 65 KB that comes from a blob in a database:

private string[] GetArray(Stream stream)
{
    BinaryFormatter binaryFormatter = new BinaryFormatter();
    object result = binaryFormatter.Deserialize(stream);
    return (string[])result;
}

This code takes a few milliseconds:

private string[] GetArray(Stream stream)
{
    BinaryFormatter binaryFormatter = new BinaryFormatter();
    MemoryStream memoryStream = new MemoryStream();
    stream.CopyTo(memoryStream); // copy the whole blob into memory first (Stream.CopyTo needs .NET 4 or later)
    memoryStream.Position = 0;
    object result = binaryFormatter.Deserialize(memoryStream);
    return (string[])result;
}

Why?

Lars Peder Amlie asked Jan 09 '13 09:01


1 Answer

So you say the problem disappears when the database is taken out of the equation. Here is my theory:

BinaryFormatter reads from the stream in tiny increments. It has to read as little as possible so that it does not accidentally swallow a few bytes after the serialized object. That means it is issuing tons of read commands (I verified this with Reflector).
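One way to check this without a decompiler is to count the reads yourself. The following is a minimal sketch, not something from the original post: CountingStream is a hypothetical wrapper that forwards everything to an inner stream and records how many read calls the consumer issues and how many bytes they return.

```csharp
using System;
using System.IO;

// Hypothetical diagnostic wrapper (an assumption for illustration, not part
// of the original post): forwards reads to an inner stream and counts them.
public sealed class CountingStream : Stream
{
    private readonly Stream _inner;

    public int ReadCalls { get; private set; }   // number of Read/ReadByte calls seen
    public long BytesRead { get; private set; }  // total bytes actually returned

    public CountingStream(Stream inner) { _inner = inner; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        ReadCalls++;
        int n = _inner.Read(buffer, offset, count);
        BytesRead += n;
        return n;
    }

    public override int ReadByte()
    {
        ReadCalls++;
        int b = _inner.ReadByte();
        if (b >= 0) BytesRead++;
        return b;
    }

    // The remaining members simply delegate; writing is not supported.
    public override bool CanRead { get { return _inner.CanRead; } }
    public override bool CanSeek { get { return _inner.CanSeek; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { return _inner.Length; } }
    public override long Position
    {
        get { return _inner.Position; }
        set { _inner.Position = value; }
    }
    public override void Flush() { _inner.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { return _inner.Seek(offset, origin); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}
```

Passing `new CountingStream(stream)` to `binaryFormatter.Deserialize` and then printing `ReadCalls` next to `BytesRead` should show whether the formatter really asks for only a few bytes per call on your runtime.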

Probably, every read of the blob stream is causing a network roundtrip (or some other major overhead). That gives you thousands of roundtrips (at most one per byte, so up to roughly 65,000 for a 65 KB blob) if you use BinaryFormatter on the blob stream directly.

Buffering first causes the network to be utilized more efficiently because the read buffer size is much bigger.
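As a rough sketch of that buffering step, assuming the blob is small enough to hold in memory: BufferFully below is a hypothetical helper name, and the 81920-byte chunk size is only an illustrative default. It drains the source in large chunks with Stream.CopyTo and returns a rewound MemoryStream that the formatter can then read from in small pieces at no network cost.

```csharp
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper (an assumption for illustration): reads the whole
    // source stream in large chunks and returns an in-memory copy at position 0.
    public static MemoryStream BufferFully(Stream source, int chunkSize = 81920)
    {
        var memory = new MemoryStream();
        source.CopyTo(memory, chunkSize); // few large reads instead of many tiny ones
        memory.Position = 0;              // rewind so the consumer starts at the beginning
        return memory;
    }
}
```

Calling `binaryFormatter.Deserialize(StreamBuffering.BufferFully(stream))` is then equivalent to the fast version in the question. Wrapping the source in a `BufferedStream` with a large buffer should have a similar effect without copying everything up front, though it may read past the end of the serialized object, which is exactly what BinaryFormatter tries to avoid on its own.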

usr answered Sep 22 '22 11:09