
Most efficient way of reading a BinaryFormatter serialized object from a NetworkStream?

I have an app that is sending serializable objects of varying sizes over a socket connection, and I'd like it to be as scalable as possible. There could also be dozens or even hundreds of connections.

  1. The NetworkStream is coming from a TcpClient that is continuously listening for incoming messages (see the accept-loop sketch after this list).
  2. I don't want to block a thread with the standard NetworkStream.Read(). This needs to scale. I'm only assuming that Read() blocks, because that's pretty standard behavior for this sort of class, and there's a ReadTimeout property on the class.
  3. I'm not sure if BinaryFormatter just uses Read() or if it does some of the Async stuff for me under the hood. My guess is no.
  4. The TcpClient needs to get a message, read it to the end, then go back to listening for messages.
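
For reference, the accept side looks roughly like this (a hypothetical sketch; the port number and HandleClientAsync are made-up names, and async Main needs C# 7.1+):

using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

class Server
{
    static async Task Main()
    {
        var listener = new TcpListener(IPAddress.Any, 9000); // port is arbitrary
        listener.Start();
        while (true)
        {
            // Await a connection without blocking a thread, then hand the
            // client off to its own handler task so we can keep accepting.
            TcpClient client = await listener.AcceptTcpClientAsync();
            _ = HandleClientAsync(client); // fire-and-forget, one task per connection
        }
    }

    static async Task HandleClientAsync(TcpClient client)
    {
        using (client)
        {
            var netStream = client.GetStream();
            // ...one of the read options below goes here...
            await Task.Yield();
        }
    }
}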

So it seems like there are too many ways to skin this cat, and I'm not sure what is really going to be the most efficient. Do I:

Simply use the BinaryFormatter to read the NetworkStream?

var netStream = client.GetStream();
var formatter = new BinaryFormatter();
var obj = formatter.Deserialize(netStream);
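
(Side note: I could presumably push that blocking Deserialize onto the thread pool, as in the sketch below, but that still ties up a pool thread per read, which is exactly what point 2 is trying to avoid.)

// Hypothetical variation: offload the blocking Deserialize to a pool thread.
// The caller stays responsive, but a thread is still consumed for the
// duration of the read, so this doesn't really fix the scalability concern.
var netStream = client.GetStream();
var formatter = new BinaryFormatter();
var obj = await Task.Run(() => formatter.Deserialize(netStream));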

OR Do some magic with the new async/await stuff:

using(var ms = new MemoryStream()) 
{
   var netStream = client.GetStream();
   var buffer = new byte[1024];
   int bytesRead;
   // ReadAsync only returns 0 once the remote side closes the connection,
   // so as written this buffers everything until the socket shuts down.
   while((bytesRead = await netStream.ReadAsync(buffer, 0, buffer.Length)) > 0) {
      ms.Write(buffer, 0, bytesRead); // write only the bytes actually read
   }
   ms.Position = 0; // rewind before deserializing
   var formatter = new BinaryFormatter();
   var obj = formatter.Deserialize(ms);
}

OR Similar to the above, only leveraging the new CopyToAsync method:

using(var ms = new MemoryStream()) 
{
   var netStream = client.GetStream();
   await netStream.CopyToAsync(ms); //4096 default buffer.
   ms.Position = 0; // rewind before deserializing
   var formatter = new BinaryFormatter();
   var obj = formatter.Deserialize(ms);
}

OR Something else?

I'm looking for the answer that provides the most scalability/efficiency.

[Note: The above is all PSEUDO-code, given as examples]

asked Jan 08 '13 by Ben Lesh

3 Answers

The first approach has a problem with large streams. If you're ever going to send large amounts of data, that code will blow up the application with an out-of-memory exception.

The second approach looks very good: it is asynchronous (meaning you don't tie up valuable threads waiting for a read to complete), and it works with chunks of data (which is how you're supposed to work with a stream).

So go for the second option, maybe with a slight modification: deserialize only a chunk of data at a time rather than reading the whole thing (unless you're absolutely sure about the stream length).

This is what I have in mind (pseudo-code)

using (var networkStream = client.GetStream()) //get access to stream
{
    var buffer = new byte[1234]; //get a buffer
    int bytesRead;
    // NetworkStream has no EndOfStream property; ReadAsync returns 0
    // once the remote side closes the connection.
    while ((bytesRead = await networkStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
    {
        //om nom nom buffer
        Foo obj;
        using (var ms = new MemoryStream()) //process just one chunk
        {
            ms.Write(buffer, 0, bytesRead);
            ms.Position = 0; //rewind before deserializing
            var formatter = new BinaryFormatter();
            obj = (Foo)formatter.Deserialize(ms); //deserialize the object
        } // dispose memory

        //async send obj up for further processing
    }
}
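
One caveat I should add (my assumption, not something stated in the question): BinaryFormatter can't tell where one message ends and the next begins on a raw socket, so the chunk-at-a-time loop above only works if the sender frames each message, e.g. with a 4-byte length prefix. A rough sketch of the receive side:

using System;
using System.IO;
using System.Net.Sockets;
using System.Threading.Tasks;

// Assumed wire format (not from the question): a 4-byte length prefix,
// followed by the BinaryFormatter payload of exactly that many bytes.
static async Task<byte[]> ReadMessageAsync(NetworkStream stream)
{
    var header = await ReadExactAsync(stream, 4);
    int length = BitConverter.ToInt32(header, 0);
    return await ReadExactAsync(stream, length);
}

// ReadAsync may return fewer bytes than requested, so loop until full.
static async Task<byte[]> ReadExactAsync(NetworkStream stream, int count)
{
    var buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = await stream.ReadAsync(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed mid-message.");
        offset += read;
    }
    return buffer;
}

Each payload then deserializes cleanly from its own MemoryStream, and the loop goes straight back to waiting for the next message.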
answered Nov 12 '22 by oleksii


The async/await stuff lets you avoid blocking threads while waiting on resources, so in general it will scale better than thread-blocking versions.

answered Nov 12 '22 by Scott Stevens


Async will scale better if there are hundreds of concurrent operations running.

It will be slower serially, though: async has overhead that is easily detected in benchmarks. Prefer option 1 if you don't need the scalability of option 2.
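
If you want to see that overhead yourself, here is a minimal sketch (my addition, assuming C# 7.1+ for async Main) that times synchronous vs. asynchronous reads over an in-memory stream, where there is no real I/O to wait on:

using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

class AsyncOverheadBenchmark
{
    static async Task Main()
    {
        var data = new byte[64 * 1024];
        var buffer = new byte[1024];

        // Synchronous reads: no state machine, no scheduling.
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < 10000; i++)
        {
            var ms = new MemoryStream(data);
            while (ms.Read(buffer, 0, buffer.Length) > 0) { }
        }
        sw.Stop();
        Console.WriteLine($"sync:  {sw.ElapsedMilliseconds} ms");

        // Asynchronous reads: same work plus the async machinery per call.
        sw.Restart();
        for (int i = 0; i < 10000; i++)
        {
            var ms = new MemoryStream(data);
            while (await ms.ReadAsync(buffer, 0, buffer.Length) > 0) { }
        }
        sw.Stop();
        Console.WriteLine($"async: {sw.ElapsedMilliseconds} ms");
    }
}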

answered Nov 12 '22 by usr