Why is writing to a MemoryStream slower than to a file?

In my Azure role code I download a 400-megabyte file that is split into 10-megabyte chunks and stored in Blob Storage. I use CloudBlob.DownloadToStream() for the download.

I tried two options. One is using a FileStream - I create a "write" FileStream and download the chunks one by one into the same stream without rewinding, so I end up with the original file. The other option is creating a MemoryStream with an initial capacity slightly larger than the original file size (to avoid reallocations) and downloading the chunks into that MemoryStream - this way I end up with a MemoryStream holding the original file data.

Here's some pseudocode:

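// StreamOfChoice stands for either FileStream or MemoryStream
var writeStream = new StreamOfChoice( params );
foreach( var uri in urisToDownload ) {
    // each chunk is appended at the stream's current position
    blobContainer.GetBlobReference( uri ).DownloadToStream( writeStream );
}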

Now the only difference is that it's a FileStream in one case and a MemoryStream in the other; everything else is the same. It takes about 20 seconds with a FileStream and about 30 seconds with a MemoryStream - yes, the FileStream turns out to be faster. According to the \Memory\Available Bytes performance counter, the virtual machine has about 1 gigabyte of memory available just before the MemoryStream is created, so it's not due to paging.
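One way to check whether the stream type itself explains the gap is to time the writes locally, with the blob download taken out of the picture. The sketch below is a hypothetical harness, not the code from the role: StreamWriteTiming and TimeWrites are made-up names, a zero-filled buffer stands in for each downloaded chunk, and the file goes to a temp path. It allocates roughly 400 MB for the MemoryStream, so it assumes a 64-bit process with enough free memory. File writes may also be absorbed by the OS cache, so the numbers are only indicative.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class StreamWriteTiming
{
    // Writes chunkCount copies of chunk to target and returns the elapsed time.
    public static TimeSpan TimeWrites(Stream target, byte[] chunk, int chunkCount)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < chunkCount; i++)
        {
            target.Write(chunk, 0, chunk.Length);
        }
        target.Flush();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    static void Main()
    {
        const int chunkSize = 10 * 1024 * 1024; // 10 MB, as in the question
        const int chunkCount = 40;              // 40 chunks = 400 MB
        var chunk = new byte[chunkSize];        // stand-in for one downloaded chunk

        string path = Path.GetTempFileName();   // assumption: temp dir has 400 MB free
        try
        {
            using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                Console.WriteLine("FileStream:   " + TimeWrites(file, chunk, chunkCount));
            }
        }
        finally
        {
            File.Delete(path);
        }

        // Pre-size the MemoryStream slightly above the total to avoid reallocations.
        using (var memory = new MemoryStream(chunkSize * chunkCount + 1024))
        {
            Console.WriteLine("MemoryStream: " + TimeWrites(memory, chunk, chunkCount));
        }
    }
}
```

If the MemoryStream is clearly faster in this local run, the difference seen in the role more likely comes from the environment (for example, how the virtual machine's memory is backed) than from the stream classes.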

Why would writing to a file be faster than to a MemoryStream?

sharptooth asked Aug 17 '12 13:08


1 Answer

Jon is probably on the ball there. The most likely explanation is:

  1. The memory is actually paged out to disk by the hypervisor.
  2. The hypervisor swap file is on a slower disk (say, a local disk).
  3. The file system of the VM is on a fast enterprise disk (say, a SAN).

Regardless of whether memory is quicker or not, you really shouldn't allocate such large blocks of memory. Have a read about the large object heap (LOH) vs the small object heap (SOH).
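To make the LOH point concrete: a MemoryStream created with a capacity is backed by a single byte[] of that size, and on the CLR objects of roughly 85,000 bytes or more are allocated on the large object heap, which is collected together with generation 2. The sketch below is only an illustration (LohCheck and GenerationOfNewArray are made-up names) that asks the GC which generation it reports for a small array and for a 10 MB one.

```csharp
using System;

static class LohCheck
{
    // Returns the GC generation reported for a freshly allocated byte[] of the given size.
    public static int GenerationOfNewArray(int sizeInBytes)
    {
        var buffer = new byte[sizeInBytes];
        return GC.GetGeneration(buffer);
    }

    static void Main()
    {
        // Small array: expected to start on the small object heap.
        Console.WriteLine("1 KB array:  gen " + GenerationOfNewArray(1024));

        // 10 MB array (one chunk from the question): expected to land on the LOH.
        Console.WriteLine("10 MB array: gen " + GenerationOfNewArray(10 * 1024 * 1024));
    }
}
```

A 400 MB buffer ends up in the same place as the 10 MB one, just forty times larger, which is why writing the chunks to a file or processing them one at a time is usually kinder to the heap.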

M Afifi answered Nov 15 '22 18:11