In my Azure role code I download a 400-megabyte file that is split into 10-megabyte chunks stored in Blob Storage. I use CloudBlob.DownloadToStream() for the download.
I tried two options. The first uses a FileStream: I create a "write" FileStream and download the chunks one by one into the same stream without rewinding, so I end up with the original file. The second creates a MemoryStream whose initial capacity is slightly larger than the original file size (to avoid reallocations) and downloads the chunks into it, so I end up with a MemoryStream holding the original file data.
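Here's some pseudocode:
    // StreamOfChoice is either FileStream or MemoryStream
    var writeStream = new StreamOfChoice( params );
    foreach( var uri in urisToDownload ) {
        // each chunk is appended at the stream's current position
        blobContainer.GetBlobReference( uri ).DownloadToStream( writeStream );
    }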
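The same append-without-rewinding pattern can be tried without Blob Storage. The following is only a minimal sketch under stated assumptions: plain byte arrays stand in for the downloaded chunks, the sizes are scaled down to 1 MB, and the names ChunkAssembly and AppendChunks are hypothetical, not part of the Azure SDK. It shows that a MemoryStream created with enough capacity up front does not need to grow while the chunks are appended.

```csharp
using System;
using System.IO;

static class ChunkAssembly
{
    // Hypothetical helper: appends every chunk to the target stream in order,
    // never rewinding, so the stream ends up holding the reassembled data.
    public static void AppendChunks(Stream target, byte[][] chunks)
    {
        foreach (var chunk in chunks)
        {
            target.Write(chunk, 0, chunk.Length);
        }
    }

    public static void Main()
    {
        const int chunkSize = 1024 * 1024; // 1 MB stand-in for the 10 MB chunks
        const int chunkCount = 4;          // assumption: small count for a quick run
        var chunks = new byte[chunkCount][];
        for (int i = 0; i < chunkCount; i++)
        {
            chunks[i] = new byte[chunkSize];
        }

        // Option 1: MemoryStream pre-sized slightly above the total size.
        using (var memory = new MemoryStream(chunkSize * chunkCount + 1024))
        {
            int capacityBefore = memory.Capacity;
            AppendChunks(memory, chunks);
            Console.WriteLine("Bytes in memory: " + memory.Length);
            Console.WriteLine("Capacity unchanged: " + (memory.Capacity == capacityBefore));
        }

        // Option 2: FileStream opened for writing on a temporary file.
        string path = Path.GetTempFileName();
        try
        {
            using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                AppendChunks(file, chunks);
            }
            Console.WriteLine("Bytes on disk: " + new FileInfo(path).Length);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Wrapping each option in a System.Diagnostics.Stopwatch would give a rough local timing, though it would not reproduce the network side of the real download.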
Now the only difference is that the stream is a FileStream in one case and a MemoryStream in the other; everything else is the same. It takes about 20 seconds with a FileStream and about 30 seconds with a MemoryStream. Yes, the FileStream turns out to be faster. According to the \Memory\Available Bytes performance counter, the virtual machine has about 1 gigabyte of memory available just before the MemoryStream is created, so paging is not the cause.
Why would writing to a file be faster than writing to a MemoryStream?
Jon is probably on the ball there. The most likely explanation is,
Regardless of whether memory is quicker or not, you really shouldn't allocate such large blocks of memory. Have a read about the Large Object Heap (LOH) versus the Small Object Heap (SOH).
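To make the LOH remark concrete, here is a small sketch, not a diagnosis of the timing difference. It relies on two documented .NET behaviours: objects of 85,000 bytes or more are allocated on the Large Object Heap, and GC.GetGeneration reports such objects as belonging to the highest generation. The names LohDemo and IsOnLargeObjectHeap are hypothetical, and the check is only meaningful for a freshly allocated buffer, since a long-lived small object can also be promoted to that generation.

```csharp
using System;

static class LohDemo
{
    // Hypothetical helper: a freshly allocated array that is already reported
    // in the highest generation was placed on the Large Object Heap.
    public static bool IsOnLargeObjectHeap(byte[] freshBuffer)
    {
        return GC.GetGeneration(freshBuffer) == GC.MaxGeneration;
    }

    public static void Main()
    {
        var small = new byte[1000];     // well below the 85,000-byte threshold
        var large = new byte[100000];   // above the threshold

        Console.WriteLine("small on LOH: " + IsOnLargeObjectHeap(small));
        Console.WriteLine("large on LOH: " + IsOnLargeObjectHeap(large));
    }
}
```

A MemoryStream constructed with a capacity of roughly 400 MB is backed by a single byte array of that size, so by the same rule its buffer lives on the LOH.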