
What defines the capacity of a memory stream

I was calculating the size of an object (a List that is being populated) using the following code:

 long myObjectSize = 0;
 System.IO.MemoryStream memoryStreamObject = new System.IO.MemoryStream();
 System.Runtime.Serialization.Formatters.Binary.BinaryFormatter binaryBuffer = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
 binaryBuffer.Serialize(memoryStreamObject, myListObject);
 myObjectSize = memoryStreamObject.Position;

At the initial point, the capacity of the memoryStreamObject was 1024.

Later (after adding more elements to the list) it was shown as 2048.

And it seems to keep increasing as the stream content grows. What, then, is the purpose of Capacity in this scenario?
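The three numbers involved can be inspected directly. Below is a minimal sketch, not your exact code: the `WriteByte` loop merely stands in for a serializer appending bytes, and the final capacity value is an implementation detail rather than a documented contract:

```csharp
using System;
using System.IO;

class CapacityVsLength
{
    static void Main()
    {
        var ms = new MemoryStream();

        // Stand-in for a serializer appending bytes one at a time.
        for (int i = 0; i < 600; i++)
            ms.WriteByte(0);

        // Position and Length report the bytes actually written;
        // Capacity reports the size of the internal buffer, which grows in jumps.
        Console.WriteLine(ms.Position); // 600
        Console.WriteLine(ms.Length);   // 600
        Console.WriteLine(ms.Capacity); // 1024 on the standard implementation
    }
}
```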

asked Oct 07 '15 by sujith karivelil

People also ask

What is a memory stream?

MemoryStream encapsulates data stored as an unsigned byte array. The encapsulated data is directly accessible in memory. Memory streams can reduce the need for temporary buffers and files in an application. The current position of a stream is the position at which the next read or write operation takes place.

What is the difference between memory stream and stream?

You would use the FileStream to read/write a file but a MemoryStream to read/write in-memory data, such as a byte array decoded from a string.
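As a small sketch of that difference (the string content here is arbitrary):

```csharp
using System;
using System.IO;
using System.Text;

class InMemoryVsFile
{
    static void Main()
    {
        // MemoryStream: the backing store is a byte array in memory --
        // here, bytes decoded from a string.
        byte[] data = Encoding.UTF8.GetBytes("hello stream");
        using (var ms = new MemoryStream(data))
        using (var reader = new StreamReader(ms))
        {
            Console.WriteLine(reader.ReadToEnd()); // hello stream
        }

        // A FileStream exposes the same Stream API, but backed by a file:
        // using (var fs = new FileStream("data.bin", FileMode.Open)) { ... }
    }
}
```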

What is memory stream buffer?

MemoryStream is a buffer for the whole stream - it isn't chained to another one. You can ask it to write itself to another stream at any time, but that's not the same thing.

What is byte stream in C#?

Byte streams comprise classes that treat data in the stream as bytes. These streams are most useful when you work with data that is not in a human-readable format. In the CLR, the Stream class provides the base for the other byte stream classes.


2 Answers

This is caused by the internal implementation of MemoryStream. The Capacity property is the size of the internal buffer. This makes sense if the MemoryStream is created with a fixed-size buffer, but in your case the MemoryStream can grow, and the actual implementation doubles the size of the buffer whenever it becomes too small.

Code of MemoryStream

private bool EnsureCapacity(int value)
{
    if (value < 0)
    {
        throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
    }
    if (value > this._capacity)
    {
        int num = value;
        if (num < 256)
        {
            num = 256;
        }
        if (num < this._capacity * 2)
        {
            num = this._capacity * 2;
        }
        if (this._capacity * 2 > 2147483591)
        {
            num = ((value > 2147483591) ? value : 2147483591);
        }
        this.Capacity = num;
        return true;
    }
    return false;
}

And somewhere in Write

int num = this._position + count;
// snip
if (num > this._capacity && this.EnsureCapacity(num))
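Given that decompiled logic, the doubling is easy to observe from the outside. A minimal sketch (the exact capacity values are implementation details, not a documented contract):

```csharp
using System;
using System.IO;

class GrowthTrace
{
    static void Main()
    {
        var ms = new MemoryStream();
        int lastCapacity = ms.Capacity;

        // Write one byte at a time and report every buffer regrowth.
        for (int i = 1; i <= 2000; i++)
        {
            ms.WriteByte(0);
            if (ms.Capacity != lastCapacity)
            {
                Console.WriteLine($"after byte {i}: Capacity = {ms.Capacity}");
                lastCapacity = ms.Capacity;
            }
        }
        // Typically reports the capacities 256, 512, 1024, 2048.
    }
}
```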
answered by Manuel Amstutz

Both memory streams and lists have a capacity because the underlying data structure is really an array, and arrays cannot be resized dynamically.

So to start with you use an array with a small(ish) size, but once you add enough data that the array is no longer big enough, you need to create a new array, copy all the data from the old array to the new one, and switch to using the new array from then on.

This create+copy takes time; the bigger the array, the longer it takes. As such, if you resized the array to be just big enough every time, you would effectively do this on every write to the memory stream or every element added to the list.

Instead you have a capacity, saying "you can use up to this value before having to resize" to reduce the number of create+copy cycles you have to perform.

For instance, if you wrote one byte at a time to this array and did not have this capacity concept, every extra byte would mean one full create+copy cycle over the entire array. Instead, with the last screenshot in your question, you could write one byte at a time 520 more times before the next create+copy cycle has to be performed.

So this is a performance optimization.

An additional bonus is that repeatedly allocating slightly larger memory blocks will eventually fragment memory, putting you at risk of "out of memory" exceptions; reducing the number of such allocations also helps stave off this scenario.

A typical method to calculate this capacity is by just doubling it every time.
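If the final size is roughly known up front, the resizes can be avoided entirely by passing an initial capacity to the constructor. A small sketch (the sizes here are arbitrary):

```csharp
using System;
using System.IO;

class PresetCapacity
{
    static void Main()
    {
        // Size the internal array once; no copy-and-grow cycles occur
        // as long as writes stay within the preset capacity.
        var ms = new MemoryStream(4096);
        ms.Write(new byte[3000], 0, 3000);

        Console.WriteLine(ms.Capacity); // 4096 -- unchanged
        Console.WriteLine(ms.Length);   // 3000 bytes of actual data
    }
}
```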

answered by Lasse V. Karlsen