Why is the max size of byte[] 2 GB - 57 B?



You need 56 bytes of overhead. The maximum size is actually 2,147,483,648 - 1 - 56 = 2,147,483,591 bytes, which is why your minus 57 works and minus 56 does not.
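
For illustration only, here is a minimal sketch of my own (not part of the original answer) that probes that boundary. It assumes a 64-bit CLR with roughly 2 GB of memory to spare; the figures are the same ones quoted below.

using System;

class ByteArrayLimit
{
  static void Main()
  {
    // 2,147,483,591 elements = 2 GB - 57 B: the largest byte[] the CLR will give you.
    byte[] largest = new byte[2147483591];
    Console.WriteLine("Allocated {0:N0} elements.", largest.Length);

    try
    {
      // One element more pushes the object past the per-object limit.
      byte[] tooBig = new byte[2147483592];
      Console.WriteLine("Unexpectedly allocated {0:N0} elements.", tooBig.Length);
    }
    catch (OutOfMemoryException)
    {
      Console.WriteLine("2,147,483,592 elements is one too many.");
    }
  }
}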

As Jon Skeet says here:

However, in practical terms, I don't believe any implementation supports such huge arrays. The CLR has a per-object limit a bit short of 2GB, so even a byte array can't actually have 2147483648 elements. A bit of experimentation shows that on my box, the largest array you can create is new byte[2147483591]. (That's on the 64 bit .NET CLR; the version of Mono I've got installed chokes on that.)

See also this InformIT article on the same subject. It provides the following code to demonstrate the maximum sizes and overhead:

using System;
using System.Runtime.InteropServices;

class Program
{
  static void Main(string[] args)
  {
    AllocateMaxSize<byte>();
    AllocateMaxSize<short>();
    AllocateMaxSize<int>();
    AllocateMaxSize<long>();
    AllocateMaxSize<object>();
  }

  const long twogigLimit = ((long)2 * 1024 * 1024 * 1024) - 1;
  static void AllocateMaxSize<T>()
  {
    int twogig = (int)twogigLimit;
    int num;
    Type tt = typeof(T);
    // Value types: divide by the marshalled element size; reference types: by the pointer size.
    if (tt.IsValueType)
    {
      num = twogig / Marshal.SizeOf(typeof(T));
    }
    else
    {
      num = twogig / IntPtr.Size;
    }

    T[] buff;
    bool success = false;
    // Shrink from the theoretical maximum until an allocation succeeds.
    do
    {
      try
      {
        buff = new T[num];
        success = true;
      }
      catch (OutOfMemoryException)
      {
        --num;
      }
    } while (!success);
    Console.WriteLine("Maximum size of {0}[] is {1:N0} items.", typeof(T).ToString(), num);
  }
}

Finally, the article has this to say:

If you do the math, you’ll see that the overhead for allocating an array is 56 bytes. There are some bytes left over at the end due to object sizes. For example, 268,435,448 64-bit numbers occupy 2,147,483,584 bytes. Adding the 56 byte array overhead gives you 2,147,483,640, leaving you 7 bytes short of 2 gigabytes.
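
To make that arithmetic concrete, here is a tiny sketch of my own (not from the article) that redoes the calculation against the same 2 GB - 1 limit used in the code above:

using System;

class OverheadMath
{
  static void Main()
  {
    const long twogigLimit = ((long)2 * 1024 * 1024 * 1024) - 1; // 2,147,483,647
    const long count = 268435448;            // number of 64-bit values
    long data = count * sizeof(long);        // 2,147,483,584 bytes of elements
    long total = data + 56;                  // plus the 56-byte array overhead
    Console.WriteLine("{0:N0}", total);      // 2,147,483,640
    Console.WriteLine(twogigLimit - total);  // 7 bytes to spare
  }
}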

Edit:

But wait, there's more!

While looking around and talking with Jon Skeet, I was pointed to an article he wrote, Of memory and strings. In that article he provides a table of sizes:

Type            x86 size            x64 size
object          12                  24
object[]        16 + length * 4     32 + length * 8
int[]           12 + length * 4     28 + length * 4
byte[]          12 + length         24 + length
string          14 + length * 2     26 + length * 2

Mr. Skeet goes on to say:

You might be forgiven for looking at the numbers above and thinking that the "overhead" of an object is 12 bytes in x86 and 24 in x64... but that's not quite right.

and this:

  • There's a "base" overhead of 8 bytes per object in x86 and 16 per object in x64... given that we can store an Int32 of "real" data in x86 and still have an object size of 12, and likewise we can store two Int32s of real data in x64 and still have an object of 24.

  • There's a "minimum" size of 12 bytes and 24 bytes respectively. In other words, you can't have a type which is just the overhead. Note how the "Empty" class takes up the same size as creating instances of Object... there's effectively some spare room, because the CLR doesn't like operating on an object with no data. (Note that a struct with no fields takes up space too, even for local variables.)

  • The x86 objects are padded to 4 byte boundaries; on x64 it's 8 bytes (just as before)

and finally, Jon Skeet responded to a question I asked him elsewhere, where he states (in response to the InformIT article I showed him):

It looks like the article you're referring to is inferring the overhead just from the limit, which is silly IMO.

So to answer your question, the actual overhead is 24 bytes, with 32 bytes of spare room, from what I gather.
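
If you would rather measure the overhead on your own machine than infer it from the limit, a rough sketch (my code, in the spirit of Mr. Skeet's measurements, not taken from his article) is to compare GC.GetTotalMemory before and after an allocation:

using System;

class MeasureOverhead
{
  static void Main()
  {
    // Rough measurement only: other allocations and GC behaviour can skew it slightly.
    long before = GC.GetTotalMemory(true);
    byte[] array = new byte[1000000];
    long after = GC.GetTotalMemory(true);

    Console.WriteLine("Approximately {0:N0} bytes for {1:N0} elements.",
                      after - before, array.Length);
    GC.KeepAlive(array); // keep the array alive until after the second measurement
  }
}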


One thing that is certain is that you cannot have an odd number of bytes: allocations are usually rounded up to a multiple of the native word size, which is 8 bytes in a 64-bit process. So you could be adding another 7 bytes to the array.
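
A quick way to picture that rounding (my illustration, assuming the 8-byte alignment and the "24 + length" x64 layout from the table above; the exact layout is a CLR implementation detail):

using System;

class Alignment
{
  // Round a raw object size up to the next multiple of 8 (the x64 word size).
  static long RoundUpTo8(long size)
  {
    return (size + 7) & ~7L;
  }

  static void Main()
  {
    Console.WriteLine(RoundUpTo8(24 + 1)); // byte[1] on x64: 25 -> 32 bytes
    Console.WriteLine(RoundUpTo8(24 + 8)); // byte[8] on x64: 32 -> 32 bytes
  }
}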


You can actually find this limit explicitly set and verified in the .NET source code, which provides some insight into why it was chosen (to allow an efficient implementation of advanced range check elimination in the future, and for backward compatibility in the case of byte arrays):

https://github.com/dotnet/runtime/blob/596ee7cc7fef74d40223bccacdee3e1e7f21bbef/src/coreclr/vm/gchelpers.cpp

inline SIZE_T MaxArrayLength(SIZE_T componentSize)
{
    // Impose limits on maximum array length in each dimension to allow efficient
    // implementation of advanced range check elimination in future. We have to allow
    // higher limit for array of bytes (or one byte structs) for backward compatibility.
    // Keep in sync with Array.MaxArrayLength in BCL.
    return (componentSize == 1) ? 0X7FFFFFC7 : 0X7FEFFFFF;
}
...

SIZE_T componentSize = pArrayMT->GetComponentSize();
if ((SIZE_T)cElements > MaxArrayLength(componentSize))
    ThrowOutOfMemoryDimensionsExceeded();
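
On the managed side, newer runtimes expose the same number directly: Array.MaxLength (available since .NET 6) returns 0x7FFFFFC7, i.e. 2,147,483,591, matching the byte-array limit in the CLR code above. A quick check, assuming you are on .NET 6 or later:

using System;

class MaxLengthCheck
{
  static void Main()
  {
    // Array.MaxLength reports the largest allowed array length (element count).
    Console.WriteLine("Array.MaxLength = {0:N0} (0x{0:X})", Array.MaxLength);
    // Prints 2,147,483,591 (0x7FFFFFC7) - i.e. 2 GB - 57 bytes for a byte[].
  }
}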