Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ allocating array non contiguously

Let's consider this C++ code as a rough example.

int *A = new int [5];
int *B = new int [5];
int *C = new int [5];
delete []A;
delete []C;
int *D = new int [10];

Obviously any machine can handle this case without any problems with buffer overflow or memory leak. However let's imagine that the lengths are multiplied by one million or even a bigger number. As far as I know addresses (at least virtual addresses) of all array elements are consecutive. So whenever I create an array I can be sure that they are contiguous chunks in virtual memory and I can perform pointer arithmetic to access n-th element if I have pointer to the first one. My question is illustrated in the following image( registers representing end of array are ignored for the sake of simplicity). heap array allocation

After allocating A, B, C in the heap we free A and C and get two free memory chunks of length 5 (marked with green dots). What happens when I want to allocate an array of length 10? I think that there are 3 possible cases.

  • I will get bad_alloc exception for not having contiguous 10 length memory chunk.

  • The program will automatically reallocate array B to the beginning of the heap and join together the rest of the unused memory.

  • The array D will be split into 2 parts and stored not contiguously causing not constant access time for n-th element of the array (if there are much more than 2 splits it starts to resemble a linked list rather than an array).

    Which one of these is most possible answer or is there another possible case I didn't take into account?

like image 346
samvel1024 Avatar asked Oct 30 '15 18:10

samvel1024


People also ask

Are arrays always stored contiguously in memory?

A) Array means contiguous memory. It can exist in any memory section be it Data or Stack or Heap.

Are dynamically allocated arrays contiguous?

Yes, it will be allocated contiguously.

Are arrays contiguous in C?

Array bucket values are stored in contiguous memory locations (thus pointer arithmetic can be used to iterate over the bucket values), and 2D arrays are allocated in row-major order (i.e. the memory layout is all the values in row 0 first, followed by the values in row1, followed by values in row 2 ...).

What is the difference between contiguous and non contiguous arrays?

1. Contiguous memory allocation allocates consecutive blocks of memory to a file/process. Non-Contiguous memory allocation allocates separate blocks of memory to a file/process. 2.


2 Answers

I will get bad_alloc exception for not having contiguous 10 length memory chunk.

This can happen.

The program will automatically reallocate array B to the beginning of the heap and join together the rest of the unused memory.

This cannot happen. Moving an object to a different address is not possible in C++ because it would invalidate existing pointers.

The array D will be split into 2 parts and stored not contiguously causing not constant access time for n-th element of the array (if there are much more than 2 splits it starts to resemble a linked list rather than an array).

This also cannot happen. In C++ array elements are stored contiguously, so that pointer arithmetic is possible.

But there are in fact more possibilities. To understand them, we must account for the fact that the memory can be virtual. This, among other things, means that available address space may be larger than the amount of physically present memory. A chunk of physical memory can be assigned any address from the available address space.

As an example, consider a machine with 8GB (2^33 bytes) of memory running a 64-bit os on a 64-bit CPU. Addresses allocated to the program do not gave all be less that 8GB; it can receive a megabyte chunk of memory at address 0x00000000ffff0000 and another megabyte chunk at address 0x0000ffffffff0000. The total amount of memory allocated to the program cannot be more than 2^33 bytes, but each chunk can be located anywhere in the 2^64 space. (In reality this is a bit more complicated but similar enough to what I describe).

In your picture, you have 15 little squares that represent chunks of memory. Let's say it's physical memory. Virtual memory is 15,000 little squares, of which you can use any 15 at any given time.

So, considering this fact, the following scenarios are also possible.

  • A chunk of virtual address space is given to the program that is not backed by real physical memory. When and if the program attempts to access this space, the OS will try to allocate physical memory and map it to the corresponding address so that the program can continue. If this attempt fails, the program may be killed by the OS. The newly-free memory is now available to other programs that may want it.
  • The two short chunks of memory are mapped to new virtual addresses such that they form one long contiguous chunk in the virtual memory space. Remember that typically there are many more virtual memory addresses than there is physical memory, and it is normally easy to find an unassigned range. Typically this scenario is only realized when the memory chunks in question are large.
like image 69
n. 1.8e9-where's-my-share m. Avatar answered Oct 24 '22 14:10

n. 1.8e9-where's-my-share m.


The problem that you are asking about is called heap-fragmentation, and it's a real, hard problem.

I will get bad_alloc exception for not having contiguous 10 length memory chunk.

This is the theory. But such a situation is really only possible within a 32-bit process; the 64-bit address space is vast.

That is, with a 64-bit process, it is more likely that heap fragmentation stops your new implementation from reusing some memory, which leads to an out-of-memory condition since it needs to ask the kernel for new memory for the entire D array instead of half of it. Also, such an OOM-condition will more likely cause your process to get shot by the OOM-killer sometime when you try to access a location in D, rather than new throwing an exception, because the kernel won't realize that it has overcommitted its memory before it's too late. For more information, google "memory overcommitment".

The program will automatically reallocate array B to the beginning of the heap and join together the rest of the unused memory.

No, it can't. You are in C++, and your runtime does not know where you have possibly stored pointers to B, so it would either run the danger of missing a pointer that needs to be modified, or run the danger of modifying something that's not a pointer to B but happens to have the same bit pattern.

The array D will be split into 2 parts and stored not contiguously causing not constant access time for n-th element of the array (if there are much more than 2 splits it starts to resemble a linked list rather than an array).

This is also not possible because C++ guarantees contiguous storage of arrays (to allow array accesses to be implemented via pointer arithmetic).

like image 45
cmaster - reinstate monica Avatar answered Oct 24 '22 12:10

cmaster - reinstate monica