Why is memory allocation on heap MUCH slower than on stack?

I have been told this many times, but I don't know why. What extra cost is involved when allocating memory from the heap? Is it hardware related? Is it a matter of CPU cycles? I have heard many guesses but no exact answers. Could someone elaborate?

Just as "unwind" said, the heap is a more complicated data structure than the stack. In my understanding, some memory space is allocated to a thread as its stack when it starts to run, while the heap is shared by all the threads within a process. This paradigm requires some extra mechanism to manage each thread's usage of the shared heap, such as garbage collection. Am I right about this?

asked Feb 15 '10 by smwikipedia



2 Answers

Because the heap is a far more complicated data structure than the stack.

For many architectures, allocating memory on the stack is just a matter of adjusting the stack pointer, i.e. it's a single instruction. Allocating memory on the heap involves searching for a big enough block, splitting it, and managing the bookkeeping that allows blocks to be released with free() in a different order than they were allocated.
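The contrast can be sketched in C. This is a minimal illustration, not code from the answer; the function names are my own:

```c
#include <stdlib.h>
#include <string.h>

/* Stack allocation: the compiler reserves space in the current stack
 * frame by adjusting the stack pointer -- effectively free.  The
 * buffer disappears automatically when the function returns. */
int sum_on_stack(const int *values, int n)
{
    int scratch[64];                  /* one stack-pointer adjustment */
    if (n > 64)
        return -1;
    memcpy(scratch, values, n * sizeof scratch[0]);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += scratch[i];
    return total;
}                                     /* scratch is freed implicitly here */

/* Heap allocation: malloc must consult its bookkeeping structures to
 * find a large-enough block, and the caller must free() it explicitly,
 * possibly in a completely different order than it was allocated. */
int sum_on_heap(const int *values, int n)
{
    int *scratch = malloc(n * sizeof *scratch);  /* may fail at runtime */
    if (scratch == NULL)
        return -1;
    memcpy(scratch, values, n * sizeof *scratch);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += scratch[i];
    free(scratch);                    /* explicit bookkeeping update */
    return total;
}
```

Both functions compute the same result; the difference is entirely in how the scratch buffer's lifetime is managed.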

Memory allocated on the stack is guaranteed to be deallocated when the scope (typically the function) exits, and it's not possible to deallocate just some of it.

answered Oct 31 '22 by unwind


In your edit where you restate unwind's answer, you mention the "heap data structure". Be careful: the data structure known as a heap has no relationship to dynamic memory allocation. To be very clear, I'll use the more language-lawyerly term free store.

As has already been pointed out, stack allocation merely requires adjusting a pointer, which has a dedicated register on most architectures, and deallocation requires the same amount of work. Stack allocations are also scoped to a particular function. This makes them much better candidates for compiler optimizations, such as precomputing the total space needed on the stack and reserving it with a single adjustment for an entire stack frame.

Likewise, the stack has better guaranteed data locality: the top of the stack is almost always inside a cache line, and, as mentioned already, the stack pointer is typically stored in a register. Optimizing compilers on some architectures can even eliminate stack allocations altogether by reusing storage from previous stack frames for arguments passed to functions in deeper stack frames. Likewise, stack-allocated variables can often be promoted to registers, avoiding allocations as well.
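As a small illustration of why stack allocation is a compile-time affair: the compiler knows the size of every local up front, so a typical implementation reserves the whole frame with a single stack-pointer adjustment, however many locals there are. The function below is hypothetical, and whether its locals end up in registers is of course compiler- and architecture-dependent:

```c
#include <stdint.h>

/* Three locals, one frame: a typical compiler computes the total frame
 * size at compile time and reserves it with one stack-pointer
 * adjustment on entry (e.g. a single `sub rsp, N` on x86-64).  Small
 * scalars like these are usually promoted straight to registers and
 * never touch memory at all -- there is no per-variable allocation
 * cost, unlike a sequence of malloc() calls. */
uint32_t mix(uint32_t x)
{
    uint32_t a = x * 2654435761u;   /* Knuth's multiplicative constant */
    uint32_t b = a ^ (a >> 16);
    uint32_t c = b + 0x9e3779b9u;
    return c;
}
```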

In contrast, the free store is much more complex. I'm not even going to begin to cover garbage collection systems as that's a whole different topic, and this question was asked about the C language. Typically allocations and deallocations from a free store involve several different data structures like a free list or block pool. These data structures and bookkeeping require memory as well, and thus that space is wasted. Furthermore, the bookkeeping records are often intermixed with the allocations and thus hurt the data locality of other allocations. Allocations from the free store may involve asking the underlying operating system for more process memory typically from some form of slab allocator.
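To make "several different data structures like a free list" concrete, here is a toy fixed-size-block allocator. It is a deliberately simplified sketch of my own, not taken from any real allocator: real free stores additionally handle variable sizes, splitting and coalescing, thread caches, and requests to the operating system.

```c
#include <stddef.h>

/* Toy fixed-size-block allocator: a sketch of the bookkeeping a free
 * store must do.  Each free block's first bytes store a pointer to the
 * next free block, forming a singly linked free list. */

#define BLOCK_SIZE  64
#define POOL_BLOCKS 128

static unsigned char pool[POOL_BLOCKS][BLOCK_SIZE];
static void *free_list = NULL;     /* head of the free list */
static int pool_ready = 0;

static void pool_init(void)
{
    /* Thread every block onto the free list. */
    for (int i = 0; i < POOL_BLOCKS; i++) {
        *(void **)pool[i] = free_list;
        free_list = pool[i];
    }
    pool_ready = 1;
}

void *pool_alloc(void)
{
    if (!pool_ready)
        pool_init();
    if (free_list == NULL)
        return NULL;               /* pool exhausted */
    void *block = free_list;       /* pop the head of the free list */
    free_list = *(void **)block;
    return block;
}

void pool_free(void *block)
{
    *(void **)block = free_list;   /* push back onto the free list */
    free_list = block;
}
```

Even this toy needs a pool, a list head, and pointer threading through free blocks, versus the single pointer adjustment of a stack allocation; a production allocator layers far more machinery on top.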

For a simple comparison, using jemalloc-2.2.5 and numbers from sloccount as a reference: the jemalloc implementation contains over 8,800 lines of C source code and over 700 more lines of test code. This should give you a good idea of the difference in complexity between free store allocation and stack allocation: thousands of lines of C code versus a single instruction.

Additionally, since free store allocations are not limited to a single lexical scope, the lifetime of every allocation must be tracked. Likewise, these allocations may be passed across threads, and thus thread synchronization issues enter the problem space. Another big problem for free store allocation is fragmentation. Fragmentation causes many problems:

  • Fragmentation hurts data locality.
  • Fragmentation wastes memory.
  • Fragmentation makes the job of finding free space for large allocations harder.
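The third point can be simulated in a few lines. This toy first-fit allocator of my own tracks eight abstract slots rather than real memory, but it shows the essence of external fragmentation: plenty of free space in total, yet no contiguous run large enough to satisfy a bigger request.

```c
#include <string.h>

/* Toy first-fit allocator over 8 slots, to make fragmentation
 * concrete.  Purely illustrative: slot bookkeeping only. */
#define SLOTS 8
static unsigned char used[SLOTS];

/* Find n contiguous free slots (first fit) and mark them used.
 * Returns the start index, or -1 if no run is long enough. */
int alloc_slots(int n)
{
    for (int start = 0; start + n <= SLOTS; start++) {
        int ok = 1;
        for (int i = 0; i < n; i++)
            if (used[start + i]) { ok = 0; break; }
        if (ok) {
            memset(used + start, 1, n);
            return start;
        }
    }
    return -1;
}

void free_slots(int start, int n)
{
    memset(used + start, 0, n);
}

int count_free(void)
{
    int n = 0;
    for (int i = 0; i < SLOTS; i++)
        if (!used[i])
            n++;
    return n;
}
```

Fill all eight slots with one-slot allocations, free every other one, and half the pool is free again, yet a two-slot request still fails because no two free slots are adjacent. Real allocators fight exactly this effect with coalescing and size-class segregation.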

On modern systems, stacks are often relatively small compared to the free store, so ultimately the free store is managing more space and thus tackling a harder problem. Also, due to the limits on stack sizes, the free store is typically used for larger allocations; this discrepancy between having to handle both very large and very small allocations makes the job of the free store harder as well. Typically, stack allocations are small, on the order of a few kilobytes or less, and the total size of the stack is only a few megabytes. The free store is generally given the entire rest of the process space in a program. On modern machines this can be several hundred gigabytes, and it's not uncommon for free store allocations to vary in size from a few bytes, like a short string of characters, to megabytes or even gigabytes of arbitrary data. This means that free store allocators have to deal with the underlying operating system's virtual memory management. Stack allocation, by contrast, is essentially built into the computer hardware.

If you want to really learn about free store allocation, I would highly recommend reading some of the many papers and articles published about various malloc implementations, or even reading the code. Here are a few links to get you started:

  • dlmalloc - Doug Lea's malloc, a historical reference malloc implementation used in GNU C++ at one point in time
  • phkmalloc - FreeBSD implementation of malloc, written by Poul-Henning Kamp, author of the Varnish web cache
  • tcmalloc - Thread-Caching Malloc implemented by some Google developers
  • jemalloc - Jason Evans's malloc implementation for FreeBSD (successor of phkmalloc)

Here are some additional links with descriptions of the tcmalloc implementation:

  • http://jamesgolick.com/2013/5/15/memory-allocators-101.html
  • http://jamesgolick.com/2013/5/19/how-tcmalloc-works.html

answered Oct 31 '22 by b4hand