How much of a bottleneck is memory allocation/deallocation in typical real-world programs? Answers from any type of program where performance typically matters are welcome. Are decent implementations of malloc/free/garbage collection fast enough that it's only a bottleneck in a few corner cases, or would most performance-critical software benefit significantly from trying to keep the amount of memory allocations down or having a faster malloc/free/garbage collection implementation?
Note: I'm not talking about real-time stuff here. By performance-critical, I mean stuff where throughput matters, but latency doesn't necessarily.
Edit: Although I mention malloc, this question is not intended to be C/C++ specific.
Deallocation of memory by the Operating System (OS) is a way to free the Random Access Memory (RAM) of finished processes and allocate new ones. We all know that the computer memory comes with a specific size.
Only deallocate memory when you are truly finished with that memory. If you have more than one pointer to a chunk of memory, then deallocating that chunk makes all pointers to it stale; they are pointing to memory that you are no longer allowed to use. Those pointers are called dangling pointers.
Objects allocated with new stay on the heap until they are deallocated with delete . If you don't delete them, they will remain there (but won't be accessible -> this is called memory leak) until your process exits, when the operating system will deallocate them.
When a class object is composed of other class objects, the destructor of the container class object is invoked first. After this the destructors of the component objects are invoked in the opposite order in which they were constructed.
It's significant, especially as fragmentation grows and the allocator has to hunt harder across larger heaps for the contiguous regions you request. Most performance-sensitive applications typically write their own fixed-size block allocators (eg, they ask the OS for memory 16MB at a time and then parcel it out in fixed blocks of 4kb, 16kb, etc) to avoid this issue.
In games I've seen calls to malloc()/free() consume as much as 15% of the CPU (in poorly written products), or with carefully written and optimized block allocators, as little as 5%. Given that a game has to have a consistent throughput of sixty hertz, having it stall for 500ms while a garbage collector runs occasionally isn't practical.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With