Does Malloc only use the heap if requested memory space is large?

Whenever you study the memory allocation of processes you usually see it outlined like this:

[Figure: process memory layout diagram]

So far so good.

But then you have the sbrk() system call which allows the program to change the upper limit of its data section, and it can also be used to simply check where that limit is with sbrk(0). Using that function I found the following patterns:

Pattern 1 - Small malloc

I run the following program on my Linux machine:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int globalVar;

int main(){
        int localVar;
        int *ptr;

        printf("localVar address (i.e., stack) = %p\n",(void *)&localVar);
        printf("globalVar address (i.e., data section) = %p\n",(void *)&globalVar);
        printf("Limit of data section = %p\n",sbrk(0));

        ptr = malloc(sizeof(int)*1000);

        printf("ptr address (should be on stack)= %p\n",(void *)&ptr);
        printf("ptr points to: %p\n",(void *)ptr);
        printf("Limit of data section after malloc= %p\n",sbrk(0));

        return 0;
}

And the output is the following:

localVar address (i.e., stack) = 0xbfe34058
globalVar address (i.e., data section) = 0x804a024
Limit of data section = 0x91d9000
ptr address (should be on stack)= 0xbfe3405c
ptr points to: 0x91d9008
Limit of data section after malloc= 0x91fa000

As you can see the allocated memory region was right above the old data section limit, and after the malloc that limit was pushed upward, so the allocated region is actually inside the new data section.

Question 1: Does this mean that small mallocs will allocate memory in the data section and not use the heap at all?

Pattern 2 - Big Malloc

If you increase the requested memory size on line 15:

ptr = malloc(sizeof(int)*100000);

you will now get the following output:

localVar address (i.e., stack) = 0xbf93ba68
globalVar address (i.e., data section) = 0x804a024
Limit of data section = 0x8b16000
ptr address (should be on stack)= 0xbf93ba6c
ptr points to: 0xb750b008
Limit of data section after malloc= 0x8b16000

As you can see here the limit of the data section has not changed, and instead the allocated memory region is in the middle of the gap section, between the data section and the stack.

Question 2: Is the large malloc actually using the heap?

Question 3: Any explanation for this behavior? I find it a bit insecure, because in the first example (small malloc), even after you free the allocated memory, you will still be able to use the pointer and access that memory without getting a segfault, as it will be inside your data section. This could lead to hard-to-detect bugs.

Update with Specs: Ubuntu 12.04, 32-bits, gcc version 4.6.3, Linux kernel 3.2.0-54-generic-pae.

Update 2: Rodrigo's answer below solved this mystery. This Wikipedia link also helped.

asked Oct 10 '13 19:10

Daniel Scocco

1 Answer

First of all, the only way to be absolutely sure of what happens is to read the source code of malloc. Or even better, step through it with the debugger.

But anyway, here is my understanding of these things:

1. The system call sbrk() is used to increase the size of the data section. You usually do not call it directly; the implementation of malloc() calls it to increase the memory available for the heap.
2. The function malloc() does not go to the OS for every allocation. It splits the data section into pieces and hands these pieces to whoever needs them. You use free() to mark a piece as unused and available for reassignment.
3. Point 2 is an oversimplification. At least in the glibc implementation (the C library normally used with GCC on Linux), malloc() allocates big blocks using mmap() with private, anonymous (non-file-backed) mappings. These blocks are therefore outside the data segment, and calling free() on such a block calls munmap().

What exactly counts as a big block depends on many details. See man mallopt for the gory details; in glibc the relevant parameter is M_MMAP_THRESHOLD, which defaults to 128 KiB.

From that, you can guess what happens when you access freed memory:

If the block was small, the memory will still be mapped, so a read will not crash. If you write to it, you may corrupt the internal heap structures, or the block may have been reused and you will corrupt whatever structure now lives there.
If the block was big, the memory has been unmapped, so any access will result in a segmentation fault, unless, improbably, another big block has been allocated in the interim (or another thread has called mmap()) and the same address range happens to be reused.

Clarification

The term data section is used with two different meanings, depending on the context.

The .data section of the executable (linker point of view). It may also include .bss or even .rodata. For the OS that means nothing; it just maps pieces of the program into memory with little regard for what they contain other than the flags (read-only, executable...).
The heap, that block of memory that every process has, that is not read from the executable, and that can be grown using sbrk().

You can see that with the following command that prints the memory layout of a simple program (cat):

$ cat /proc/self/maps
08048000-08053000 r-xp 00000000 00:0f 1821106    /usr/bin/cat
08053000-08054000 r--p 0000a000 00:0f 1821106    /usr/bin/cat
08054000-08055000 rw-p 0000b000 00:0f 1821106    /usr/bin/cat
09152000-09173000 rw-p 00000000 00:00 0          [heap]
b73df000-b75a5000 r--p 00000000 00:0f 2241249    /usr/lib/locale/locale-archive
b75a5000-b75a6000 rw-p 00000000 00:00 0 
b75a6000-b774f000 r-xp 00000000 00:0f 2240939    /usr/lib/libc-2.18.so
b774f000-b7750000 ---p 001a9000 00:0f 2240939    /usr/lib/libc-2.18.so
b7750000-b7752000 r--p 001a9000 00:0f 2240939    /usr/lib/libc-2.18.so
b7752000-b7753000 rw-p 001ab000 00:0f 2240939    /usr/lib/libc-2.18.so
b7753000-b7756000 rw-p 00000000 00:00 0 
b7781000-b7782000 rw-p 00000000 00:00 0 
b7782000-b7783000 r-xp 00000000 00:00 0          [vdso]
b7783000-b77a3000 r-xp 00000000 00:0f 2240927    /usr/lib/ld-2.18.so
b77a3000-b77a4000 r--p 0001f000 00:0f 2240927    /usr/lib/ld-2.18.so
b77a4000-b77a5000 rw-p 00020000 00:0f 2240927    /usr/lib/ld-2.18.so
bfba0000-bfbc1000 rw-p 00000000 00:00 0          [stack]

The first line is the executable code (.text section).

The second line is the read-only data (.rodata section) and some other read-only sections.

The third line is the .data + .bss and some other writable sections.

The fourth line is the heap!

The next lines, those with a name, are memory-mapped files or shared objects. Those without a name are probably big malloc'ed blocks of memory (or other private anonymous mmap() regions; the two are impossible to distinguish here).

The last line is the stack!

answered Oct 07 '22 12:10

rodrigo


Tags: c, memory-management, heap-memory, pointers, malloc
