
What improvements does GCC's `__builtin_malloc()` provide over plain `malloc()`?

Tags: c, gcc

I have recently been made aware of GCC's built-in functions for some of the C library's memory management functions, specifically __builtin_malloc() and related built-ins (see https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html). Upon learning about __builtin_malloc(), I was wondering how it might work to provide performance improvements over the plain malloc() related library routines.

For example, if the function succeeds, it has to provide a block that can be freed by a call to plain free(), since the pointer might be freed by a module that was compiled without __builtin_malloc() or __builtin_free() enabled (or am I wrong about this, and if __builtin_malloc() is used, the builtins must be used globally?). Therefore the allocated object has to be something that can be managed with the data structures that plain malloc() and free() deal with.

I can't find any details of how __builtin_malloc() works or what it does exactly (I'm not a compiler dev, so spelunking through GCC source code isn't in my wheelhouse). In some simple tests where I've tried calling __builtin_malloc() directly, it simply ends up being emitted in the object code as a call to plain malloc(). However, there might be subtlety or platform detail that I'm not providing in these simple tests.

What kinds of performance improvements can __builtin_malloc() provide over a call to plain malloc()? Does __builtin_malloc() depend on the rather complex data structures that glibc's malloc() implementation uses? Or conversely, does glibc's malloc()/free() have some code to deal with blocks that might be allocated by __builtin_malloc()?

Basically, how does it work?

Michael Burr asked Sep 24 '14 05:09





1 Answer

I believe there is no special GCC-internal implementation of __builtin_malloc(). Rather, it exists as a builtin only so it can be optimized away under certain circumstances.
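A minimal sketch of what that means for the mixed-module worry in the question, assuming a compiler that provides the __builtin_malloc spelling (GCC or Clang). The helper names dup_with_builtin and release_with_plain_free are hypothetical and exist only for illustration. Because the builtin has the same meaning as the library function, a block obtained through it can be handed to plain free():

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates through the builtin spelling.
 * __builtin_malloc is a compiler extension, not standard C. */
char *dup_with_builtin(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = __builtin_malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper standing in for a module that only knows
 * about the ordinary library free(). */
void release_with_plain_free(char *p)
{
    free(p);
}
```

Calling dup_with_builtin("hello") and later passing the result to release_with_plain_free() behaves like an ordinary malloc()/free() pair, which matches the observation in the question that the builtin is emitted as a call to plain malloc().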

Take this example:

    #include <stdlib.h>

    int main(void) {
        int *p = malloc(4);
        *p = 7;
        free(p);
        return 0;
    }

If we disable builtins (with -fno-builtin) and look at the generated output:

    $ gcc -fno-builtin -O1 -Wall -Wextra builtin_malloc.c && objdump -d -Mintel a.out

    0000000000400580 <main>:
      400580:   48 83 ec 08             sub    rsp,0x8
      400584:   bf 04 00 00 00          mov    edi,0x4
      400589:   e8 f2 fe ff ff          call   400480 <malloc@plt>
      40058e:   c7 00 07 00 00 00       mov    DWORD PTR [rax],0x7
      400594:   48 89 c7                mov    rdi,rax
      400597:   e8 b4 fe ff ff          call   400450 <free@plt>
      40059c:   b8 00 00 00 00          mov    eax,0x0
      4005a1:   48 83 c4 08             add    rsp,0x8
      4005a5:   c3                      ret

Calls to malloc/free are emitted, as expected.

However, by allowing malloc to be a builtin,

    $ gcc -O1 -Wall -Wextra builtin_malloc.c && objdump -d -Mintel a.out

    00000000004004f0 <main>:
      4004f0:   b8 00 00 00 00          mov    eax,0x0
      4004f5:   c3                      ret

All of main() was optimized away!

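Essentially, by allowing malloc to be a builtin, GCC is free to eliminate the calls when the allocated block is never observable outside the function (here it is only written to and then freed), because the compiler knows that malloc and free have no other side effects it needs to preserve.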
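As a rough sketch of where that freedom ends, compare the two functions below. The names local_only and escaping are made up for this illustration, and whether a given GCC version actually removes the calls in the first one depends on the version and optimization level, so treat the comments as expectations rather than guarantees:

```c
#include <stdlib.h>

/* The block is written to and freed without ever leaving the function,
 * so an optimizing compiler that treats malloc/free as builtins may
 * drop both calls entirely. */
int local_only(void)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = 7;
    free(p);
    return 0;
}

/* The pointer is returned to the caller, so the allocation is
 * observable and the call to malloc has to stay. */
int *escaping(void)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = 7;
    return p;
}
```

Compiling this with and without -fno-builtin and comparing the disassembly, as done above for main(), is one way to check what a particular compiler does with each function.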


It's the same mechanism that allows "wasteful" calls to printf to be changed to calls to puts:

    #include <stdio.h>

    int main(void) {
        printf("hello\n");
        return 0;
    }

Builtins disabled:

    $ gcc -fno-builtin -O1 -Wall builtin_printf.c && objdump -d -Mintel a.out

    0000000000400530 <main>:
      400530:   48 83 ec 08             sub    rsp,0x8
      400534:   bf e0 05 40 00          mov    edi,0x4005e0
      400539:   b8 00 00 00 00          mov    eax,0x0
      40053e:   e8 cd fe ff ff          call   400410 <printf@plt>
      400543:   b8 00 00 00 00          mov    eax,0x0
      400548:   48 83 c4 08             add    rsp,0x8
      40054c:   c3                      ret

Builtins enabled:

    $ gcc -O1 -Wall builtin_printf.c && objdump -d -Mintel a.out

    0000000000400530 <main>:
      400530:   48 83 ec 08             sub    rsp,0x8
      400534:   bf e0 05 40 00          mov    edi,0x4005e0
      400539:   e8 d2 fe ff ff          call   400410 <puts@plt>
      40053e:   b8 00 00 00 00          mov    eax,0x0
      400543:   48 83 c4 08             add    rsp,0x8
      400547:   c3                      ret
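For a slightly broader picture of the printf case, here is a small sketch with two hypothetical functions, greet and greet_counted. The comments describe substitutions I would expect from GCC with builtins enabled, not something guaranteed for every version or target. The second function shows a situation where the substitution should not happen, because puts and printf return different values:

```c
#include <stdio.h>

/* Return values are ignored here, so both calls are candidates
 * for being rewritten as calls to puts(). */
void greet(const char *name)
{
    printf("hello\n");      /* constant string ending in a newline */
    printf("%s\n", name);   /* "%s\n" format with one string argument */
}

/* The return value of printf (number of characters written) is used,
 * so the call is expected to stay a real printf call. */
int greet_counted(void)
{
    return printf("hello\n");
}
```

As with the malloc example, comparing the objdump output with and without -fno-builtin shows which calls were actually replaced.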
Jonathon Reinhart answered Sep 24 '22 22:09
