
How can I optimize GCC compilation for memory usage?

Tags:

c

gcc

I am developing a library which should use as little memory as possible (I am not concerned about anything else, like the binary size, or speed optimizations).

Are there any GCC flags (or any other GCC-related options) I can use? Should I avoid some level of -O* optimization?

syntagma asked Dec 09 '22 03:12


2 Answers

Your library (or any code in idiomatic C) has several kinds of memory usage:

  • binary code size, and indeed -Os should optimize that
  • heap memory, using C dynamic allocation, that is malloc; you obviously should know how, and how much, heap memory is allocated (and later freed). The actual memory consumption depends on your particular malloc implementation (e.g. with many implementations, a call to malloc(25) could in fact consume 32 bytes), not on the compiler. BTW, you might design your library to use memory pools, or even implement your own allocator (on top of OS syscalls like mmap, or on top of malloc, etc.)
  • local variables, that is the call frames on the call stack. This mostly depends on your code (but an optimizing compiler, e.g. gcc with -Os or -O2, would probably use more registers and perhaps slightly less stack). You could pass -fstack-usage to gcc to have it report the size of every call frame, and -Wstack-usage=len to be warned when a call frame exceeds len bytes.
  • global or static variables. You should know how much memory they need (and you might use nm or some other binutils program to query them). BTW, carefully declaring some variables inside a function as static would lower the stack consumption (but you cannot do that for every variable or every function; for instance, a function that keeps its state in static locals is no longer reentrant).
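To make the stack versus static trade-off from the list above concrete, here is a minimal sketch. The function names and the 1 KiB scratch size are invented for illustration. Compiling it with something like gcc -Os -fstack-usage -Wstack-usage=256 -c should produce a .su file with one line per function; the reported sizes depend on the target and the GCC version, so none are quoted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_SIZE 1024   /* hypothetical scratch size, for illustration */

/* Adds up the first `len` bytes of `buf`. */
static unsigned sum_bytes(const unsigned char *buf, size_t len)
{
    unsigned total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}

/* The scratch buffer lives in the call frame, so each call needs
   roughly SCRATCH_SIZE bytes of stack (unless the optimizer removes it). */
unsigned checksum_on_stack(const unsigned char *data, size_t len)
{
    unsigned char scratch[SCRATCH_SIZE];
    assert(data != NULL);
    if (len > SCRATCH_SIZE)
        len = SCRATCH_SIZE;          /* only the first SCRATCH_SIZE bytes are used */
    memcpy(scratch, data, len);
    return sum_bytes(scratch, len);
}

/* The scratch buffer is static: it is counted with the global data
   instead of the stack, but the function is no longer reentrant
   or thread-safe. */
unsigned checksum_static(const unsigned char *data, size_t len)
{
    static unsigned char scratch[SCRATCH_SIZE];
    assert(data != NULL);
    if (len > SCRATCH_SIZE)
        len = SCRATCH_SIZE;
    memcpy(scratch, data, len);
    return sum_bytes(scratch, len);
}
```

Both functions return the same value for the same input; the difference should only show up in the .su report and in the size of the data section (visible with nm).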

Notice also that in some limited cases GCC performs tail-call optimization, and then the stack usage is lowered (since the stack frame of the caller is reused by the callee). (See also this old question.)
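A small sketch of what a call in tail position looks like (the function names are made up for this example). Whether GCC really turns the call into a jump depends on the optimization level and the target; sibling-call optimization is normally on at -O2 and -Os, and you can check the generated assembly or the -fstack-usage report to be sure.

```c
/* Tail-recursive sum of 1..n with an accumulator: the recursive call is
   the last thing the function does, so the compiler may reuse the
   current frame instead of pushing a new one. */
unsigned long sum_to(unsigned long n, unsigned long acc)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n);   /* call in tail position */
}

/* As written, this is not a tail call: the addition happens after the
   recursive call returns, so each level needs its own frame unless the
   optimizer rewrites the function. */
unsigned long sum_to_naive(unsigned long n)
{
    if (n == 0)
        return 0;
    return n + sum_to_naive(n - 1);
}
```

Calling sum_to(n, 0) gives the same result as sum_to_naive(n); only the potential stack depth differs.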

You might also ask the compiler to pack some particular structs (beware, this could slow down performance significantly). You'll want to use type attributes like __attribute__((packed)), and perhaps also some variable attributes.
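As an illustration of what packing buys, the following sketch compares a naturally laid out struct with a packed one. The struct and function names are hypothetical, and __attribute__((packed)) is a GCC/Clang extension, not standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Natural layout: the compiler typically pads after `tag` and `flag`
   so that `value` is aligned. */
struct plain_rec {
    char tag;
    int  value;
    char flag;
};

/* Packed layout: no padding between members, so `value` may be
   misaligned and slower (or, on some targets, unsafe) to access
   through a plain int pointer. */
struct packed_rec {
    char tag;
    int  value;
    char flag;
} __attribute__((packed));

static_assert(sizeof(struct packed_rec) == 2 * sizeof(char) + sizeof(int),
              "packed struct should contain no padding");

/* Bytes saved per record by packing (the exact figure depends on the ABI). */
size_t packed_savings(void)
{
    return sizeof(struct plain_rec) - sizeof(struct packed_rec);
}
```

The saving only matters when many such records are stored, for example in a large array; for a handful of objects the slower access is rarely worth it.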

Perhaps you should read more about Garbage Collection, since GC techniques, concepts, and terminology might be relevant. See this answer.

If you are on Linux, the valgrind tool should be useful too (and, during the debugging phase, the -fsanitize=address option of recent GCC).

You might perhaps also use some code generation options like -fstack-reuse=, -fshort-enums, -fpack-struct, -fstack-limit-symbol= or -fsplit-stack. Be very careful: some of these options make your binary code incompatible with your existing C (and other!) libraries, so you might need to recompile every library you use, including your libc, with the same code generation flags.

You probably should enable link-time optimizations by compiling and linking with -flto (in addition to other optimization flags like -Os).

You certainly should use a recent version of GCC. Notice that GCC 5.1 was released a few days ago (in April 2015).

If your library is large enough to be worth the effort, you might even consider customizing your GCC compiler with MELT (to help you find out how to spend less memory). This might take weeks or months of work.

Basile Starynkevitch answered Dec 11 '22 09:12


There are advantages to using 'stack frames', but doing so uses more stack space, because the stack frame pointer has to be saved.

You can tell the compiler not to use stack frames (with GCC, the relevant option is -fomit-frame-pointer). This will (generally) slightly increase the code size but will reduce the amount of stack used.

You can use only char and short for values, rather than int.
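A rough sketch of where narrower types pay off, with invented struct names and assuming each value fits the narrower range. The gain shows up mainly in structs and arrays; a lone local variable is usually promoted to int or kept in a register anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every field is an int, although the values are assumed to be small. */
struct sample_wide {
    int channel;   /* assumed range: 0..255 */
    int gain;      /* assumed range: 0..255 */
    int level;     /* assumed range: -32768..32767 */
};

/* Same information with the narrowest types that cover those ranges. */
struct sample_narrow {
    uint8_t channel;
    uint8_t gain;
    int16_t level;
};

static_assert(sizeof(struct sample_narrow) < sizeof(struct sample_wide),
              "narrow fields should give a smaller record");

/* Bytes saved when storing `count` records in the narrow form. */
size_t bytes_saved(size_t count)
{
    return count * (sizeof(struct sample_wide) - sizeof(struct sample_narrow));
}
```

Ordering the fields from largest to smallest alignment also helps to avoid padding between them.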

It is poor programming practice, but you can reuse variables and arrays for multiple purposes.

If some set of variables are mutually exclusive in usage, you can place them in a union.
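For instance, a tagged union along these lines keeps only one of the mutually exclusive values at a time. All type and function names here are hypothetical, and the `kind` field is there so that readers of the union know which member is valid.

```c
#include <assert.h>

enum msg_kind { MSG_READING, MSG_LABEL, MSG_COUNTER };

/* All three payloads stored side by side, although only one is ever used. */
struct msg_separate {
    enum msg_kind kind;
    double reading;
    char   label[16];
    long   counter;
};

/* The payloads share storage: the union is as large as its largest member. */
struct msg {
    enum msg_kind kind;
    union {
        double reading;
        char   label[16];
        long   counter;
    } u;
};

static_assert(sizeof(struct msg) < sizeof(struct msg_separate),
              "the union variant should be smaller");

/* Builds a counter message. */
struct msg make_counter(long value)
{
    struct msg m;
    m.kind = MSG_COUNTER;
    m.u.counter = value;
    return m;
}

/* Reads the counter, or returns 0 if the message holds something else. */
long counter_or_zero(const struct msg *m)
{
    return m->kind == MSG_COUNTER ? m->u.counter : 0;
}
```

Checking the tag before every access is what keeps this from turning into the hard-to-debug kind of variable reuse.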

If the function parameter lists are all very short, you can force the compiler to pass all the parameters in registers. (Having an architecture with lots of general-purpose registers really helps here.)

Use only one malloc that covers ALL the areas needed for malloc-like operations, so as to minimize the allocation overhead.
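One way to read this is as a simple bump (arena) allocator: a single malloc up front, then sub-allocations carved out of that block. The sketch below is only an illustration with invented names (struct arena, arena_init, arena_alloc, arena_destroy); it cannot free individual pieces, only the whole arena at once.

```c
#include <stddef.h>
#include <stdlib.h>

/* One big block obtained with a single malloc. */
struct arena {
    unsigned char *base;   /* start of the block, or NULL */
    size_t size;           /* total bytes in the block */
    size_t used;           /* bytes handed out so far */
};

/* Returns 0 on success, -1 if the single malloc fails. */
int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = a->base ? size : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hands out `n` bytes aligned for any object type, or NULL when the
   arena is exhausted. There is no per-allocation header. */
void *arena_alloc(struct arena *a, size_t n)
{
    const size_t align = _Alignof(max_align_t);   /* a power of two */
    size_t start = (a->used + align - 1) & ~(align - 1);

    if (start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Releases everything at once. */
void arena_destroy(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

A typical use would be arena_init(&a, total) at library start-up, arena_alloc(&a, n) wherever malloc(n) would have been called, and arena_destroy(&a) at shutdown. The alignment rounding still wastes a few bytes per allocation, so a library with known object sizes might tune it further.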

There are many techniques. Most make the code much more difficult to debug and maintain, and often much harder for humans to read.

user3629249 answered Dec 11 '22 10:12