Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Dynamic memory allocation in STD

Working a lot with microcontrollers and C++ it is important for me to know that I do not perform dynamic memory allocations. However I would like to get the most out of the STD lib. What would be the best strategy to determine if a function/class from STD uses dynamic memory allocation?

So far I come up with these options:

  1. Read and understand the STD code. This is of course possible but lets be honest, it is not the easiest code to read and there is a lot of it.
  2. A variation on reading the code could be to have a script search for memory allocation and highlight those parts to it make it easier to read. This still would require figuring out where functions allocating memory are used, and so forts.
  3. Just testing what I would like to use and watch the memory with the debugger. So far I have been using this method but this is a reactive approach. I would like to know before hand when designing code what I can use from STD. Also what is there to say that there are some (edge) cases where memory is allocated. Those might not show up in this limited test.
  4. Finally what could be done is regularly scan the generated assembler code for memory allocations. I suspect this could be scripted and included in the toolchain but again this is a reactive method.

If you see any other options or have experience doing something similar, please let me know.

p.s. I work mainly with ARM Cortex-Mx chips at this moment compiling with GCC.

like image 355
Bart Avatar asked Aug 02 '21 11:08

Bart


People also ask

Does STD Vector use dynamic memory?

As mentioned above, std::vector is a templated class that represents dynamic arrays. std::vector typically allocates memory on the heap (unless you override this behavior with your own allocator).

What is dynamic memory allocation explain in detail?

Allocation of memory at the time of execution (run time) is known as dynamic memory allocation. The functions calloc() and malloc() support allocating of dynamic memory. Dynamic allocation of memory space is done by using these functions when value is returned by functions and assigned to pointer variables.

Is std :: string dynamically allocated?

Inside every std::string is a dynamically allocated array of char .

Does std :: function allocate memory?

std::function can store objects of arbitrary size, this means it must perform dynamic memory allocation in some cases. there are certain types for which std::function is guaranteed not to throw exceptions. This implies that there are certain types it must store without dynamic memory allocation.


2 Answers

You have some very good suggestions in the comments, but no actual answers, so I will attempt an answer.

In essence you are implying some difference between C and C++ that does not really exist. How do you know that stdlib functions don't allocate memory?

Some STL functions are allowed to allocate memory and they are supposed to use allocators. For example, vectors take an template parameter for an alternative allocator (for example pool allocators are common). There is even a standard function for discovering if a type uses memory

But... some types like std::function sometimes use memory allocation and sometimes do not, depending on the size of the parameter types, so your paranoia is not entirely unjustified.

C++ allocates via new/delete. New/Delete allocate via malloc/free.

So the real question is, can you override malloc/free? The answer is yes, see this answer https://stackoverflow.com/a/12173140/440558. This way you can track all allocations, and catch your error at run-time, which is not bad.

You can go better, if you are really hardcore. You can edit the standard "runtime C library" to rename malloc/free to something else. This is possible with "objcopy" which is part of the gcc tool chain. After renaming the malloc/free, to say ma11oc/fr33, any call to allocate/free memory will no longer link. Link your executable with "-nostdlib" and "-nodefaultlibs" options to gcc, and instead link your own set of libs, which you generated with objcopy.

To be honest, I've only seen this done successfully once, and by a programmer you did not trust objcopy, so he just manually found the labels "malloc" "free" using a binary editor, and changed them. It definitely works though.

Edit: As pointed out by Fureeish (see comments), it is not guaranteed by the C++ standard that new/delete use the C allocator functions. It is however, a very common implementation, and your question does specifically mention GCC. In 30 years of development, I have never seen a C++ program that runs two heaps (one for C, and one for C++) just because the standard allows for it. There would simply be no advantage in it. That doesn't preclude the possibility that there may be an advantage in the future though.
Just to be clear, my answer assumes new USES malloc to allocate memory. This doesn't mean you can assume that every new call calls malloc though, as there may be caching involved, and the operator new may be overloaded to use anything at all at the global level. See here for GCC/C++ allocator schemes.

https://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html

Yet another edit:
If you want to get technical - it depends on the version of libstdc++ you are using. You can find operator new in new_op.cc, in the (what I assume is the official) source repository

(I will stop now)

like image 121
Tiger4Hire Avatar answered Oct 20 '22 15:10

Tiger4Hire


The options you listed are pretty comprehensive, I think I would just add some practical color to a couple of them.

Option 1: if you have the source code for the specific standard library implementation you're using, you can "simplify" the process of reading it by generating a static call graph and reading that instead. In fact the llvm opt tool can do this for you, as demonstrated in this question. If you were to do this, in theory you could just look at a given method and see if goes to an allocation function of any kind. No source code reading required, purely visual.

Option 4: scripting this is easier than you think. Prerequisites: make sure you're building with -ffunction-sections, which allows the linker to completely discard functions which are never called. When you generate a release build, you can simply use nm and grep on the ELF file to see if for example malloc appears in the binary at all.

For example I have a bare metal cortex-M based embedded system which I know for a fact has no dynamic memory allocation, but links against a common standard library implementation. On the debug build I can do the following:

$ nm Debug/Project.axf | grep malloc
700172bc T malloc
$

Here malloc is found because dead code has not been stripped.

On the release build it looks like this:

$ nm Release/Project.axf | grep malloc
$

grep here will return "0" if a match was found and something other than "0" if it wasn't, so if you were to use this in a script it would be something like:

nm Debug/Project.axf | grep malloc > /dev/null
if [ "$?" == "0" ]; then
    echo "error: something called malloc"
    exit 1
fi

There's a mountain of disclaimers and caveats that come with any of these approaches. Keep in mind that embedded systems in particular use a wide variety of different standard library implementations, and each implementation is free to do pretty much whatever it wants with regard to memory management.

In fact they don't even have to call malloc and free, they could implement their own dynamic allocators. Granted this is somewhat unlikely, but it is possible, and thus grepping for malloc isn't actually sufficient unless you know for a fact that all memory management in your standard library implementation goes through malloc and free.

If you're serious about avoiding all forms of dynamic memory allocation, the only sure way I know of (and have used myself) is simply to remove the heap entirely. On most bare metal embedded systems I've worked with, the heap start address, end address, and size are almost always provided a symbols in the linker script. You should remove or rename these symbols. If anything is using the heap, you'll get a linker error, which is what you want.

To give a very concrete example, newlib is a very common libc implementation for embedded systems. Its malloc implementation requires that the common sbrk() function be present in the system. For bare metal systems, sbrk() is just implemented by incrementing a pointer that starts at the end symbol provided by the linker script.

If you were using newlib, and you didn't want to mess with the linker script, you could still replace sbrk() with a function that simply hard faults so you catch any attempt to allocate memory immediately. This in my opinion would still be much better than trying to stare at heap pointers on a running system.

Of course your actual system may be different, and you may have a different libc implementation that you're using. This question can really only answered to any reasonable satisfaction in the exact context of your system, so you'll probably have to do some of your own homework. Chances are it's pretty similar to what I've described here.

One of the great things about bare metal embedded systems is the amount of flexibility that they provide. Unfortunately this also means there are so many variables that it's almost impossible to answer questions directly unless you know all of the details, which we don't here. Hopefully this will give you a better starting point than staring at a debugger window.

like image 33
Jon Reeves Avatar answered Oct 20 '22 16:10

Jon Reeves