
How to log mallocs

This is a bit hypothetical and grossly simplified but...

Assume a program that will be calling functions written by third parties. These parties can be assumed to be non-hostile but can't be assumed to be "competent". Each function will take some arguments, have side effects and return a value. They have no state while they are not running.

The objective is to ensure they can't cause memory leaks by logging all mallocs (and the like) and then freeing everything after the function exits.

Is this possible? Is this practical?

P.S. The important part to me is ensuring that no allocations persist, so ways of removing memory leaks that don't guarantee this are not useful to me.

BCS asked Dec 06 '22 07:12


2 Answers

You don't specify the operating system or environment; this answer assumes Linux, glibc, and C.

You can set __malloc_hook, __free_hook, and __realloc_hook to point to functions which will be called from malloc(), free(), and realloc() respectively. There is a __malloc_hook manpage showing the prototypes. You can track allocations in these hooks, then return to let glibc handle the memory allocation/deallocation. Be aware that these hooks are not thread-safe, are deprecated, and were removed from the glibc API in version 2.34, so this only works with older glibc.

It sounds like you want to free any live allocations when the third-party function returns. There are ways to have gcc automatically insert calls at every function entrance and exit using -finstrument-functions, but I think that would be inelegant for what you are trying to do. Can you have your own code call a function in your memory-tracking library after calling one of these third-party functions? You could then check if there are any allocations which the third-party function did not already free.
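As a rough illustration of that last suggestion, here is a minimal sketch of a tracking layer with a cleanup call the host runs after each third-party function. All names here (tracked_malloc, tracked_free, tracked_release_all, call_and_reclaim, leaky_plugin) are hypothetical, and the sketch assumes the third-party code's allocations are already being routed to these functions somehow (hooks, interposition, or relinking); it does not install any hooks itself and is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One record per live allocation made through the tracking layer. */
typedef struct track_rec {
    void *ptr;
    size_t size;
    struct track_rec *next;
} track_rec;

static track_rec *live_list = NULL;   /* blocks not yet freed */

/* Allocate a block and remember it. Returns NULL on failure. */
void *tracked_malloc(size_t size) {
    track_rec *rec = malloc(sizeof *rec);
    if (rec == NULL) return NULL;
    rec->ptr = malloc(size);
    if (rec->ptr == NULL) { free(rec); return NULL; }
    rec->size = size;
    rec->next = live_list;
    live_list = rec;
    return rec->ptr;
}

/* Free a block and forget it. Pointers we never handed out are ignored. */
void tracked_free(void *ptr) {
    for (track_rec **link = &live_list; *link != NULL; link = &(*link)->next) {
        if ((*link)->ptr == ptr) {
            track_rec *rec = *link;
            *link = rec->next;
            free(rec->ptr);
            free(rec);
            return;
        }
    }
}

/* Free everything still outstanding; returns how many blocks had leaked. */
size_t tracked_release_all(void) {
    size_t leaked = 0;
    while (live_list != NULL) {
        track_rec *rec = live_list;
        live_list = rec->next;
        free(rec->ptr);
        free(rec);
        leaked++;
    }
    assert(live_list == NULL);
    return leaked;
}

/* Hypothetical third-party function that forgets to free its scratch space. */
int leaky_plugin(int x) {
    int *scratch = tracked_malloc(sizeof *scratch);
    if (scratch == NULL) return -1;
    *scratch = x * 2;
    return *scratch;
}

/* Host-side wrapper: call the function, then reclaim whatever it left behind. */
int call_and_reclaim(int (*fn)(int), int arg, size_t *leaked) {
    int result = fn(arg);
    *leaked = tracked_release_all();
    return result;
}
```

The linear search in tracked_free keeps the sketch short; with many live blocks a hash table keyed on the pointer would be the more realistic choice.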

DGentry answered Jan 17 '23 08:01


First, you have to provide the entrypoints for malloc() and free() and friends. Because this code is already compiled (right?), you can't depend on #define to redirect the calls.

Then you can implement these in the obvious way and, by linking a separate copy of the routines into each module, log which module each allocation came from.

The fastest way involves no logging at all. If the amount of memory they use is bounded, why not pre-allocate all the "heap" they'll ever need and write an allocator out of that? Then when it's done, free the entire "heap" and you're done! You could extend this idea to multiple heaps if it's more complex than that.
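A minimal sketch of that pre-allocated "heap" idea, assuming a simple bump allocator is good enough (individual frees are no-ops and everything is reclaimed at once). The arena type and the arena_* function names are made up for illustration, and C11 is assumed for max_align_t and _Alignof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A fixed-size region handed out front to back. */
typedef struct {
    unsigned char *base;   /* start of the pre-allocated region */
    size_t capacity;       /* total bytes available */
    size_t used;           /* bytes handed out so far */
} arena;

/* Reserve the whole "heap" up front. Returns 1 on success, 0 on failure. */
int arena_init(arena *a, size_t capacity) {
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Bump-allocate size bytes, aligned for any object type.
   Returns NULL when the arena is exhausted. */
void *arena_alloc(arena *a, size_t size) {
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;
    if (start > a->capacity || size > a->capacity - start) return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Discard every allocation at once, e.g. after the third-party call returns. */
void arena_reset(arena *a) {
    a->used = 0;
}

/* Give the region back to the system allocator. */
void arena_destroy(arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

The host would call arena_reset after each third-party function returns, which is constant time regardless of how many blocks the function allocated.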

If you really do need to "log" and not make your own allocator, here are some ideas. One, use a hash table with pointers and internal chaining. Another would be to allocate extra space in front of every block and put your own structure there containing, say, an index into your "log table," then keep a free-list of log table entries (as a stack so getting a free one or putting a free one back is O(1)). This takes more memory but should be fast.
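The second idea might look roughly like the sketch below. The names (block_header, log_table, logged_malloc, logged_free, logged_release_all) and the fixed LOG_CAPACITY bound are assumptions made for the example, and the code is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LOG_CAPACITY 1024   /* assumed upper bound on live blocks */

/* Header placed in front of each user block. The union pads it so the
   user pointer keeps the alignment malloc guarantees. */
typedef union {
    size_t slot;            /* index into log_table */
    max_align_t pad;
} block_header;

static block_header *log_table[LOG_CAPACITY];  /* NULL means slot unused */
static size_t free_slots[LOG_CAPACITY];        /* stack of unused slot indices */
static size_t free_top = 0;
static int log_ready = 0;

/* Fill the free-slot stack so that slot 0 is popped first. */
static void log_init(void) {
    for (size_t i = 0; i < LOG_CAPACITY; i++)
        free_slots[i] = LOG_CAPACITY - 1 - i;
    free_top = LOG_CAPACITY;
    log_ready = 1;
}

/* Allocate size bytes plus a header and record the block in the log. */
void *logged_malloc(size_t size) {
    if (!log_ready) log_init();
    if (free_top == 0 || size > SIZE_MAX - sizeof(block_header)) return NULL;
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL) return NULL;
    h->slot = free_slots[--free_top];    /* O(1) pop */
    log_table[h->slot] = h;
    return h + 1;                        /* user data starts after the header */
}

/* Remove the block from the log and release it. */
void logged_free(void *ptr) {
    if (ptr == NULL) return;
    block_header *h = (block_header *)ptr - 1;
    assert(h->slot < LOG_CAPACITY && log_table[h->slot] == h);
    log_table[h->slot] = NULL;
    free_slots[free_top++] = h->slot;    /* O(1) push */
    free(h);
}

/* Free every block still in the log; returns how many had leaked. */
size_t logged_release_all(void) {
    size_t leaked = 0;
    for (size_t i = 0; i < LOG_CAPACITY; i++) {
        if (log_table[i] != NULL) {
            logged_free(log_table[i] + 1);
            leaked++;
        }
    }
    return leaked;
}
```

Allocation and free are both O(1) here; only the final sweep in logged_release_all walks the whole table, and it runs once per third-party call.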

Is it practical? I think it is, so long as the speed hit is acceptable.

Jason Cohen answered Jan 17 '23 07:01