
How to control memory allocation strategy in third party library code?

Previous header: "Must I replace global operators new and delete to change memory allocation strategy in third party code?"

Short story: We need to replace the memory allocation technique used by a third-party library without changing its source code.

Long story:

Consider a memory-bound application that makes huge dynamic allocations (perhaps almost all available system memory). We use specialized allocators, and we use them everywhere (shared_ptrs, containers, etc.). We have total control and power over every single byte of memory allocated in our application.

We also need to link against a third-party helper library. That nasty guy makes allocations in some standard way, using the default operators new, new[], delete and delete[], or malloc, or something else non-standard (let's generalize and say that we don't know how this library manages its heap allocations).

If this helper library makes allocations that are big enough, we can get HDD thrashing, memory fragmentation and alignment issues, out-of-memory bad_allocs and all sorts of other problems.

We cannot (or do not want to) change the library's source code.

First attempt:

We have never had such unholy "hacks" in release builds before. A first test that overrides operator new works fine, except that:

  • we do not know what gotchas await us in the future (and this is awful)
  • our users (and even our allocators) now have to allocate the same way that we do

Questions:

  1. Are there ways to hook these allocations without overloading global operators? (local lib-only hooks?)
  2. ...and if we don't know what exactly it uses: malloc or new?
  3. Is this list of signatures complete? (and there are no other things that we must implement):

    void* operator new (std::size_t size) throw (std::bad_alloc);
    void* operator new (std::size_t size, const std::nothrow_t& nothrow_value) throw();
    void* operator new (std::size_t size, void* ptr) throw();
    void* operator new[] (std::size_t size) throw (std::bad_alloc);
    void* operator new[] (std::size_t size, const std::nothrow_t& nothrow_value) throw();
    void* operator new[] (std::size_t size, void* ptr) throw();

    void operator delete (void* ptr) throw();
    void operator delete (void* ptr, const std::nothrow_t& nothrow_constant) throw();
    void operator delete (void* ptr, void* voidptr2) throw();
    void operator delete[] (void* ptr) throw();
    void operator delete[] (void* ptr, const std::nothrow_t& nothrow_constant) throw();
    void operator delete[] (void* ptr, void* voidptr2) throw();
  4. Does anything change if that library is dynamic?

Edit #1

A cross-platform solution is preferable if possible (it looks like it is not). If not, our major platforms are:

  • Windows x86/x64 (msvc 10)
  • Linux x86/x64 (gcc 4.6)

Edit #2

Almost 2 years have passed and a few OS and compiler versions have come out, so I am curious whether there is something new and unexplored in this area. Any standard proposals? OS-specifics? Hacks? How do you write memory-thirsty applications today? Please share your experience.

Ivan Aksamentov - Drop asked May 04 '13 19:05


2 Answers

Ugh, my sympathy. This is going to depend a lot on your compiler, your libc, etc. Some rubber-meets-road strategies that have "worked" to varying degrees for us in the past (/me braces for downvotes) are:

  • The operator new / operator delete overloads you suggested -- although note that some compilers are picky about not having throw() specs, some really want them, some want them for new but not for delete, etc (I have a giant platform-specific #if/#elif block for all of the 4+ platforms we're working on now).
  • Also worth noting: you can generally ignore the placement versions, they don't allocate.
  • Look at __malloc_hook and friends -- note that these are deprecated and have thread race conditions -- but they're nice in that new/delete tend to be implemented in terms of malloc (but not always).
  • Providing a replacement malloc, calloc, realloc, and free and getting your linker args in the right order so that the overrides take place (this is what gcc recommends these days, although I've had situations where it was impossible to do, and I had to use deprecated __malloc_hook) -- again, new and delete tend to be implemented in terms of these, but not always.
  • Avoiding all the standard allocation methods (operator new, malloc, etc) in "our code" and using custom functions instead -- not very easy with an existing codebase.
  • Tracking down the library author and delivering a polite request or patch to change their library to allow you to specify a different allocator (it may be faster than doing this yourself) -- I think this has led to a cardinal rule of "client always specifies the allocator or does the allocation" with any libraries I write.
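
As a rough illustration of the first bullet, here is a minimal sketch of replacing the global operators, assuming a C++11 or later compiler (so noexcept instead of the throw() specs from the question). The my_alloc namespace is a hypothetical stand-in for your own allocator; here it just forwards to malloc and counts live blocks. The placement forms taking a void* are left out because they do not allocate and may not be replaced. C++17 also adds aligned overloads taking std::align_val_t, which this sketch does not cover.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for "our" allocator: forwards to malloc and counts live blocks.
namespace my_alloc {
    std::atomic<std::size_t> live_blocks{0};

    void* allocate(std::size_t size) noexcept {
        void* p = std::malloc(size ? size : 1);  // operator new must not return null for size 0
        if (p) ++live_blocks;
        return p;
    }

    void deallocate(void* p) noexcept {
        if (!p) return;                          // deleting a null pointer is a no-op
        assert(live_blocks.load() > 0);
        --live_blocks;
        std::free(p);
    }
}

// Throwing forms.
void* operator new(std::size_t size) {
    if (void* p = my_alloc::allocate(size)) return p;
    throw std::bad_alloc();
}
void* operator new[](std::size_t size) {
    if (void* p = my_alloc::allocate(size)) return p;
    throw std::bad_alloc();
}

// Non-throwing forms.
void* operator new(std::size_t size, const std::nothrow_t&) noexcept {
    return my_alloc::allocate(size);
}
void* operator new[](std::size_t size, const std::nothrow_t&) noexcept {
    return my_alloc::allocate(size);
}

// Matching deallocation functions, including the sized forms used since C++14.
void operator delete(void* p) noexcept { my_alloc::deallocate(p); }
void operator delete[](void* p) noexcept { my_alloc::deallocate(p); }
void operator delete(void* p, const std::nothrow_t&) noexcept { my_alloc::deallocate(p); }
void operator delete[](void* p, const std::nothrow_t&) noexcept { my_alloc::deallocate(p); }
void operator delete(void* p, std::size_t) noexcept { my_alloc::deallocate(p); }
void operator delete[](void* p, std::size_t) noexcept { my_alloc::deallocate(p); }
```

This only catches code that goes through operator new and operator delete. A library that calls malloc directly, or that ships its own operators, is not affected.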

Please note that this is not an answer in terms of what the standards say should happen, just my experience. I've worked with more than a few buggy/broken compilers and libc implementations in the past, so YMMV. I also have the luxury of working on fairly "sealed systems", and not being all that worried about portability for any specific application.

Regarding dynamic libraries: I'm currently in a bit of a pinch in this regard myself; our "app" gets loaded as a dynamic .so and we have to be pretty careful to pass any delete/free requests back to the default allocator if they didn't come from us. The current solution is to just cordon off our allocations to a specific area: if we get a delete/free from within that address range, we dispatch to our handler, otherwise back to the default... I've even toyed with (horrors) the idea of checking the caller address to see if it's in our address space. (The probability of going boom increases with such hacks, though.)

This may be a useful strategy even if you are the process lead and you're using an outside library: tag or restrict or otherwise identify your own allocs somehow (even going so far as to keep a list of allocs you know about), and then pass on any unknowns. All of this has ugly side-effects and limitations, though.
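
The address-range idea can be sketched roughly as follows. Everything here is hypothetical and simplified: Arena is a toy fixed-size bump allocator, and our_alloc / our_free are made-up names for the dispatching entry points. The only point is the owns() check that decides whether a pointer goes to our handler or back to the default free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Toy bump arena: all of "our" allocations live inside one known address range.
class Arena {
public:
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        if (size > sizeof(buffer_)) return nullptr;
        const std::size_t rounded = (size + align - 1) / align * align;
        if (rounded > sizeof(buffer_) - used_) return nullptr;  // arena exhausted
        void* p = buffer_ + used_;
        used_ += rounded;
        return p;
    }

    // True if p points into the arena's address range.
    bool owns(const void* p) const {
        const auto addr  = reinterpret_cast<std::uintptr_t>(p);
        const auto begin = reinterpret_cast<std::uintptr_t>(buffer_);
        return addr >= begin && addr < begin + sizeof(buffer_);
    }

    // A bump arena cannot reclaim single blocks, so this only counts releases.
    void deallocate(void* p) {
        assert(owns(p));
        ++released_;
    }

    std::size_t released() const { return released_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[4096];
    std::size_t used_ = 0;
    std::size_t released_ = 0;
};

Arena g_arena;

// Allocate from our arena, falling back to the default allocator when it is full.
void* our_alloc(std::size_t size) {
    if (void* p = g_arena.allocate(size)) return p;
    return std::malloc(size);
}

// Dispatch on the address: ours goes to the arena, anything else to the default free.
void our_free(void* p) {
    if (!p) return;
    if (g_arena.owns(p)) {
        g_arena.deallocate(p);
    } else {
        std::free(p);
    }
}
```

The same check works in the other direction: a replaced free or operator delete can call owns() first and forward unknown pointers to the original implementation.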

(Looking forward to other answers!)

leander answered Sep 19 '22 06:09


Without being able to modify the library's source code - or, better, being able to influence the author of the library to modify it - I'd say you're out of luck.

There are some things the library can potentially do (even unintentionally) that make it immune to any strategy you might employ or, in the worst case, make the library or your program unstable when you try. Examples include using its own custom allocators, providing its own versions of global operator new() and operator delete(), and overriding those operators in individual classes.

A strategy which would probably work is to work with the library vendor and make some modifications. The modifications (from your end) would amount to being able to initialise the library by specifying allocators it uses. For the library the effort is potentially significant (having to touch all functions that dynamically allocate memory, that use standard containers, etc) but not intractable - use the supplied allocators (or sensible defaults) throughout their code.
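
To make that concrete, here is a minimal sketch of what such an initialisation interface could look like. The helperlib namespace, AllocHooks, init, make_buffer and release_buffer are all invented for illustration and do not come from any real library. The client-side Stats and counting_* functions are equally hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace helperlib {  // hypothetical library

// Allocation callbacks supplied by the client, plus an opaque user pointer.
struct AllocHooks {
    void* (*allocate)(std::size_t size, void* user);
    void  (*deallocate)(void* ptr, void* user);
    void* user;
};

namespace detail {
    void* default_allocate(std::size_t size, void*) { return std::malloc(size); }
    void  default_deallocate(void* ptr, void*) { std::free(ptr); }

    // Sensible defaults, used until the client calls init().
    AllocHooks hooks = { default_allocate, default_deallocate, nullptr };
}

// Call once, before any other library function, to install client allocators.
void init(const AllocHooks* custom) {
    if (custom) {
        assert(custom->allocate && custom->deallocate);
        detail::hooks = *custom;
    }
}

// Every allocation inside the library goes through the installed hooks.
char* make_buffer(std::size_t size) {
    return static_cast<char*>(detail::hooks.allocate(size, detail::hooks.user));
}

void release_buffer(char* buffer) {
    detail::hooks.deallocate(buffer, detail::hooks.user);
}

}  // namespace helperlib

// Client side: an allocator that records what the library asks for.
struct Stats {
    std::size_t bytes = 0;
    std::size_t frees = 0;
};

void* counting_allocate(std::size_t size, void* user) {
    static_cast<Stats*>(user)->bytes += size;
    return std::malloc(size);
}

void counting_deallocate(void* ptr, void* user) {
    ++static_cast<Stats*>(user)->frees;
    std::free(ptr);
}
```

A client would fill in an AllocHooks with counting_allocate, counting_deallocate and a pointer to its Stats, then pass it to helperlib::init before using the library.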

Unfortunately, that is at odds with your requirement to not modify the library - I am skeptical of the chances of satisfying that, particularly within constraints you have outlined (memory-thirsty, hosted on windows/linux, etc).

Peter answered Sep 17 '22 06:09