Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multithreaded Memory Allocators for C/C++

I currently have heavily multi-threaded server application, and I'm shopping around for a good multi-threaded memory allocator.

So far I'm torn between:

  • Sun's umem
  • Google's tcmalloc
  • Intel's threading building blocks allocator
  • Emery Berger's hoard

From what I've found hoard might be the fastest, but I hadn't heard of it before today, so I'm skeptical if its really as good as it seems. Anyone have personal experience trying out these allocators?

like image 307
Robert Gould Avatar asked Sep 29 '08 02:09

Robert Gould


2 Answers

I've used tcmalloc and read about Hoard. Both have similar implementations and both achieve roughly linear performance scaling with respect to the number of threads/CPUs (according to the graphs on their respective sites).

So: if performance is really that incredibly crucial, then do performance/load testing. Otherwise, just roll a dice and pick one of the listed (weighted by ease of use on your target platform).

And from trshiv's link, it looks like Hoard, tcmalloc, and ptmalloc are all roughly comparable for speed. Overall, tt looks like ptmalloc is optimized for taking as little room as possible, Hoard is optimized for a trade-off of speed + memory usage, and tcmalloc is optimized for pure speed.

like image 180
hazzen Avatar answered Sep 19 '22 22:09

hazzen


The only way to really tell which memory allocator is right for your application is to try a few out. All of the allocators mentioned were written by smart folks and will beat the others on one particular microbenchmark or another. If all your application does all day long is malloc one 8 byte chunk in thread A and free it in thread B, and doesn't need to handle anything else at all, you could probably write a memory allocator that beats the pants off any of those listed so far. It just won't be very useful for much else. :)

I have some experience using Hoard where I work (enough so that one of the more obscure bugs addressed in the recent 3.8 release was found as a result of that experience). It's a very good allocator - but how good, for you, depends on your workload. And you do have to pay for Hoard (though it's not too expensive) in order to use it in a commercial project without GPL'ing your code.

A very slightly adapted ptmalloc2 has been the allocator behind glibc's malloc for quite a while now, and so it's incredibly widely used and tested. If stability is important above all things, it might be a good choice, but you didn't mention it in your list, so I'll assume it's out. For certain workloads, it's terrible - but the same is true of any general purpose malloc.

If you're willing to pay for it (and the price is reasonable, in my experience), SmartHeap SMP is also a good choice. Most of the other allocators mentioned are designed as drop-in malloc/free new/delete replacements that can be LD_PRELOAD'd. SmartHeap can be used that way as well, but it also includes an entire allocation-related API that lets you fine-tune your allocators to your heart's content. In tests that we've done (again, very specific to a particular application), SmartHeap was about the same as Hoard for performance when acting as a drop-in malloc replacement; the real difference between the two is the degree of customization. You can get better performance the less general-purpose you need your allocator to be.

And depending on your use case, a general-purpose multithreaded allocator might not be what you want to use at all; if you're constantly malloc & free'ing objects that are all the same size, you might want to just write a simple slab allocator. Slab allocation is used in several places in the Linux kernel that fit that description. (I would give you a couple more useful links, but I'm a "new user" and Stack Overflow has decided that new users are not allowed to be too helpful all in one answer. Google can help out well enough, though.)

like image 30
strangelydim Avatar answered Sep 22 '22 22:09

strangelydim