
What is the "correct" way to reconcile malloc and new in a mixed C/C++ program?

I have a mixed C/C++ program. It contains a flex/bison parser which targets C, while the remainder is C++.

Being C, the generated parser and scanner manage their memory with malloc, realloc and free. They are good enough to expose hooks allowing me to submit my own implementations of these functions. As you might expect, the rest of the (C++) program "wants" to use new, delete, etc.

Doing a little research seems to show that the relevant standards do not guarantee that such mixing should work. In particular, the C "heap" is not necessarily the C++ "free store". It seems the two schemes can trample each other.

On top of this, someday (soon) this program will probably want to integrate a customized heap implementation such as tcmalloc, used by both C and C++.

What is the "right" thing to do here?

Given the desire to integrate tcmalloc (which explains how to link with C programs) I'm tempted to find some cross-type, cross-thread, cross-everything overload/hook/whatever into C++ memory management. With that I could point all C++ allocation/release calls back to their C equivalents (which in turn land on tcmalloc.)

Does such a pan-galactic global C++ hook exist? Might it already be doing what I want, similar to how ios_base::sync_with_stdio secretly marries iostream and stdio by default?

I am not interested in talking about stdio vs. iostreams, nor about switching parser generators nor using the C++ flex/bison skeletons (they introduce independent headaches.)

EDIT: Please include the names of those sections of the C++ standard that support your answer.

phs asked Dec 02 '12 21:12



2 Answers

The standard does guarantee that mixing the two allocation variants will work. What it doesn't permit is things like calling free on memory that came from new, since they may use a totally different arena for the two types.

Provided you remember to call the correct deallocation function for a given block of memory, you will be fine. They won't trample each other if you follow the rules and, if you don't, then technically you're doing the trampling, not them :-)
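One way to keep the pairing rule hard to break on the C++ side is to attach the deallocation function to the pointer type. The following is a minimal sketch, not part of the original answer: `FreeDeleter`, `CString` and `duplicate_c_string` are hypothetical names, and `duplicate_c_string` merely stands in for any C routine that hands back a malloc'd buffer.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter that releases malloc-family memory with std::free, never delete.
struct FreeDeleter {
    void operator()(void* block) const noexcept { std::free(block); }
};

// Owning pointer for buffers that were allocated by C code.
using CString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical stand-in for a C routine that returns a malloc'd buffer
// which the caller is expected to free.
CString duplicate_c_string(const char* text) {
    const std::size_t length = std::strlen(text) + 1;
    char* copy = static_cast<char*>(std::malloc(length));
    if (copy != nullptr) {
        std::memcpy(copy, text, length);
    }
    return CString(copy);  // std::free runs when the CString goes out of scope
}
```

Objects created with new stay in ordinary `std::unique_ptr<T>` (which uses delete), so each block carries its matching release function with it.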


The controlling part of the C++11 standard is 20.6.13 C library which states, paraphrased:

  • The functions calloc, malloc, free and realloc are provided, based on the C standard.
  • The functions do not use ::operator new() or ::operator delete().
  • This allows the heritage C stuff to use a different memory arena than the normal C++ memory allocation.

That second bullet point is interesting in light of what you're eventually proposing, dropping in tcmalloc to replace the C heritage functions and have C++ use it as well.

There's a footnote in the standard which explains why they don't let malloc() call ::operator new():

The intent is to have operator new() implementable by calling std::malloc() or std::calloc().

In other words, they want to avoid a circular dependency.

However, while it allows operator new() to call malloc(), I'm not sure that the standard actually requires it. So, to be safe, you'd probably want to inject tcmalloc into both the C and C++ areas.

You've indicated you already know how to do that for C. For C++, it can be done by simply providing the entire set of global operator new()/delete() functions in your code, suitably written to call tcmalloc under the covers. The C++ standard states in 3.7.4 Dynamic storage duration:

The library provides default definitions for the global allocation and deallocation functions. Some global allocation and deallocation functions are replaceable.

A C++ program shall provide at most one definition of a replaceable allocation or deallocation function. Any such function definition replaces the default version provided in the library.

The following allocation and deallocation functions are implicitly declared in global scope in each translation unit of a program:

  • void* operator new(std::size_t);
  • void* operator new[](std::size_t);
  • void operator delete(void*);
  • void operator delete[](void*);
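As a rough sketch of what providing those four functions could look like, the block below routes them to std::malloc and std::free. Those two calls are placeholders: with tcmalloc you would substitute whatever entry points your build exposes. The counters `g_allocations` and `g_deallocations` are hypothetical and exist only so you can observe that the replacements are actually being used. For brevity this version throws std::bad_alloc directly on failure and does not consult the installed new_handler.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical counters, only for observing that the replacements are in use.
static std::size_t g_allocations = 0;
static std::size_t g_deallocations = 0;

void* operator new(std::size_t size) {
    if (size == 0) {
        size = 1;  // malloc(0) may return null, but operator new must not
    }
    void* block = std::malloc(size);  // placeholder for the custom heap's allocate call
    if (block == nullptr) {
        throw std::bad_alloc();  // simplified: the new_handler is not consulted
    }
    ++g_allocations;
    return block;
}

void* operator new[](std::size_t size) {
    return ::operator new(size);
}

void operator delete(void* block) noexcept {
    if (block != nullptr) {
        ++g_deallocations;
    }
    std::free(block);  // placeholder for the custom heap's release call
}

void operator delete[](void* block) noexcept {
    ::operator delete(block);
}
```

Defining these once, in a single translation unit of the program, is enough; no header or declaration is needed because the functions are implicitly declared everywhere.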
paxdiablo answered Oct 15 '22 12:10


Ok. Dug up an old working draft of the standard (2/28/2011 rev 3242.) It appears the relevant sections are 3.7.4 Dynamic storage duration and 18.6.1 Storage allocation and deallocation.

In short, it seems the pan-galactic hook I wanted is the pair of global new and delete operators themselves. If one respects some semantics (in 3.7.4.1 and 3.7.4.2: basically delegate to new_handler as needed) one is allowed to replace

void* operator new(std::size_t);
void* operator new[](std::size_t);
void operator delete(void*);
void operator delete[](void*);

to take over the default memory management of the entire C++ program. I still can't find the section that proves @paxdiablo right, but I'm willing to run with it for now.
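For the "delegate to new_handler as needed" part, my reading of the required behaviour is a retry loop: try to allocate, and on failure call the currently installed handler and try again, throwing std::bad_alloc only when no handler is installed. Here is a sketch under that assumption; `allocate_or_throw` is a hypothetical helper name, and std::malloc again stands in for whichever heap is really in use.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical helper: allocate with malloc, retrying through the
// installed new_handler until it succeeds or no handler remains.
void* allocate_or_throw(std::size_t size) {
    if (size == 0) {
        size = 1;  // a successful request must yield a non-null pointer
    }
    for (;;) {
        void* block = std::malloc(size);
        if (block != nullptr) {
            return block;
        }
        std::new_handler handler = std::get_new_handler();
        if (handler == nullptr) {
            throw std::bad_alloc();  // nobody left to ask for more memory
        }
        handler();  // may free memory, install another handler, or throw
    }
}
```

A replacement operator new(std::size_t) could then simply return the result of this helper, with operator delete(void*) calling the matching release function.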

phs answered Oct 15 '22 12:10