How do memory leak detectors actually work? What are the underlying concepts in general? Can take C++ as the language to explain this.

Here's a published technical paper on how our CheckPointer tool works. Fundamentally it tracks the lifetimes of all values (heap and stack), and their sizes according their types as defined by the language. This allows CheckPointer to find not only leaks, but out-of-array bound accesses, even for arrays in the stack, which valgrind won't do. In particular, it analyzes the source code to find all pointer uses. (This is quite the task just by itself). It keeps track of pointer meta data for each pointer, consisting of <ul> <li>A reference to the object meta data for the heap-allocated object or global or local variable orfunction pointed to by the pointer and</li> <li>The address range of the (sub)object of the object that the pointer may currently access. This may be smaller than the address range of the whole object; e.g. if you take the address of a struct member, the instrumented source code will only allow access to that member when using the resulting pointer.</li> </ul> It also tracks the kind and location of each object, i.e. whether it is a function, a global, thread-local or local variable, heap-allocated memory, or a string literal constant: <ul> <li>The address range of the object that may be safely accessed, and </li> <li>For each pointer stored in the heap-allocated object or variable, a reference to the pointer metadata for that pointer.</li> </ul> All this tracking is accomplished by transforming the original program source, into a program which does what the original program does, and interleaves various meta-data checking or updating routines. The resulting program is compiled and run. Where a meta-data check fails at runtime, a backtrace is provided with a report of the type of failure (invalid pointer, pointer outside valid bounds, ...)

Memory Leak Detectors Working Principle

2 Answers

There are a couple of different ways that leak detectors work. You can replace the implementation of malloc and free with ones that can track more information during allocation and are not concerned with performance. This is similar to how dmalloc works. In general, any address that is malloc'ed but not free'd is leaked.

In practice, you'll probably want more than just the single line number, but rather a stack trace for the lost allocations.

Another approach is how valgrind works which implements an entire virtual machine to keep track of addresses and memory references and associated bookkeeping. The valgrind approach is much more expensive, but also much more effective as it can also tell you about other types of memory errors like out of bounds reads or writes.

Valgrind essentially instruments the underlying instructions and can track when a given memory address has no more references. It can do this by tracking assignments of addresses, and so it can tell you not just that a piece of memory was lost, but exactly when it became lost.

C++ makes things a little harder for both types of leak detectors because it adds the new and delete operators. Technically new can be a completely different source of memory than malloc. However, in practice many real C++ implementations just use malloc to implement new or have an option to use malloc instead of the alternate approach.

Also higher level languages like C++ tend to have alternative higher level ways of allocating memory like std::vector or std::list. A basic leak detector would report the potentially many allocations made by the higher level modes separately. That's much less useful than saying the entire container was lost.

answered Oct 18 '22 18:10

b4hand

Here's a published technical paper on how our CheckPointer tool works.

Fundamentally it tracks the lifetimes of all values (heap and stack), and their sizes according their types as defined by the language. This allows CheckPointer to find not only leaks, but out-of-array bound accesses, even for arrays in the stack, which valgrind won't do.

In particular, it analyzes the source code to find all pointer uses. (This is quite the task just by itself).

It keeps track of pointer meta data for each pointer, consisting of

A reference to the object meta data for the heap-allocated object or global or local variable orfunction pointed to by the pointer and
The address range of the (sub)object of the object that the pointer may currently access. This may be smaller than the address range of the whole object; e.g. if you take the address of a struct member, the instrumented source code will only allow access to that member when using the resulting pointer.

It also tracks the kind and location of each object, i.e. whether it is a function, a global, thread-local or local variable, heap-allocated memory, or a string literal constant:

The address range of the object that may be safely accessed, and
For each pointer stored in the heap-allocated object or variable, a reference to the pointer metadata for that pointer.

All this tracking is accomplished by transforming the original program source, into a program which does what the original program does, and interleaves various meta-data checking or updating routines. The resulting program is compiled and run. Where a meta-data check fails at runtime, a backtrace is provided with a report of the type of failure (invalid pointer, pointer outside valid bounds, ...)

answered Oct 18 '22 18:10

Ira Baxter

Related questions
                            
                                Which Cross Platform Preprocessor Defines? (__WIN32__ or __WIN32 or WIN32 )?
                            
                                Windows C++ compiler with full C++11 support (should work with Qt) [closed]
                            
                                Can a destructor be recursive?
                            
                                Is `long` guaranteed to be at least 32 bits?
                            
                                C++: is string.empty() always equivalent to string == ""?
                            
                                What language is .NET Framework written in?
                            
                                C++ streams confusion: istreambuf_iterator vs istream_iterator?
                            
                                When to pass by reference and when to pass by pointer in C++?
                            
                                Are pure virtual methods allowed within a template class?
                            
                                Is the "lazy man's enable_if" legal C++?
                            
                                How do you build the x64 Boost libraries on Windows?
                            
                                Easy way find uninitialized member variables
                            
                                Getting bool from C to C++ and back
                            
                                Why not mark everything inline?
                            
                                CMake: How to pass preprocessor macros
                            
                                Why is inherited member not allowed?
                            
                                Why is std::vector so much more popular than std::deque? [duplicate]
                            
                                For { A=a; B=b; }, will "A=a" be strictly executed before "B=b"?
                            
                                String literal matches bool overload instead of std::string
                            
                                Cast Eigen::MatrixXd to Eigen::MatrixXf

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Memory Leak Detectors Working Principle

Tags:

c++

c

memory-leaks

amit1990

People also ask

2 Answers

b4hand

Ira Baxter

Recent Activity

Donate For Us