
When would the python tracemalloc module allocations statistics not match what's shown in ps or pmap?

I'm trying to track down a memory leak, so I've done

import tracemalloc
tracemalloc.start()

<function call>

# copy pasted this from documentation
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

print("[ Top 10 ]")
for stat in top_stats[:10]:
    print(stat)

This shows no major allocations: every allocation it reports is small, yet ps and pmap show 8+ GB allocated (I checked before and after running the command, and again after running garbage collection). Furthermore, tracemalloc.get_traced_memory() confirms that tracemalloc is not seeing many allocations. pympler does not see the allocations either.
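A minimal sketch of that comparison, printing tracemalloc's traced total next to the process's peak resident set size. It assumes Linux (where `ru_maxrss` is reported in kilobytes); `memory_report` is a hypothetical helper name and the list of `bytes` objects is only a stand-in for the real function call.

```python
import resource
import tracemalloc


def memory_report():
    """Return (traced_bytes, peak_rss_bytes) for the current process."""
    traced_now, _traced_peak = tracemalloc.get_traced_memory()
    # Assumption: Linux, where ru_maxrss is in kilobytes (macOS reports bytes).
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss * 1024
    return traced_now, peak_rss


tracemalloc.start()

# Stand-in workload: roughly 10 MB of small Python objects.
data = [bytes(1024) for _ in range(10_000)]

traced, peak_rss = memory_report()
print(f"traced by tracemalloc: {traced / 2**20:.1f} MiB")
print(f"peak RSS of process:   {peak_rss / 2**20:.1f} MiB")

tracemalloc.stop()
```

A large gap between the two numbers, beyond the interpreter's own baseline footprint, suggests memory that was obtained outside the allocators tracemalloc hooks into.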

Does anyone know when this could be the case? Some of the modules use Cython; could that cause issues for tracemalloc?

In pmap the allocation looks like:

0000000002183000 6492008 6491876 6491876 rw--- [ anon ]

RyanCheu asked May 03 '18 06:05




2 Answers

From the documentation on tracemalloc:

The tracemalloc module is a debug tool to trace memory blocks allocated by Python.

In other words, memory that is not allocated through the Python interpreter's own allocators is not seen by tracemalloc. This includes anything that does not go through the PyMem_* / PyObject_* allocator functions at the C-API level, such as standard libc malloc calls made by native code used via extensions, or extension code calling malloc directly.
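A small sketch of this effect, assuming a POSIX system where the C library can be loaded through ctypes. It allocates the same amount of memory twice, once as a Python `bytearray` and once with libc's `malloc`, and compares what tracemalloc reports for each. The 16 MiB size is arbitrary.

```python
import ctypes
import ctypes.util
import tracemalloc

SIZE = 16 * 1024 * 1024  # 16 MiB, arbitrary size for illustration

# Load the C library. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded into the process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

# Allocated through Python's allocator: tracemalloc sees it.
py_buf = bytearray(SIZE)
after_python, _ = tracemalloc.get_traced_memory()

# Allocated by libc directly: tracemalloc is not told about it.
native_ptr = libc.malloc(SIZE)
if not native_ptr:
    raise MemoryError("malloc failed")
ctypes.memset(native_ptr, 1, SIZE)  # touch the pages so they show up in RSS
after_native, _ = tracemalloc.get_traced_memory()

python_delta = after_python - baseline
native_delta = after_native - after_python
print(f"bytearray: traced memory grew by {python_delta} bytes")
print(f"malloc:    traced memory grew by {native_delta} bytes")

libc.free(native_ptr)
tracemalloc.stop()
```

The process holds both buffers, so ps and pmap would count both, but only the `bytearray` appears in tracemalloc's total.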

Whether that is the case here is impossible to tell for certain without code to reproduce. You can try running the native code part outside of python through, for example, valgrind, to detect memory leaks in the native code.

If there is Cython code calling malloc, it could be switched to PyMem_Malloc (with PyMem_Free in place of free) so that its allocations are traced.

danny answered Oct 12 '22 03:10


An addition to @danny's answer, because it is too long for a comment.

As explained in PEP-454, tracemalloc uses functionality introduced in PEP-445 to track memory allocations.

Normally, one would have to use PyMem_RawMalloc instead of malloc for tracemalloc to see a C extension's allocations. However, for quite some time it has also been possible to call PyTraceMalloc_Track and PyTraceMalloc_Untrack from pymem.h in addition to malloc and free, instead of replacing malloc with PyMem_RawMalloc.

This is, for example, what numpy does. To be able to wrap raw C pointers and take ownership of them, numpy allocates array data with malloc rather than with the Python allocator, which is optimized for small objects (not the most crucial scenario for numpy), as can be seen here:

/*NUMPY_API
 * Allocates memory for array data.
 */
NPY_NO_EXPORT void *
PyDataMem_NEW(size_t size)
{
    void *result;

    result = malloc(size);
    if (_PyDataMem_eventhook != NULL) {
        NPY_ALLOW_C_API_DEF
        NPY_ALLOW_C_API
        if (_PyDataMem_eventhook != NULL) {
            (*_PyDataMem_eventhook)(NULL, result, size,
                                    _PyDataMem_eventhook_user_data);
        }
        NPY_DISABLE_C_API
    }
    PyTraceMalloc_Track(NPY_TRACE_DOMAIN, (npy_uintp)result, size);
    return result;
}

So basically, it is the responsibility of the C extension to report its memory allocations to the tracemalloc module. Conversely, tracemalloc cannot really be trusted to register all memory allocations.
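The reporting step can be sketched from Python by calling the C API through `ctypes.pythonapi`. This is for demonstration only; a real extension would call `PyTraceMalloc_Track` from C, as numpy does above. The sketch assumes Python 3.7 or later (where these functions are, to my knowledge, public), a POSIX libc, and a platform where `uintptr_t` can be passed as a pointer-sized argument. The `DOMAIN` value is an arbitrary, hypothetical identifier.

```python
import ctypes
import ctypes.util
import tracemalloc

SIZE = 16 * 1024 * 1024  # 16 MiB, arbitrary
DOMAIN = 424242          # hypothetical domain id chosen for this sketch

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

# int PyTraceMalloc_Track(unsigned int domain, uintptr_t ptr, size_t size)
track = ctypes.pythonapi.PyTraceMalloc_Track
track.restype = ctypes.c_int
track.argtypes = [ctypes.c_uint, ctypes.c_void_p, ctypes.c_size_t]

# int PyTraceMalloc_Untrack(unsigned int domain, uintptr_t ptr)
untrack = ctypes.pythonapi.PyTraceMalloc_Untrack
untrack.restype = ctypes.c_int
untrack.argtypes = [ctypes.c_uint, ctypes.c_void_p]

tracemalloc.start()

ptr = libc.malloc(SIZE)
if not ptr:
    raise MemoryError("malloc failed")

# Report the native block to tracemalloc; 0 means the trace was stored.
rc = track(DOMAIN, ptr, SIZE)

# Keep only the traces recorded under our domain.
snapshot = tracemalloc.take_snapshot().filter_traces(
    [tracemalloc.DomainFilter(True, DOMAIN)]
)
tracked_bytes = sum(stat.size for stat in snapshot.statistics("lineno"))
print(f"track() returned {rc}; tracemalloc now reports {tracked_bytes} bytes")

# Remove the trace before releasing the block.
untrack(DOMAIN, ptr)
libc.free(ptr)
tracemalloc.stop()
```

Without the `track` call the same `malloc` would leave the snapshot empty for this domain, which is the situation described in the question.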

ead answered Oct 12 '22 01:10