I'm trying to track down a memory leak, so I've done
import tracemalloc
tracemalloc.start()
<function call>
# copy pasted this from documentation
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print("[ Top 10 ]")
for stat in top_stats[:10]:
    print(stat)
This shows no major allocations; all memory allocations are pretty small, while I'm seeing 8+ GB of memory allocated in ps and pmap (checking before and after running the command, and after running garbage collection). Furthermore, tracemalloc.get_traced_memory confirms that tracemalloc is not seeing many allocations. pympler also does not see the allocations.
Does anyone know when this could be the case? Some modules are using cython, could this cause issues for tracemalloc?
In pmap the allocation looks like:
0000000002183000 6492008 6491876 6491876 rw--- [ anon ]
To trace most memory blocks allocated by Python, the module should be started as early as possible by setting the PYTHONTRACEMALLOC environment variable to 1, or by using the -X tracemalloc command-line option. The tracemalloc.start() function can be called at runtime to start tracing Python memory allocations.
The default frame count is 1. This value defines how deep a traceback Python captures for each traced allocation. It can be overridden by setting the PYTHONTRACEMALLOC environment variable to the desired number, or at runtime by passing the number to tracemalloc.start(), for example tracemalloc.start(10) to store ten frames.
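To illustrate the frame count, here is a minimal sketch that starts tracing with ten frames and groups the snapshot by full traceback instead of by line. The functions build_payload and load_data are hypothetical stand-ins for real application code, not something from the original post.

```python
import tracemalloc


def build_payload():
    # Hypothetical workload: roughly 1 MB allocated from a single line.
    return [bytes(1000) for _ in range(1000)]


def load_data():
    # Extra call level so the stored traceback has more than one frame.
    return build_payload()


tracemalloc.start(10)                     # keep up to 10 frames per allocation
limit = tracemalloc.get_traceback_limit()

data = load_data()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group by full traceback; the largest entry comes first.
top = snapshot.statistics("traceback")[0]
print(f"frame limit: {limit}, size: {top.size} bytes in {top.count} blocks")
for line in top.traceback.format():
    print(line)
```

With the default of one frame, only the allocating line would be reported; with more frames the callers that led to the allocation show up as well.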
You can use memory_profiler by putting the @profile decorator around any function or method and running python -m memory_profiler myscript.py. You'll see line-by-line memory usage once your script exits.
From the documentation on tracemalloc:
The tracemalloc module is a debug tool to trace memory blocks allocated by Python.
In other words, memory that is not allocated through the Python interpreter's own allocators is not seen by tracemalloc. This includes anything that does not go through PyMem_Malloc (or the related PyMem_* and PyObject_* allocator functions) at the C-API level, such as standard libc malloc calls made by native libraries used via extensions, or extension code calling malloc directly.
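A small sketch of this effect, assuming a POSIX system where ctypes.CDLL(None) exposes the C library's malloc and free: the same number of bytes is allocated once through Python and once through libc, and tracemalloc only notices the first. The helper traced_growth is a name made up for this illustration.

```python
import ctypes
import tracemalloc

SIZE = 50 * 1024 * 1024  # 50 MiB


def traced_growth(allocate):
    """Return (bytes tracemalloc saw while `allocate` ran, the allocated object)."""
    before, _ = tracemalloc.get_traced_memory()
    keep = allocate()  # keep the result alive while we measure
    after, _ = tracemalloc.get_traced_memory()
    return after - before, keep


# Assumption: POSIX only. CDLL(None) gives access to symbols already loaded
# into the process, which includes libc's malloc/free.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

tracemalloc.start()
python_seen, buf = traced_growth(lambda: bytearray(SIZE))     # Python allocator
native_seen, ptr = traced_growth(lambda: libc.malloc(SIZE))   # plain libc malloc
libc.free(ptr)
tracemalloc.stop()

print(f"traced for bytearray:   {python_seen} bytes")
print(f"traced for libc malloc: {native_seen} bytes")
```

The second figure only reflects the few small Python objects created around the call, not the 50 MiB block itself, which is the same pattern as a large anonymous mapping in pmap with nothing matching in the tracemalloc snapshot.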
Whether that is the case here is impossible to tell for certain without code to reproduce. You can try running the native code part outside of python through, for example, valgrind, to detect memory leaks in the native code.
If there is Cython code calling malloc directly, it could be switched to PyMem_Malloc to have it traced.
An addition to @danny's answer, because it is too long for a comment.
As explained in PEP 454, tracemalloc uses the functionality introduced in PEP 445 to track memory allocations.
Normally, one would have to use PyMem_RawMalloc instead of malloc in order to be able to use tracemalloc for a C extension. However, for quite some time now it has also been possible to use PyTraceMalloc_Track and PyTraceMalloc_Untrack from pymem.h in addition to malloc (instead of replacing it with PyMem_RawMalloc).
This is, for example, what numpy does. To be able to wrap raw C pointers and take over their ownership, numpy uses malloc rather than the Python allocator, which is optimized for small objects (not the most crucial scenario for numpy), as can be seen here:
/*NUMPY_API
 * Allocates memory for array data.
 */
NPY_NO_EXPORT void *
PyDataMem_NEW(size_t size)
{
    void *result;

    result = malloc(size);
    if (_PyDataMem_eventhook != NULL) {
        NPY_ALLOW_C_API_DEF
        NPY_ALLOW_C_API
        if (_PyDataMem_eventhook != NULL) {
            (*_PyDataMem_eventhook)(NULL, result, size,
                                    _PyDataMem_eventhook_user_data);
        }
        NPY_DISABLE_C_API
    }
    PyTraceMalloc_Track(NPY_TRACE_DOMAIN, (npy_uintp)result, size);
    return result;
}
So basically, it is the responsibility of the C extension to report its memory allocations to the tracemalloc module. Conversely, tracemalloc cannot really be trusted to register all memory allocations.
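One rough way to check whether a process holds memory that tracemalloc never saw is to compare the traced peak with the peak resident set size reported by the operating system. The sketch below assumes a Unix system (the resource module is not available on Windows) and that ru_maxrss is in kilobytes on Linux and in bytes on macOS; peak_rss_bytes is a helper name invented for this example.

```python
import resource
import sys
import tracemalloc


def peak_rss_bytes():
    """Peak resident set size of this process, in bytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Assumption: macOS reports bytes, Linux reports kilobytes.
    return peak if sys.platform == "darwin" else peak * 1024


tracemalloc.start()
workload = [bytearray(1024) for _ in range(10_000)]  # stand-in for real work
_, traced_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

untraced_estimate = peak_rss_bytes() - traced_peak
print(f"peak RSS:       {peak_rss_bytes()} bytes")
print(f"traced peak:    {traced_peak} bytes")
print(f"not accounted:  {untraced_estimate} bytes")
```

The difference always includes the interpreter itself and loaded shared libraries, so it is only a hint. A gap of several gigabytes, as in the question, points towards allocations made by native code that does not report to tracemalloc.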