I sometimes work with very large data sets in IPython Notebooks. Sometimes a single pandas DataFrame will take up 1+GB of memory, so I can't afford to keep many copies around.
What I've found is that if I try to perform an operation on such a DataFrame and an error is raised, I don't get the memory back: some intermediate variable is still being tracked somewhere. The problem is, I don't know where, so I can't free it up!
For example, the image below shows the memory consumption after repeated attempts to execute the cell (each step in the graph corresponds to one attempt). Each attempt consumes a new block of memory that is never released.
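The only mechanism I'm aware of that could pin intermediates like this is the interpreter's record of the last exception: the saved traceback keeps every frame it passed through alive, including frames deep inside pandas that hold large intermediate arrays. A minimal sketch of clearing that record (standard library only):

```python
import gc
import sys

# IPython stores the last exception so that %tb and %debug work.  The
# saved traceback keeps every frame it passed through reachable, along
# with each frame's locals -- potentially large intermediate arrays.
sys.last_type = sys.last_value = sys.last_traceback = None
gc.collect()  # collect whatever the traceback was keeping alive
```

Since each new attempt leaks a fresh block, though, the last exception alone can't account for all of it.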
Does anyone know where this memory is going and how to free it up? Alternatively, if this is a bug (i.e. a memory leak or similar), how would I demonstrate that? I didn't want to report it as a bug if it's actually a side effect of code performing as designed (e.g. IPython is caching things and I'm just abusing the caching system).
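To be concrete about the caching I mean: IPython keeps every displayed result in its output cache (Out, plus the _, __, and ___ shortcuts), and cached objects can't be garbage collected. Clearing it looks like this (df is just an illustrative name):

```python
# Any value a cell displays is also retained in the output cache:
#   In [3]: df.head()   ->   reachable afterwards as Out[3] and _
%xdel df       # delete df and scrub references to it from IPython's caches
%reset -f out  # or drop the entire output cache at once

import gc
gc.collect()   # reclaim whatever the cache was keeping alive
```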
Thank you!
Per discussion on GitHub concerning issue 642, there is a known memory leak in jsonschema 2.4. After updating to jsonschema 2.5.1, I have not had this problem again.
So, if you're running an older IPython/Jupyter stack and seeing this issue, you will need to upgrade at least jsonschema.
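You can check which version you have, and upgrade, from the notebook itself (the ! shell escape is standard IPython; jsonschema 2.x exposes __version__):

```python
import jsonschema
print(jsonschema.__version__)  # 2.4 is the release with the known leak

# Upgrade, then restart the kernel so the new version is imported:
!pip install --upgrade "jsonschema>=2.5.1"
```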