Numpy and matplotlib garbage collection

I have a Python script that runs many simulations for different parameters (Q, K), plots the results, and stores them to disk.

Each parameter set (Q, K) produces a 3D volumetric grid of 200x200x80 data points, which requires ~100 MB of memory. Part of this volumetric grid is then plotted layer by layer, producing ~60 images.

The problem is that Python apparently does not release memory during this process. I'm not sure where the memory leak is, or what rules govern which objects Python deallocates. I'm also not sure whether the memory is lost in numpy arrays or in matplotlib figure objects.

  1. Is there a simple way to see which Python objects persist in memory and which were automatically deallocated?
  2. Is there a way to force Python to deallocate all arrays and figure objects that were created in a particular loop iteration or function call?

The relevant part of the code is here (however, it will not run; the bulk of the simulation code, including the ctypes C++/Python interface, is omitted because it is too complicated):

import os # needed for os.makedirs below
import numpy as np
import matplotlib.pyplot as plt
import ProbeParticle as PP # this is my C++/Python simulation library, take it as a black box

def relaxedScan3D( xTips, yTips, zTips ):
    ntips = len(zTips); 
    print " zTips : ",zTips
    rTips = np.zeros((ntips,3)) # is this array deallocated when exiting the function?
    rs    = np.zeros((ntips,3)) # and this?
    fs    = np.zeros((ntips,3)) # and this?
    rTips[:,0] = 1.0
    rTips[:,1] = 1.0
    rTips[:,2] = zTips 
    fzs    = np.zeros(( len(zTips), len(yTips ), len(xTips ) )); # and this?
    for ix,x in enumerate( xTips  ):
        print "relax ix:", ix
        rTips[:,0] = x
        for iy,y in enumerate( yTips  ):
            rTips[:,1] = y
            itrav = PP.relaxTipStroke( rTips, rs, fs ) / float( len(zTips) )
            fzs[:,iy,ix] = fs[:,2].copy()
    return fzs


def plotImages( prefix, F, slices ):
    for ii,i in enumerate(slices):
        print " plotting ", i
        plt.figure( figsize=( 10,10 ) ) # Is this figure deallocated when exiting the function ?
        plt.imshow( F[i], origin='image', interpolation=PP.params['imageInterpolation'], cmap=PP.params['colorscale'], extent=extent )
        z = zTips[i] - PP.params['moleculeShift' ][2]
        plt.colorbar();
        plt.xlabel(r' Tip_x $\AA$')
        plt.ylabel(r' Tip_y $\AA$')
        plt.title( r"Tip_z = %2.2f $\AA$" %z  )
        plt.savefig( prefix+'_%3.3i.png' %i, bbox_inches='tight' )

Ks = [ 0.125, 0.25, 0.5, 1.0 ]
Qs = [ -0.4, -0.3, -0.2, -0.1, 0.0, +0.1, +0.2, +0.3, +0.4 ]

for iq,Q in enumerate( Qs ):
    FF = FFLJ + FFel * Q
    PP.setFF_Pointer( FF )
    for ik,K in enumerate( Ks ):
        dirname = "Q%1.2fK%1.2f" %(Q,K)
        os.makedirs( dirname )
        PP.setTip( kSpring = np.array((K,K,0.0))/-PP.eVA_Nm )
        fzs = relaxedScan3D( xTips, yTips, zTips ) # is memory of "fzs" recycled or does it consume more memory each cycle of the loop ?
        PP.saveXSF( dirname+'/OutFz.xsf', headScan, lvecScan, fzs )
        dfs = PP.Fz2df( fzs, dz = dz, k0 = PP.params['kCantilever'], f0=PP.params['f0Cantilever'], n=int(PP.params['Amplitude']/dz) ) # is memory of "dfs" recycled?
        plotImages( dirname+"/df", dfs, slices = range( 0, len(dfs) ) )
Prokop Hapala asked Sep 29 '15 13:09




1 Answer

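pyplot keeps every figure created with plt.figure() alive until it is explicitly closed, so each pass through your plotting loop leaves another figure in memory. Try to reuse your figure: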

plt.figure(0, figsize=(10, 10))
plt.clf() #clears figure

or close your figure after saving:

...
plt.savefig(...)
plt.close()
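
To make the difference visible, here is a minimal sketch that counts the figures pyplot still holds after each variant of the plotting loop. It is not the asker's code: the random `data` array, the small figure size, the in-memory `io.BytesIO` target instead of a PNG file, and the `Agg` backend are all assumptions chosen so the example runs on its own.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def plot_leaky(data):
    """New figure per layer, never closed: pyplot keeps all of them."""
    for layer in data:
        plt.figure(figsize=(2, 2))
        plt.imshow(layer)
        plt.savefig(io.BytesIO(), format="png")  # stand-in for a file on disk
    return len(plt.get_fignums())


def plot_closing(data):
    """New figure per layer, closed right after saving."""
    for layer in data:
        fig = plt.figure(figsize=(2, 2))
        plt.imshow(layer)
        plt.savefig(io.BytesIO(), format="png")
        plt.close(fig)
    return len(plt.get_fignums())


def plot_reusing(data):
    """One figure (number 0), cleared and redrawn for every layer."""
    for layer in data:
        plt.figure(0, figsize=(2, 2))
        plt.clf()
        plt.imshow(layer)
        plt.savefig(io.BytesIO(), format="png")
    still_open = len(plt.get_fignums())
    plt.close(0)  # release the single reused figure when done
    return still_open


data = np.random.rand(5, 8, 8)  # dummy stand-in for a few layers of the grid

plt.close("all")
leaked = plot_leaky(data)
plt.close("all")  # clean up what the leaky loop left behind
closed = plot_closing(data)
reused = plot_reusing(data)
print(leaked, closed, reused)  # prints "5 0 1"
```

Calling `plt.get_fignums()` inside your own loop is a quick way to check whether figures are piling up; `plt.close("all")` at the end of each (Q, K) iteration is a blunt fallback if it is unclear where they are created.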
tillsten answered Sep 23 '22 00:09
