I'm writing a very simple script that reads a fairly large file (3M lines, 1.1 GB) containing literal string representations of polynomials. I then use SymPy for some symbolic computation and write the results to 16 separate files.
My script's memory usage keeps growing as it runs (> 20 GB), and I can't understand why. Do you see any way to improve the memory usage of this script?
from sympy import sympify
from sympy.abc import x, y

# Read the whole input file into memory at once
fin = open("out_poly", "r")
A = fin.readlines()
fin.close()

deg = 4
# One output file per coefficient of x**i * y**k
fou = [open("coeff_x" + str(i) + "y" + str(k), "w")
       for i in range(deg + 1) for k in range(deg + 1 - i)]

for line in A:
  # Convert the input syntax to SymPy's: ^ -> **, x0 -> x, x1 -> y
  expr = line.replace("^", "**").replace("x0", "x").replace("x1", "y")
  exprsy = sympify(expr)
  cpt = 0
  for i in range(deg + 1):
    for k in range(deg + 1 - i):
      fou[cpt].write(str(exprsy.coeff(x, i).coeff(y, k)) + "\n")
      cpt = cpt + 1
for files in fou:
  files.close()
Found it! The culprit was... SymPy!
SymPy caches expressions, and the cache fills up memory. The problem can be solved by setting the environment variable SYMPY_USE_CACHE=no, but that can seriously hurt SymPy's performance. A better alternative is to import the cache-clearing helper:
from sympy.core.cache import clear_cache
and clear the cache in your code at adequate intervals:
clear_cache()
With that call at each iteration of my loop, memory usage is stable and constant at only 26 MB.
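Applied to the script above, a minimal sketch of the fix might look like the following. It assumes the same input syntax as the question (^, x0, x1); the helper name split_coefficients is made up for illustration. As an aside, iterating the open file object streams lines one at a time, avoiding the extra ~1.1 GB that readlines() keeps resident.

```python
# Sketch: clear SymPy's cache once per line so memory stays flat.
from sympy import sympify
from sympy.abc import x, y
from sympy.core.cache import clear_cache

deg = 4

def split_coefficients(line):
    """Return the coefficients of x**i * y**k for one polynomial line,
    in the same order the question writes them to the 'fou' files."""
    expr = sympify(line.replace("^", "**").replace("x0", "x").replace("x1", "y"))
    coeffs = [expr.coeff(x, i).coeff(y, k)
              for i in range(deg + 1) for k in range(deg + 1 - i)]
    clear_cache()  # drop SymPy's internal cache before the next line
    return coeffs

# Streaming the file instead of readlines() keeps only one line in memory:
# with open("out_poly") as fin:
#     for line in fin:
#         for f, c in zip(fou, split_coefficients(line)):
#             f.write(str(c) + "\n")
```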
Link about the issue: http://code.google.com/p/sympy/issues/detail?id=3222
Link about the SymPy cache: https://github.com/sympy/sympy/wiki/faq
Thanks all for your help.