Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reduce memory footprint of python program

I'm developing a data analysis worker in python using numpy and pandas. I will deploy lots of these workers so I want to keep it lightweight.

I tried checking with this code:

import logging
import resource
logging.basicConfig(level=logging.DEBUG)

def printmemory(msg):
    currentmemory = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    logging.debug(msg+': total memory:%r Mb' % (int(currentmemory)/1000000.))

printmemory('begin')

#from numpy import array, nan, mean, std, sqrt, square
import numpy  as np
printmemory('numpy')

import pandas  as pd
printmemory('numpy')

and I found out that simply loading them to memory will make my worker pretty heavy. Is there a way to reduce the memory footprint of numpy and pandas?

Otherwise, any suggestion on a better solution?

like image 478
Fra Avatar asked Jan 22 '14 03:01

Fra


People also ask

Why is my Python script using so much memory?

Python does not free memory back to the system immediately after it destroys some object instance. It has some object pools, called arenas, and it takes a while until those are released. In some cases, you may be suffering from memory fragmentation which also causes process' memory usage to grow.

How do I reduce memory usage in programming?

Use #pragma pack(1) to byte align your structures. Use unions where a structure can contain different types of data. Use bit fields rather than ints to store flags and small integers. Avoid using fixed length character arrays to store strings, implement a string pool and use pointers.

Can you control memory with Python?

The Python memory manager is involved only in the allocation of the bytes object returned as a result. In most situations, however, it is recommended to allocate memory from the Python heap specifically because the latter is under control of the Python memory manager.


1 Answers

Sorry to tell you, but there is no way to load into memory only a part of a python module. You could use multi-threading if that applies to your case - threads can share the same module memory.

like image 66
eran Avatar answered Oct 22 '22 05:10

eran