64-bit Python fills up memory until the computer freezes, with no MemoryError

I used to run 32-bit Python on a 32-bit OS, and whenever I accidentally appended values to a list in an infinite loop or tried to load too big a file, Python would simply stop with an out-of-memory error. However, I now use 64-bit Python on a 64-bit OS, and instead of raising an exception, Python uses up every last bit of memory and freezes my computer, so I am forced to restart it.

I looked around Stack Overflow, and there doesn't seem to be a good way to control or limit memory usage. For example, this solution: How to set memory limit for thread or process in python? limits the resources Python can use, but it would be impractical to paste it into every piece of code I want to write.
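For reference, the approach in that linked question comes down to the Unix-only `resource` module. A minimal sketch (the 2 GB cap is an arbitrary example value, and this does not work on Windows):

```python
import resource

def limit_memory(max_bytes):
    """Cap this process's address space (Unix only); allocations
    beyond the cap raise MemoryError instead of exhausting the machine."""
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))

limit_memory(2 * 1024**3)  # 2 GB cap; arbitrary example value
try:
    data = bytearray(4 * 1024**3)  # deliberately over the cap
except MemoryError:
    print("allocation refused instead of freezing the machine")
```

With the cap in place, runaway allocations fail fast with a catchable MemoryError, which restores the 32-bit-style behavior the question asks about.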

How can I prevent this from happening?

asked Dec 15 '13 by shimao
1 Answer

I don't know whether this will be the solution for anyone but me, as my case was very specific, but I thought I'd post my procedure here in case someone can use it.

I was working with a very large dataset containing millions of rows. Once I queried this data from a PostgreSQL database, I used up a lot of my available memory (63.9 GB in total, on a 64-bit Windows 10 PC running 64-bit Python 3.x), and each query consumed around 28-40 GB of memory, since the rows had to be kept in memory while Python did calculations on them. I used the psycopg2 module to connect to PostgreSQL.

My initial procedure was to perform the calculations and append each result to a list, which my methods would return. I quite quickly ended up with too much stored in memory, and my PC started freaking out (it froze, logged me out of Windows, the display driver stopped responding, etc.).

Therefore I changed my approach to use Python generators. And since I wanted to store the calculated data back in my database anyway, I wrote each row to the database as soon as I was done processing it:

def fetch_rows(cursor, arraysize=1000):
    """Yield rows one at a time, fetching them from the cursor in batches."""
    while True:
        results = cursor.fetchmany(arraysize)
        if not results:
            break
        for result in results:
            yield result

And with this approach I would do calculations on my yielded result by using my generator:

def main():
    connection_string = "...."
    connection = psycopg2.connect(connection_string)
    cursor = connection.cursor()

    # Using generator
    for row in fetch_rows(cursor):
        # placeholder functions
        result = do_calculations(row) 
        write_to_db(result)
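A quick way to see the batching behavior without a database is a stub cursor; `FakeCursor` below is purely illustrative and only implements the `fetchmany()` method that the generator relies on:

```python
class FakeCursor:
    """Stand-in for a psycopg2 cursor, for illustration only."""
    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def fetchmany(self, size):
        batch = self._rows[self._pos:self._pos + size]
        self._pos += len(batch)
        return batch

def fetch_rows(cursor, arraysize=1000):
    # Same generator as above, repeated so this snippet runs on its own.
    while True:
        results = cursor.fetchmany(arraysize)
        if not results:
            break
        for result in results:
            yield result

rows = fetch_rows(FakeCursor(range(2500)), arraysize=1000)
print(next(rows))  # prints 0; only the first 1000-row batch was fetched
```

Because the generator is lazy, at most one batch of `arraysize` rows sits in Python's memory at a time, rather than the whole result list.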

This procedure does, however, still require enough physical RAM to hold each query's full result set, since a default (client-side) psycopg2 cursor loads the entire result before fetchmany slices it; a named (server-side) cursor would stream rows from the server instead.

I hope this helps anyone out there with the same problems.

answered Oct 10 '22 by Zeliax