I used to run 32 bit python on a 32-bit OS and whenever i accidentally appended values to an array in an infinite list or tried to load too big of a file, python would just stop with an out of memory error. However, i now use 64-bit python on a 64-bit OS, and instead of giving an exception, python uses up every last bit of memory and causes my computer to freeze up so i am forced to restart it.
I looked around stack overflow and it doesn't seem as if there is a good way to control memory usage or limit memory usage. For example, this solution: How to set memory limit for thread or process in python? limits the resources python can use, but it would be impractical to paste into every piece of code i want to write.
How can i prevent this from happening?
I don't know if this will be the solution for anyone else but me, as my case was very specific, but I thought I'd post it here in case someone could use my procedure.
I was having a VERY huge dataset with millions of rows of data. Once I queried this data through a postgreSQL database I used up a lot of my available memory (63,9 GB available in total on a Windows 10 64 bit PC using Python 3.x 64 bit) and for each of my queries I used around 28-40 GB of memory as the rows of data was to be kept in memory while Python did calculations on the data. I used the psycopg2 module to connect to my postgreSQL.
My initial procedure was to perform calculations and then append the result to a list which I would return in my methods. I quite quickly ended up having too much stored in memory and my PC started freaking out (froze up, logged me out of Windows, display driver stopped responding and etc).
Therefore I changed my approach using Python Generators. And as I would want to store the data I did calculations on back in my database, I would write each row, as I was done performing calculations on it, to my database.
def fetch_rows(cursor, arraysize=1000):
while True:
results = cursor.fetchmany(arraysize)
if not results:
break
for result in results:
yield result
And with this approach I would do calculations on my yielded result by using my generator:
def main():
connection_string = "...."
connection = psycopg2.connect(connection_string)
cursor = connection.cursor()
# Using generator
for row in fecth_rows(cursor):
# placeholder functions
result = do_calculations(row)
write_to_db(result)
This procedure does however indeed require that you have enough physical RAM to store the data in memory.
I hope this helps whomever is out there with same problems.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With