How to delete multiple pandas (python) dataframes from memory to save RAM?

I have a lot of DataFrames created as part of preprocessing. Since I have only 6 GB of RAM, I want to delete all the unnecessary DataFrames from memory to avoid running out of memory when running GridSearchCV in scikit-learn.

1) Is there a function that lists only the DataFrames currently loaded in memory?

I tried dir(), but it returns a lot of objects other than DataFrames.

2) I created a list of DataFrames to delete:

del_df = [Gender_dummies, capsule_trans, col, concat_df_list, coup_CAPSULE_dummies]

and ran:

for i in del_df:
    del i

But this does not delete the DataFrames. Deleting them individually, as below, does remove them from memory:

del Gender_dummies
del col
GeorgeOfTheRF asked Aug 27 '15 11:08




1 Answer

The del statement does not delete an instance; it merely deletes a name.

When you do del i, you are deleting just the name i. The instance is still bound to other names (its original variable and the list del_df), so it won't be garbage-collected.

If you want to release memory, your DataFrames have to be garbage-collected, i.e. you must delete all references to them.
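The difference between deleting a name and deleting the last reference can be observed with a weak reference. This is only a minimal sketch: FakeFrame is a hypothetical stand-in for a large DataFrame (so the snippet runs without pandas), and demo and probe are illustrative names, not part of any library.

```python
import gc
import weakref


class FakeFrame:
    """Hypothetical stand-in for a large pandas DataFrame."""


def demo():
    a = FakeFrame()
    b = FakeFrame()
    probe = weakref.ref(a)  # lets us check whether the object is still alive
    frames = [a, b]

    # Mirrors the loop from the question: `del i` only unbinds the loop variable.
    for i in frames:
        del i
    alive_after_loop = probe() is not None

    # Remove every remaining reference: the names and the list holding them.
    del a, b, frames
    gc.collect()  # harmless here; only strictly needed for reference cycles
    alive_after_del = probe() is not None

    return alive_after_loop, alive_after_del


print(demo())
```

The object survives the loop because a and frames still refer to it, and it is freed only once both are gone.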

If you created your DataFrames dynamically in a list, then deleting that list will trigger garbage collection.

>>> lst = [pd.DataFrame(), pd.DataFrame(), pd.DataFrame()]
>>> del lst     # memory is released

If you also bound them to variables, you have to delete all of those names as well.

>>> a, b, c = pd.DataFrame(), pd.DataFrame(), pd.DataFrame()
>>> lst = [a, b, c]
>>> del a, b, c # dfs still in list
>>> del lst     # memory is released now
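For the first part of the question (listing only the DataFrames), one possible approach is to filter a namespace dictionary by type and then unbind the names you no longer need. The sketch below is an assumption-laden illustration, not a built-in pandas feature: list_instances and delete_names are hypothetical helpers, and FakeFrame again stands in for pd.DataFrame so the snippet runs without pandas.

```python
def list_instances(namespace, cls):
    """Return the sorted names in `namespace` bound to instances of `cls`."""
    return sorted(
        name for name, value in namespace.items() if isinstance(value, cls)
    )


def delete_names(namespace, names):
    """Unbind each name in `namespace`; names that are missing are ignored."""
    for name in names:
        namespace.pop(name, None)


class FakeFrame:
    """Hypothetical stand-in for pd.DataFrame."""


# A plain dict plays the role of globals() in this demonstration.
ns = {"Gender_dummies": FakeFrame(), "col": FakeFrame(), "n_rows": 10}

found = list_instances(ns, FakeFrame)
print(found)  # ['Gender_dummies', 'col']

delete_names(ns, found)
remaining = list_instances(ns, FakeFrame)
print(remaining)  # []
```

In a real session you would presumably call list_instances(globals(), pd.DataFrame) and pass the names you want to drop to delete_names(globals(), ...). Memory is released only if no other reference remains, for example in a list, or in the output history that IPython and Jupyter keep (Out, _).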
pacholik answered Sep 18 '22 08:09
