
Delete Pyspark Dataframe

Tags: python, pyspark

I am working with extremely large datasets, so I need to remove intermediate DataFrames as soon as I am done with them. How do I ensure that a DataFrame I no longer need is deleted from memory and disk?

Harsh Kumar asked Nov 16 '25 20:11


1 Answer

You should use spark.catalog.clearCache(), which removes all cached tables from the in-memory cache of the current session.

https://spark.apache.org/docs/2.1.0/api/java/org/apache/spark/sql/catalog/Catalog.html
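
As a rough sketch of how this might look in practice: Spark only keeps data in memory or on disk for DataFrames that were explicitly cached or persisted, so the usual pattern is to call unpersist() on each intermediate DataFrame you cached and then clearCache() for anything left over. The helper name release_dataframes, the demo function, and the local[1] master below are assumptions made for illustration; they are not part of PySpark or of the original answer.

```python
def release_dataframes(spark, *dataframes):
    """Unpersist the given DataFrames, then clear whatever else is cached.

    Hypothetical helper for illustration; not part of PySpark.
    """
    for df in dataframes:
        # Drops this DataFrame's cached blocks from memory and disk.
        # Harmless if the DataFrame was never cached.
        df.unpersist()
    # Removes every remaining cached table/DataFrame in this session.
    spark.catalog.clearCache()


def demo():
    # Imported here so the helper above can be read without PySpark installed.
    from pyspark.sql import SparkSession

    # Assumed local session, only for the sake of the example.
    spark = SparkSession.builder.master("local[1]").appName("cache-demo").getOrCreate()

    intermediate = spark.range(1000).cache()
    intermediate.count()  # an action is what actually materializes the cache
    result = intermediate.selectExpr("sum(id) AS total").collect()

    release_dataframes(spark, intermediate)
    del intermediate  # also drop the Python reference to the DataFrame

    spark.stop()
    return result
```

Calling demo() on a machine with PySpark installed runs the whole cycle once. Note that clearCache() affects everything cached in the session, so if other cached DataFrames are still needed, calling unpersist() on the specific ones is probably the safer choice.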

Konrad Kostrzewa answered Nov 18 '25 11:11


