The PySpark documentation for SQLContext says "As of Spark 2.0, this is replaced by SparkSession."
Given that, how can I remove all cached tables from the in-memory cache without using SQLContext?
For example, this is what I do now, where spark is a SparkSession and sc is a SparkContext:
from pyspark.sql import SQLContext
SQLContext(sc, spark).clearCache()
In PySpark, clearCache is available on SQLContext. The example below obtains an instance with SQLContext.getOrCreate from an existing SparkContext:
SQLContext.getOrCreate(sc).clearCache()
In Scala, though, there is a more direct way to do the same through SparkSession:
spark.sharedState.cacheManager.clearCache()
Another option goes through the catalog, as Clay mentioned (the same call also exists in PySpark as spark.catalog.clearCache()):
spark.catalog.clearCache()
And the last one, from Jacek Laskowski's GitBook:
spark.sql("CLEAR CACHE").collect
Reference: https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-caching-and-persistence.html
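Back on the PySpark side of the question, the catalog and SQL routes above can be sketched in Python without touching SQLContext. This is only a minimal sketch: the helper name clear_all_cached and its use_sql flag are made up for illustration, and spark is assumed to be a SparkSession from Spark 2.0 or later.

```python
def clear_all_cached(spark, use_sql=False):
    """Remove every cached table/DataFrame from Spark's in-memory cache.

    `spark` is assumed to be a pyspark.sql.SparkSession (Spark 2.0+).
    The function name and the `use_sql` flag are hypothetical helpers,
    not part of the PySpark API.
    """
    if use_sql:
        # SQL route: the same statement as the Scala example, issued from Python.
        spark.sql("CLEAR CACHE").collect()
    else:
        # Catalog route: SparkSession.catalog exposes clearCache() in PySpark.
        spark.catalog.clearCache()


# Hypothetical usage, assuming pyspark is installed:
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
# clear_all_cached(spark)                # catalog route
# clear_all_cached(spark, use_sql=True)  # SQL route
```

Either branch should leave nothing cached in the session; which one to prefer is mostly a matter of taste.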