I'm learning Spark and am confused about Spark's Catalog.
I found a catalog in SparkSession, which is an instance of CatalogImpl, as shown below:
/**
 * Interface through which the user may create, drop, alter or query underlying
 * databases, tables, functions etc.
 *
 * @since 2.0.0
 */
@transient lazy val catalog: Catalog = new CatalogImpl(self)
And I found that there is a catalog in SparkSession.sessionState, which is an instance of SessionCatalog.
What's the difference between them?
tl;dr None: both end up at the same underlying session catalog.
The following line in CatalogImpl is the missing piece in your understanding:
private def sessionCatalog: SessionCatalog = sparkSession.sessionState.catalog
In other words, SparkSession.catalog creates a CatalogImpl (the user-facing Catalog API) that delegates to sparkSession.sessionState.catalog (the SessionCatalog) under the covers.
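You can see this for yourself with a small experiment. This is only a sketch, and it assumes Spark 2.2 or later, where sessionState is publicly accessible (though marked unstable); in 2.0 and 2.1 it is private[sql]. The object name CatalogDemo and the view name demo_view are made up for illustration. It registers a temporary view and then looks it up through both catalogs:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object; the names here are illustrative, not from Spark.
object CatalogDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("catalog-demo")
      .getOrCreate()

    // Register a temporary view through the ordinary Dataset API.
    spark.range(3).createOrReplaceTempView("demo_view")

    // Public API: Catalog (implemented by CatalogImpl).
    // listTables() returns a Dataset[Table] for the current database,
    // including temporary views.
    val viaPublic: Set[String] =
      spark.catalog.listTables().collect().map(_.name).toSet

    // Internal, unstable API: SessionCatalog.
    // listTables(db) returns a Seq[TableIdentifier], also including
    // temporary views.
    val viaInternal: Set[String] =
      spark.sessionState.catalog.listTables("default").map(_.table).toSet

    // Both lookups should report the same view, because the public
    // catalog delegates to the session catalog.
    println(s"public catalog sees demo_view:  ${viaPublic.contains("demo_view")}")
    println(s"session catalog sees demo_view: ${viaInternal.contains("demo_view")}")

    spark.stop()
  }
}
```

In application code, prefer spark.catalog: it is the stable public interface, while sessionState.catalog is an internal API that may change between releases.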