 

Is sharing cached/persisted dataframes between Databricks notebooks possible?

I want to cache a table (dataframe) in one notebook and use it in another notebook. I am using the same Databricks cluster for both notebooks.

Please suggest if this is possible, and if yes, then how.

Shad Khan asked Mar 18 '26 04:03


2 Answers

You can share a dataframe between notebooks.

In the first notebook, register it as a global temp view:

df_shared.createOrReplaceGlobalTempView("df_shared")

In the second notebook, read it back from the global temp database:

global_temp_db = spark.conf.get("spark.sql.globalTempDatabase")
df_shared = spark.table(global_temp_db + ".df_shared")
Hubert Dudek answered Mar 19 '26 18:03


Yes, it is possible with the following setup.

You can register your dataframe as a temp view. The lifetime of a temp view created by createOrReplaceTempView() is tied to the Spark session in which the dataframe was created, so by default it is not visible from other notebooks.

Setting

spark.databricks.session.share to true

shares temporary views across all notebooks attached to the cluster. ref : link
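The flag this answer refers to is a cluster-level setting, so it has to go into the cluster's Spark config rather than be set at runtime in a notebook; a sketch (the UI menu path is an assumption and may vary by workspace version):

```
# Cluster Spark config (Compute > your cluster > Edit > Advanced Options):
spark.databricks.session.share true
```

With this enabled, a plain `df_shared.createOrReplaceTempView("df_shared")` in one notebook should become readable as `spark.table("df_shared")` in another notebook on the same cluster, without the `global_temp` prefix.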

Karthikeyan Rasipalay Durairaj answered Mar 19 '26 19:03


