
Pass variables from Scala to Python in Databricks

I'm using Databricks and trying to pass a dataframe from Scala to Python within the same Scala notebook. I passed a dataframe from Python to Scala using:

%python
python_df.registerTempTable("temp_table")


val scalaDF = table("temp_table")

How do I do that same thing in reverse? Thank you so much!!

Ashley O asked Aug 25 '17 15:08


2 Answers

The reverse is pretty much the same. In Scala:

scalaDF.registerTempTable("some_table")

In Python:

spark.table("some_table")

If you use a recent Spark version (2.0 or later), you should use createOrReplaceTempView in place of registerTempTable, which is deprecated.
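The same two method names exist on Python dataframes, so for the Python to Scala direction from the question the choice can be made at runtime. A minimal sketch, assuming a hypothetical helper name share_dataframe (it is not part of Spark) and relying only on the two methods mentioned above:

```python
def share_dataframe(df, view_name):
    """Register df under view_name so cells in other languages can read it.

    share_dataframe is a hypothetical helper, not a Spark API.
    """
    if hasattr(df, "createOrReplaceTempView"):
        # Spark 2.0+: the preferred method is available.
        df.createOrReplaceTempView(view_name)
    else:
        # Older Spark: fall back to registerTempTable.
        df.registerTempTable(view_name)
    return view_name
```

In a %python cell this would be called as share_dataframe(python_df, "temp_table"), after which the Scala side can read the table by name as before.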

Alper t. Turker answered Oct 11 '22 20:10


You can make use of the .createOrReplaceTempView() method or sql().

Here is an example that passes a dataframe from Scala to Python, then on to SQL, modifying it at each step, and finally back to Scala.

%scala 
var df = spark.range(0,10).selectExpr("*","'scala' language_origin")
df.createOrReplaceTempView("tableName")
display(df)

%python
df = sql("select * from tableName")
df2 = df.selectExpr("*","'python' language_added")
df2.createOrReplaceTempView("tableName")
display(df2)

%sql
create or replace temp view tableName as
select *, 'sql' language_added from tableName;
select * from tableName

%scala
df = sql("select * from tableName")
display(df)
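
Since every cell above depends on an earlier cell having registered the view, reading it before that cell has run fails with a fairly terse error. A minimal sketch of a more defensive read on the Python side, assuming a hypothetical helper name read_shared_view and the spark session that Databricks predefines in notebooks; it uses only spark.catalog.listTables() and spark.table():

```python
def read_shared_view(spark, view_name):
    """Return the dataframe behind a temp view, or raise a readable error.

    read_shared_view is a hypothetical helper, not a Spark API.
    """
    # Temp view names are case-insensitive by default, so compare lowercased.
    temp_views = {
        t.name.lower() for t in spark.catalog.listTables() if t.isTemporary
    }
    if view_name.lower() not in temp_views:
        raise ValueError(
            "Temp view %r not found; run the cell that registers it first. "
            "Available temp views: %s" % (view_name, sorted(temp_views))
        )
    return spark.table(view_name)
```

In a %python cell this would be used as df = read_shared_view(spark, "tableName").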

Daniel Mcauley answered Oct 11 '22 20:10