
How can I access python variable in Spark SQL?

I have a Python variable created under %python in my Jupyter notebook in Azure Databricks. How can I access the same variable to make comparisons under %sql? Below is an example:

%python

RunID_Goal = sqlContext.sql("""
    SELECT CONCAT(SUBSTRING(RunID,1,6), SUBSTRING(RunID,1,6), '01_') AS RunID_Goal
    FROM RunID_Pace
""").first()[0]

%sql
SELECT Type, KPIDate, Value
FROM table
WHERE RunID = RunID_Goal  -- the variable created under %python, which I want to compare against here

When I run this, it throws an error: Error in SQL statement: AnalysisException: cannot resolve 'RunID_Goal' given input columns:

I am new to Azure Databricks and Spark SQL, so any help would be appreciated.

Haseeb Ahmed Khan asked Oct 20 '25 04:10


2 Answers

One workaround is to use widgets to pass parameters between cells. For example, on the Python side it could look like this:

# generate test data
import pyspark.sql.functions as F
spark.range(100).withColumn("rnd", F.rand()).write.mode("append").saveAsTable("abc")

# set widgets
import random
vl = random.randint(0, 100)
dbutils.widgets.text("my_val", str(vl))

and then you can refer to the widget's value inside the SQL code:

%sql
select * from abc where id = getArgument('my_val')

This returns the rows of abc whose id equals the widget value.
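Applied to the variable from the question, the widget approach might look like the sketch below. It assumes a Databricks notebook, where dbutils is predefined; the helper name publish_widget and the widget name RunID_Goal are hypothetical, and dbutils is passed in explicitly only so the function can be exercised outside a notebook.

```python
def publish_widget(dbutils, name, value):
    """Create a text widget whose default value is `value`.

    Widget values are strings, so the value is converted with str().
    Returns the value that a %sql cell would currently see for this widget.
    """
    dbutils.widgets.text(name, str(value))
    return dbutils.widgets.get(name)


# Hypothetical usage in a %python cell:
#   publish_widget(dbutils, "RunID_Goal", RunID_Goal)
#
# and then in a %sql cell:
#   SELECT Type, KPIDate, Value
#   FROM table
#   WHERE RunID = getArgument('RunID_Goal')
```

If a widget with the same name already exists, it may keep its current value rather than pick up the new default, so it is worth checking the returned value.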

Another way is to pass the variable via the Spark configuration. You can set the variable's value like this (note that the variable name should have a prefix; in this case it's c.):

spark.conf.set("c.var", "some-value")

and then refer to the variable from SQL as ${var-name}:

%sql 
select * from table where column = '${c.var}'

One advantage of this approach is that you can also use the variable for table names and so on. The disadvantage is that you have to handle quoting of the value yourself, for example by wrapping string values in single quotes.
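For the variable in the question, the configuration approach could be wrapped roughly as follows. This is a minimal sketch: the helper name publish_sql_variable and the key c.run_id_goal are hypothetical, and spark is assumed to be the session object that Databricks notebooks predefine.

```python
def publish_sql_variable(spark, name, value, prefix="c"):
    """Store `value` in the Spark conf under '<prefix>.<name>'.

    Returns the placeholder text to use in a %sql cell,
    for example '${c.run_id_goal}'.
    """
    key = f"{prefix}.{name}"
    # Conf values are strings, so convert explicitly.
    spark.conf.set(key, str(value))
    return "${" + key + "}"


# Hypothetical usage in a %python cell:
#   publish_sql_variable(spark, "run_id_goal", RunID_Goal)
#
# and then in a %sql cell (quotes added by hand for a string value):
#   SELECT Type, KPIDate, Value
#   FROM table
#   WHERE RunID = '${c.run_id_goal}'
```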

Alex Ott answered Oct 23 '25 01:10


You cannot access this variable from SQL. The documentation explains why:

When you invoke a language magic command, the command is dispatched to the REPL in the execution context for the notebook. Variables defined in one language (and hence in the REPL for that language) are not available in the REPL of another language. REPLs can share state only through external resources such as files in DBFS or objects in object storage.
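Although the Python variable itself is not visible to SQL, data registered with the Spark session, such as a temporary view, is commonly used to hand a value from a %python cell to a %sql cell. The sketch below assumes both cells run against the same Spark session; the helper name publish_temp_view and the view name run_id_goal are hypothetical.

```python
def publish_temp_view(spark, view_name, column_name, value):
    """Register a one-row, one-column temporary view holding `value`."""
    df = spark.createDataFrame([(value,)], [column_name])
    df.createOrReplaceTempView(view_name)
    return view_name


# Hypothetical usage in a %python cell:
#   publish_temp_view(spark, "run_id_goal", "RunID_Goal", RunID_Goal)
#
# and then in a %sql cell, read the value with a scalar subquery:
#   SELECT Type, KPIDate, Value
#   FROM table
#   WHERE RunID = (SELECT RunID_Goal FROM run_id_goal)
```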

Steven answered Oct 23 '25 01:10


