Dynamically bind a variable/parameter in Spark SQL?

How do I bind a variable in Apache Spark SQL? For example:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}").collect().foreach(println)
asked Nov 05 '14 by user3769729

People also ask

How do I add a variable in spark SQL?

The short answer is no: Spark SQL does not currently support variables. SQL Server uses T-SQL, which extends the SQL standard with procedural programming, local variables, and other features. Spark SQL is pure SQL, only partially compatible with the SQL standard.

How do you bind variables in SQL query?

Bind variables are variables you create in SQL*Plus and then reference in PL/SQL. If you create a bind variable in SQL*Plus, you can use it as you would a declared variable in your PL/SQL subprogram and then access it from SQL*Plus.

Can we pass variable in SQL query?

The syntax for assigning a value to a SQL variable within a SELECT query is @var_name := value, where var_name is the variable name and value is the value being retrieved. The variable may then be used in subsequent queries wherever an expression is allowed, such as in a WHERE clause or in an INSERT statement.


2 Answers

Spark SQL (as of the 1.6 release) does not support bind variables.

P.S. What Ashrith is suggesting is not a bind variable: you're constructing a new string every time, so Spark parses the query and creates an execution plan on every call. The purpose of bind variables (in RDBMS systems, for example) is to cut the cost of creating the execution plan (which can be expensive when there are many joins, etc.). Spark would need a special API to "parse" a query once and then "bind" variables to it; Spark does not have this functionality (as of today, the Spark 1.6 release).

Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.
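Since there is no bind-variable API, the common workaround is plain Scala string interpolation, with the caveats above (Spark still re-parses and re-plans on every call). A minimal sketch; the table `src`, column `col1`, the value, and the `spark` session are assumptions for illustration:

```scala
// Workaround sketch: substitute the value via Scala string interpolation.
// This is plain string construction, NOT a true bind variable -- Spark
// re-parses the query and rebuilds the execution plan on every call.
val val1 = "some_value"                                // hypothetical value
val query = s"SELECT * FROM src WHERE col1 = '$val1'"  // assumes a table `src`

// spark.sql(query).collect().foreach(println)         // assumes a SparkSession `spark`
```

Beware that naive interpolation like this is also vulnerable to SQL injection if the value comes from untrusted input.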

answered Oct 05 '22 by Tagar

I verified this in both the Spark 2.x shell and Thrift (beeline). I was able to bind a variable in a Spark SQL query with the set command.

Query without bind variable:

select count(1) from mytable; 

Query with a bound variable (parameterized):

1. Spark SQL shell

 set key_tbl=mytable; -- setting mytable to key_tbl to use as ${key_tbl}
 select count(1) from ${key_tbl};

2. Spark shell

spark.sql("set key_tbl=mytable")
spark.sql("select count(1) from ${key_tbl}").collect()

With or without the bound parameter, the query returns an identical result.

Note: don't put quotes around the value of the key, since it's a table name here.
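For intuition, the set-command mechanism above is a textual substitution that happens before the query reaches the SQL parser. A hypothetical, simplified model of that behavior (not Spark's actual implementation):

```scala
// Hypothetical simplified model of ${var} substitution: every ${key}
// occurrence in the query text is replaced by its value before the
// SQL parser ever sees the query.
def substitute(query: String, vars: Map[String, String]): String =
  vars.foldLeft(query) { case (q, (key, value)) =>
    q.replace("${" + key + "}", value)
  }

val vars = Map("key_tbl" -> "mytable")
val resolved = substitute("select count(1) from ${key_tbl}", vars)
// resolved == "select count(1) from mytable"
```

Because this is text replacement rather than true parameter binding, Spark still parses and plans the resolved query from scratch each time.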

Let me know if there are any questions.

answered Oct 05 '22 by 5 revs