 

Build Spark SQL query dynamically

How can we pass a column name and an operator dynamically to a Spark SQL query in Scala?

I tried (unsuccessfully) the following:

spark.sql("set key_tbl=mytable")
spark.sql("select count(1) from ${key_tbl}").collect()
asked Feb 25 '18 by user6559130



1 Answer

You can pass it as a parameter using Scala string interpolation, as shown below:

// Table name held in a Scala variable
val param = "tableName"
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// The s string interpolator substitutes the Scala variable into the query text
sqlContext.sql(s"""SELECT * FROM $param""")

You can check this link for more details: https://forums.databricks.com/questions/115/how-do-i-pass-parameters-to-my-sql-statements.html
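The question also asks about passing a column name and an operator dynamically. Here is a minimal sketch of the same string-interpolation idea extended to those parts; it assumes a SparkSession named spark and a registered table, and the table, column, operator, and value below are hypothetical placeholders:

val table = "mytable"   // hypothetical registered table
val column = "age"      // hypothetical column name
val op = ">"            // operator, ideally validated against a whitelist
val value = 30

// Interpolate every piece into the SQL text. Because this is plain string
// building, validate each piece first: raw interpolation is open to SQL
// injection if any value comes from user input.
spark.sql(s"SELECT count(1) FROM $table WHERE $column $op $value").show()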

answered Sep 24 '22 by arjunsv3691