How to pass variables in Spark SQL, using Python?

I am writing Spark code in Python. How do I pass a variable into a spark.sql query?

    q25 = 500
    Q1 = spark.sql("SELECT col1 from table where col2>500 limit $q25 , 1")

Currently the above code does not work. How do we pass variables?

I have also tried:

    Q1 = spark.sql("SELECT col1 from table where col2>500 limit q25='{}' , 1".format(q25))
Viv asked Jun 16 '17 06:06



4 Answers

You need to remove the single quotes and the q25= prefix around the placeholder, like this:

    Q1 = spark.sql("SELECT col1 from table where col2>500 limit {}, 1".format(q25))

Update:

Based on your new query:

    spark.sql("SELECT col1 from table where col2>500 order by col1 desc limit {}, 1".format(q25))

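Note that Spark SQL did not support OFFSET when this was written, and it does not accept the MySQL-style "limit offset, count" form, so this query cannot work as written. (An OFFSET clause was added in later releases, Spark 3.4 as far as I know.)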
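One possible workaround for the missing offset, sketched under assumptions: number the rows with the row_number() window function and keep the row at position q25 + 1. The helper nth_row_query and the table name my_table are hypothetical. A window without PARTITION BY pulls all rows into a single partition, so this only suits modest result sizes.

```python
def nth_row_query(offset):
    """Build a Spark SQL query that skips `offset` rows and returns the next one.

    Hypothetical helper: `my_table`, `col1`, and `col2` stand in for real names.
    """
    offset = int(offset)  # coerce to int so only a number reaches the SQL text
    return (
        "SELECT col1 FROM ("
        "SELECT col1, row_number() OVER (ORDER BY col1 DESC) AS rn "
        "FROM my_table WHERE col2 > 500"
        ") ranked WHERE rn = {}".format(offset + 1)
    )


q25 = 500
query = nth_row_query(q25)
# With an active SparkSession named `spark` (assumed, not created here):
# Q1 = spark.sql(query)
```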

If you need to pass multiple variables, you can try it this way:

    q25 = 500
    var2 = 50
    Q1 = spark.sql("SELECT col1 from table where col2>{0} limit {1}".format(var2, q25))
Tiny.D answered Sep 30 '22 15:09


Another option, if you do this sort of thing often or want your code to be easier to reuse, is to keep the variables in a dict of configuration values and unpack it into str.format():

    configs = {"q25": 10,
               "TABLE_NAME": "my_table",
               "SCHEMA": "my_schema"}
    Q1 = spark.sql("""SELECT col1 from {SCHEMA}.{TABLE_NAME}
                      where col2>500
                      limit {q25}
                   """.format(**configs))
David Maddox answered Sep 30 '22 15:09


A really easy solution is to store the query as a string (using the usual Python formatting), and then pass it to the spark.sql() function:

    q25 = 500
    query = "SELECT col1 from table where col2>500 limit {}".format(q25)
    Q1 = spark.sql(query)
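On a recent PySpark you may not need string formatting at all. spark.sql() should accept an args dictionary for named parameter markers (:name) from Spark 3.4, with plain Python values allowed from 3.5; treat that as an assumption to check against your version's documentation. A minimal sketch under that assumption, where run_threshold_query, my_table, and min_col2 are hypothetical names and spark is an existing SparkSession passed in by the caller:

```python
def run_threshold_query(spark, min_col2):
    """Run a query with `min_col2` bound as a named parameter.

    Assumes PySpark 3.5+ where spark.sql() takes `args`; `my_table` is hypothetical.
    """
    query = "SELECT col1 FROM my_table WHERE col2 > :min_col2"
    # The value is bound by Spark rather than pasted into the SQL text.
    return spark.sql(query, args={"min_col2": min_col2})
```

Usage would look like Q1 = run_threshold_query(spark, 500). Parameter markers in a LIMIT clause may depend on the Spark version, so this sketch binds the WHERE threshold instead.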
user6386471 answered Sep 30 '22 16:09


In Scala, all you need to do is add the s prefix (the string interpolator) to the string literal. This lets you use variables directly inside the string.

    val q25 = 10
    val Q1 = spark.sql(s"SELECT col1 from table where col2>500 limit $q25")
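Since the question is about Python, the closest equivalent there is an f-string (Python 3.6+), which interpolates variables inline in much the same way. A small sketch, where my_table is a hypothetical table name and spark is assumed to be an existing SparkSession:

```python
q25 = 10

# The f prefix plays the role of Scala's s interpolator;
# int() makes sure only a number ends up in the SQL text.
query = f"SELECT col1 FROM my_table WHERE col2 > 500 LIMIT {int(q25)}"

# With an active SparkSession named `spark` (assumed, not created here):
# Q1 = spark.sql(query)
```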
Deepesh Kumar answered Sep 30 '22 15:09