
Use of lit() in expr()

The line:

df.withColumn("test", expr("concat(lon, lat)")) 

works as expected but

df.withColumn("test", expr("concat(lon, lit(','), lat)"))

produces the following exception:

org.apache.spark.sql.AnalysisException: Undefined function: 'lit'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 12
    at org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions$$anonfun$apply$15$$anonfun$applyOrElse$49.apply(Analyzer.scala:1198)

Why? And what would be the workaround?

asked Nov 08 '18 02:11 by Kyunam



1 Answer

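The string argument to expr is parsed as a SQL expression and used to construct a column. lit is not a SQL function; it belongs to the DataFrame API (org.apache.spark.sql.functions), where it turns a literal value into a Column. The SQL parser therefore cannot resolve it, which is what the "Undefined function: 'lit'" error reports.

To fix this, drop lit and write the separator as an ordinary SQL string literal: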

df.withColumn("test", expr("concat(lon, ',', lat)")) 

Or use the built-in Spark concat function directly, without expr:

df.withColumn("test", concat($"lon", lit(","), $"lat"))

Because this version of concat takes Column arguments rather than a SQL string, the separator must be wrapped in lit here.
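For anyone who wants to try both variants side by side, here is a minimal sketch. The object name, the local SparkSession settings, and the sample rows are hypothetical, and lon and lat are assumed to be string columns (the question does not say what types they are). It uses col("lon") instead of $"lon" so the helper methods need no implicits import.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat, expr, lit}

// Hypothetical wrapper object, for illustration only.
object LitInExprSketch {
  // Variant 1: inside expr, the separator is a plain SQL string literal.
  def viaExpr(df: DataFrame): DataFrame =
    df.withColumn("test", expr("concat(lon, ',', lat)"))

  // Variant 2: in the Column API, the separator is wrapped in lit.
  def viaColumns(df: DataFrame): DataFrame =
    df.withColumn("test", concat(col("lon"), lit(","), col("lat")))

  def main(args: Array[String]): Unit = {
    // Assumed local session; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lit-in-expr")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with string-typed lon/lat columns.
    val df = Seq(("12.5", "41.9"), ("2.35", "48.85")).toDF("lon", "lat")

    viaExpr(df).show()
    viaColumns(df).show()

    spark.stop()
  }
}
```

Both calls should produce the same test column, since the SQL literal ',' and lit(",") describe the same constant value; they only differ in which API is doing the parsing.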

answered Oct 25 '22 06:10 by Shaido