
How to remove milliseconds from a timestamp in Spark SQL

I want to remove the milliseconds part when selecting the column through Spark SQL.

Ex: 2012-10-17 13:02:50.320

I want the result as 2012-10-17 13:02:50. I tried:

spark.sql("select cast(datecol as timestamp) from table")
spark.sql("select unix_timestamp(datecol, 'yyyy-MM-dd HH:mm:ss') from table")

Neither seems to work. Substring works, but I need the result as a timestamp. Is there another way to do it?

Thanks in advance

Babu asked Sep 21 '17 20:09


2 Answers

For everyone who is looking for a solution with Spark DataFrame methods: if your column is of type Timestamp and not String, you can use the date_trunc("second", column) function (available since Spark 2.3):

import org.apache.spark.sql.functions.{col, date_trunc}

// remove milliseconds of datetime column
val df2 = df.withColumn("datetime", date_trunc("second", col("datetime")))
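
Since the question asks for Spark SQL specifically, the same truncation can probably be written as a SQL expression too. The following is only a sketch under assumptions: the view name `events`, the object name `TruncateMillisSql`, and the local session are made up for illustration, and the column is assumed to be a string as in the question, so it is cast to a timestamp first.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object TruncateMillisSql {
  // Registers a one-row temp view named `events` (hypothetical name) and
  // returns the column truncated to whole seconds, still typed as timestamp.
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq("2012-10-17 13:02:50.320").toDF("datecol").createOrReplaceTempView("events")
    // Cast the string to a timestamp, then drop everything below the second.
    spark.sql(
      "SELECT date_trunc('second', CAST(datecol AS timestamp)) AS datecol FROM events")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in a real job, reuse the existing session.
    val spark = SparkSession.builder().master("local[1]").appName("truncate-millis").getOrCreate()
    val result = run(spark)
    result.printSchema() // shows the type of datecol
    result.show(false)
    spark.stop()
  }
}
```

If the column is already a timestamp, the CAST should be unnecessary and date_trunc can be applied to it directly.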
Salim answered Nov 19 '22 04:11


Since your timestamp value is a string that you are casting to a timestamp, you can use the substring function.

Second option:

spark.sql("select from_unixtime(unix_timestamp(datecol, 'yyyy-MM-dd HH:mm:ss.SSS'),'yyyy-MM-dd HH:mm:ss') from table")

You were not providing the input format, which may be why you are getting the error.

I hope this works.
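
One caveat with this option: from_unixtime returns a string, not a timestamp, so the result may still need a cast if a timestamp column is required. A minimal sketch of that extra step, assuming a hypothetical temp view `events` with a string column `datecol` (the object name `UnixTimeRoundTrip` is also made up for illustration):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UnixTimeRoundTrip {
  // Parses the string with milliseconds, formats it back without them,
  // then casts the formatted string to a timestamp column.
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq("2012-10-17 13:02:50.320").toDF("datecol").createOrReplaceTempView("events")
    spark.sql(
      """SELECT CAST(
        |         from_unixtime(unix_timestamp(datecol, 'yyyy-MM-dd HH:mm:ss.SSS'),
        |                       'yyyy-MM-dd HH:mm:ss')
        |       AS timestamp) AS datecol
        |FROM events""".stripMargin)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in a real job, reuse the existing session.
    val spark = SparkSession.builder().master("local[1]").appName("unix-time-round-trip").getOrCreate()
    val result = run(spark)
    result.printSchema() // shows the type of datecol
    result.show(false)
    spark.stop()
  }
}
```

unix_timestamp works in whole seconds, which is what drops the fractional part here; the outer CAST only restores the timestamp type.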

Thanks, Manu

Manu Gupta answered Nov 19 '22 05:11