How do you convert a timestamp column to epoch seconds?
// In spark-shell, where sc and the spark implicits are already in scope:
var df = sc.parallelize(Seq("2018-07-01T00:00:00Z")).toDF("date_string")
// Cast the ISO-8601 string to a Spark timestamp
df = df.withColumn("timestamp", $"date_string".cast("timestamp"))
df.show(false)
DataFrame:
+--------------------+---------------------+
|date_string |timestamp |
+--------------------+---------------------+
|2018-07-01T00:00:00Z|2018-07-01 00:00:00.0|
+--------------------+---------------------+
If you have a timestamp, you can cast it to a long to get the epoch seconds:
df = df.withColumn("epoch_seconds", $"timestamp".cast("long"))
df.show(false)
DataFrame:
+--------------------+---------------------+-------------+
|date_string |timestamp |epoch_seconds|
+--------------------+---------------------+-------------+
|2018-07-01T00:00:00Z|2018-07-01 00:00:00.0|1530403200 |
+--------------------+---------------------+-------------+
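As a quick sanity check, 1530403200 converts back to the expected instant with plain java.time (a small sketch, independent of Spark):
import java.time.Instant

// 1530403200 seconds after the Unix epoch is midnight UTC on 2018-07-01
println(Instant.ofEpochSecond(1530403200L)) // prints 2018-07-01T00:00:00Z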
Use unix_timestamp from org.apache.spark.sql.functions. It can take a timestamp column, or a string column for which a format can be specified. From the documentation:
public static Column unix_timestamp(Column s)
Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds), using the default timezone and the default locale, return null if fail.
public static Column unix_timestamp(Column s, String p)
Convert time string with given pattern (see http://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html) to Unix time stamp (in seconds), return null if fail.
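Note the "default timezone" above: the no-pattern variant parses using the session timezone, so results can differ between environments. If that matters, one option (a sketch, assuming Spark 2.2+, where the spark.sql.session.timeZone config is available) is to pin it explicitly:
// Pin the session timezone so unix_timestamp behaves the same everywhere
spark.conf.set("spark.sql.session.timeZone", "UTC")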
Use as follows:
import org.apache.spark.sql.functions._
df.withColumn("epoch_seconds", unix_timestamp($"timestamp"))
or if the column is a string with other format:
df.withColumn("epoch_seconds", unix_timestamp($"date_string", "yyyy-MM-dd'T'HH:mm:ss'Z'")))