I'm new to Spark SQL and am trying to convert a string to a timestamp in a Spark DataFrame. I have a string that looks like '2017-08-01T02:26:59.000Z' in a column called time_string.
My code to convert this string to a timestamp is:
CAST (time_string AS Timestamp)
But this gives me a timestamp of 2017-07-31 19:26:59
Why is it changing the time? Is there a way to do this without changing the time?
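For reference, roughly what I'm running (the DataFrame name below is just for illustration):
import spark.implicits._   // spark is the SparkSession; auto-imported in spark-shell
val df = Seq("2017-08-01T02:26:59.000Z").toDF("time_string")
df.selectExpr("CAST(time_string AS timestamp) AS ts").show(false)
// in my session this shows 2017-07-31 19:26:59, not 02:26:59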
Thanks for any help!
Syntax – to_timestamp(): this function has two signatures. The single-argument form expects the string to already be in the default timestamp format yyyy-MM-dd HH:mm:ss (optionally with a fractional-seconds part); if the string is not in that format, it returns null. The two-argument form lets you pass the pattern explicitly. Related: see the Spark SQL Date and Timestamp Functions reference for all date and time functions.
To convert to TimestampType, apply to_timestamp(col, pattern), e.g. to_timestamp(timestamp, 'yyyy/MM/dd HH:mm:ss'). Make sure the pattern matches your column value exactly, otherwise the result is null. You can then apply date_format to render the timestamp in whatever output format you need.
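For example, a minimal sketch for the string in the question, assuming a DataFrame df with the time_string column (the df and event_ts names are illustrative):
import org.apache.spark.sql.functions.{to_timestamp, date_format}
import spark.implicits._   // spark is the SparkSession; auto-imported in spark-shell
// parse the ISO-8601 string with an explicit pattern; rows that don't match become null
val withTs = df.withColumn("event_ts", to_timestamp($"time_string", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"))
// optionally render the timestamp back to a string in another layout
withTs.select(date_format($"event_ts", "yyyy/MM/dd HH:mm:ss")).show(false)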
You could use the unix_timestamp function to convert the UTC-formatted date string to a timestamp:
import org.apache.spark.sql.functions.unix_timestamp
import org.apache.spark.sql.types.TimestampType
import spark.implicits._   // spark is the SparkSession; auto-imported in spark-shell

val df2 = Seq(("a3fac", "2017-08-01T02:26:59.000Z")).toDF("id", "eventTime")
// parse with a pattern matching the string, then cast the resulting epoch seconds to a timestamp
df2.withColumn("eventTime1", unix_timestamp($"eventTime", "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").cast(TimestampType)).show(false)
Output:
+-----+------------------------+---------------------+
|id   |eventTime               |eventTime1           |
+-----+------------------------+---------------------+
|a3fac|2017-08-01T02:26:59.000Z|2017-08-01 02:26:59.0|
+-----+------------------------+---------------------+
Hope this helps!