I have a dataframe called train
, he has the following schema :
root
|-- date_time: string (nullable = true)
|-- site_name: integer (nullable = true)
|-- posa_continent: integer (nullable = true)
I want to cast the date_time
column to timestamp
and create a new column with the year
value extracted from the date_time
column.
To be clear, I have the following dataframe :
+-------------------+---------+--------------+
| date_time|site_name|posa_continent|
+-------------------+---------+--------------+
|2014-08-11 07:46:59| 2| 3|
|2014-08-11 08:22:12| 2| 3|
|2015-08-11 08:24:33| 2| 3|
|2016-08-09 18:05:16| 2| 3|
|2011-08-09 18:08:18| 2| 3|
|2009-08-09 18:13:12| 2| 3|
|2014-07-16 09:42:23| 2| 3|
+-------------------+---------+--------------+
I want to get the following dataframe :
+-------------------+---------+--------------+--------+
| date_time|site_name|posa_continent|year |
+-------------------+---------+--------------+--------+
|2014-08-11 07:46:59| 2| 3|2014 |
|2014-08-11 08:22:12| 2| 3|2014 |
|2015-08-11 08:24:33| 2| 3|2015 |
|2016-08-09 18:05:16| 2| 3|2016 |
|2011-08-09 18:08:18| 2| 3|2011 |
|2009-08-09 18:13:12| 2| 3|2009 |
|2014-07-16 09:42:23| 2| 3|2014 |
+-------------------+---------+--------------+--------+
Spark to_date() – Convert String to Date format to_date() – function is used to format string ( StringType ) to date ( DateType ) column. Below code, snippet takes the date in a string and converts it to date format on DataFrame.
Use <em>to_timestamp</em>() function to convert String to Timestamp (TimestampType) in PySpark. The converted time would be in a default format of MM-dd-yyyy HH:mm:ss.
current_timestamp(): This function returns the current timestamp in the apache spark. The default format produced is in yyyy-MM-dd HH:mm:ss.
Well, if you want to cast the date_timecolumn to timestampand create a new column with the year value then do exactly that:
import org.apache.spark.sql.functions.year
df
.withColumn("date_time", $"date_time".cast("timestamp")) // cast to timestamp
.withColumn("year", year($"date_time")) // add year column
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With