
Convert date from String to Date format in Dataframes

I am trying to convert a column from String format to Date format using the to_date function, but it returns null values.

    df.createOrReplaceTempView("incidents")
    spark.sql("select Date from incidents").show()

    +----------+
    |      Date|
    +----------+
    |08/26/2016|
    |08/26/2016|
    |08/26/2016|
    |06/14/2016|

    spark.sql("select to_date(Date) from incidents").show()

    +---------------------------+
    |to_date(CAST(Date AS DATE))|
    +---------------------------+
    |                       null|
    |                       null|
    |                       null|
    |                       null|

The Date column is in String format:

 |-- Date: string (nullable = true) 
Ishan Kumar asked Nov 23 '16 11:11



2 Answers

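Use to_date with a Java SimpleDateFormat pattern. Called on its own, to_date casts the string to DATE, which expects yyyy-MM-dd, so MM/dd/yyyy values come back as null. Parse the string with UNIX_TIMESTAMP and the matching pattern first, cast the result to TIMESTAMP, then apply TO_DATE: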

TO_DATE(CAST(UNIX_TIMESTAMP(date, 'MM/dd/yyyy') AS TIMESTAMP)) 

Example:

    spark.sql("""
      SELECT TO_DATE(CAST(UNIX_TIMESTAMP('08/26/2016', 'MM/dd/yyyy') AS TIMESTAMP)) AS newdate"""
    ).show()

    +----------+
    |   newdate|
    +----------+
    |2016-08-26|
    +----------+
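To illustrate what the pattern letters mean, here is a minimal plain-Scala sketch that uses java.text.SimpleDateFormat directly, outside Spark. The object name PatternDemo and the helpers parse and toIso are hypothetical, introduced only for this illustration. Spark's own parsing differs in detail, so this shows how the pattern is read and not how Spark is implemented.

```scala
import java.text.SimpleDateFormat
import scala.util.Try

// Hypothetical helper object, for illustration only.
object PatternDemo {
  // Parses `value` strictly against `pattern`; None when the text does not match.
  def parse(value: String, pattern: String): Option[java.util.Date] = {
    val fmt = new SimpleDateFormat(pattern)
    fmt.setLenient(false)
    Try(fmt.parse(value)).toOption
  }

  // Parses with `pattern` and re-renders the result as yyyy-MM-dd.
  def toIso(value: String, pattern: String): Option[String] =
    parse(value, pattern).map(d => new SimpleDateFormat("yyyy-MM-dd").format(d))

  def main(args: Array[String]): Unit = {
    // Pattern matches the text: month/day/year
    println(toIso("08/26/2016", "MM/dd/yyyy")) // Some(2016-08-26)
    // Pattern does not match the text, so parsing fails
    println(toIso("08/26/2016", "yyyy-MM-dd")) // None
  }
}
```

MM is the month, dd the day of the month, and yyyy the year; the letters are case sensitive, so mm (minutes) or DD (day of year) would mean something else.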
5 revs, 4 users 81% user6022341, answered Oct 19 '22 21:10


I solved the same problem without the temp table/view and with dataframe functions.

However, I found that only one format works with this approach: yyyy-MM-dd.

For example:

    // Outside spark-shell, toDF also needs: import spark.implicits._
    import org.apache.spark.sql.functions.col

    val df = sc.parallelize(Seq("2016-08-26")).toDF("Id")
    // Cast the yyyy-MM-dd string directly to timestamp and to date
    val df2 = df.withColumn("Timestamp", col("Id").cast("timestamp"))
    val df3 = df2.withColumn("Date", col("Id").cast("date"))

    df3.printSchema

    root
     |-- Id: string (nullable = true)
     |-- Timestamp: timestamp (nullable = true)
     |-- Date: date (nullable = true)

    df3.show

    +----------+--------------------+----------+
    |        Id|           Timestamp|      Date|
    +----------+--------------------+----------+
    |2016-08-26|2016-08-26 00:00:...|2016-08-26|
    +----------+--------------------+----------+

Because the string carries no time component, the timestamp has 00:00:00.0 as its time value.
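For strings that are not in yyyy-MM-dd, such as the MM/dd/yyyy values in the question, the unix_timestamp approach from the other answer can also be written with DataFrame functions. The following is a minimal sketch, assuming a local SparkSession; the object name ParseUsDates, the helper withParsedDate and the column name ParsedDate are hypothetical and not part of either answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, to_date, unix_timestamp}

// Hypothetical helper object, for illustration only.
object ParseUsDates {
  // Adds a DateType column `target`, parsed from the MM/dd/yyyy strings in `source`.
  def withParsedDate(df: DataFrame, source: String, target: String): DataFrame =
    df.withColumn(
      target,
      // Parse to epoch seconds with the pattern, cast to timestamp, keep the date part
      to_date(unix_timestamp(col(source), "MM/dd/yyyy").cast("timestamp"))
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for the demonstration
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("parse-us-dates")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("08/26/2016", "06/14/2016").toDF("Date")
    val parsed = withParsedDate(df, "Date", "ParsedDate")

    parsed.printSchema()
    parsed.show()

    spark.stop()
  }
}
```

On Spark 2.2 and later, to_date also accepts a format argument, as in to_date(col("Date"), "MM/dd/yyyy"), which should make the intermediate timestamp unnecessary.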

V. Samma answered Oct 19 '22 21:10