I have a column with dates as character in the format 10/17/2017 12:00:00 AM
. I want parse the string and keep only the date part as class Date
, i.e. 2017-10-17
. I am using -
df$ReportDate = as.Date(df$ReportDate, format = "%m/%d/%Y %I:%M:%S %p")
df$ReportDate = as.Date(format(df$ReportDate, "%Y-%m-%d"))
this works, but the dataframe has over 5 million rows so this takes close to two mins.
user system elapsed
104.73 0.55 105.46
Is there a faster and more efficient way to do this?
parse() The Date. parse() method parses a string representation of a date, and returns the number of milliseconds since January 1, 1970, 00:00:00 UTC or NaN if the string is unrecognized or, in some cases, contains illegal date values (e.g. 2015-02-31). Only the ISO 8601 format ( YYYY-MM-DDTHH:mm:ss.
The strftime() method takes one or more format codes as an argument and returns a formatted string based on it. We imported datetime class from the datetime module. It's because the object of datetime class can access strftime() method. The datetime object containing current date and time is stored in now variable.
Note that as.Date
will ignore junk after the date so this takes less than 10 seconds on my not particularly fast laptop:
xx <- rep("10/17/2017 12:00:00 AM", 5000000) # test input
system.time(as.Date(xx, "%m/%d/%Y"))
## user system elapsed
## 9.57 0.20 9.82
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With