Good Evening,
I have a pandas DataFrame with a column of dates in the following format:
print(df["date"])
14/01/18 12:47
14/01/18 12:48
14/01/18 12:50
14/01/18 12:57
14/01/18 12:57
14/01/18 12:57
14/01/18 12:57
14/01/18 12:57
14/01/18 12:58
Specifically, I would like to:
1. Convert it to datetime, using pd.to_datetime
2. Create the following additional columns:
df["month"]
df["day"]
df["year"]
df["hour"]
df["minute"]
I tried to run:
df['date'] = pd.to_datetime(df['date'], format = "%d/%m/%Y %H/%M" )
But the following error appears:
time data '02/01/18 08:41' does not match format '%d/%m/%Y %H/%M' (match)
dayfirst=True is not strict, but will prefer to parse with day first. If a delimited date string cannot be parsed in accordance with the given dayfirst option, e.g. to_datetime(['31-12-2021']), then a warning will be shown.
yearfirst: bool, default False. Specify a date parse order if arg is str or is list-like.
The dt.date attribute returns the date part of the underlying data of the given Series object.
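As a rough illustration of the dayfirst option described above, the sketch below parses two strings in the question's layout without an explicit format. The variable names raw and parsed are only for illustration, and some pandas versions may emit a warning here because no single format can be inferred; passing an explicit format is usually the more predictable route.

```python
import pandas as pd

# Two values in the question's layout: day/month/two-digit year, then hour:minute.
raw = pd.Series(["02/01/18 08:41", "14/01/18 12:47"])

# dayfirst=True asks the parser to prefer day-before-month for ambiguous
# strings such as "02/01/18" (read here as 2 January, not 1 February).
parsed = pd.to_datetime(raw, dayfirst=True)

print(parsed)
```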
The format you want is '%d/%m/%y %H:%M' (lowercase y, because the year has only two digits, and a colon rather than a slash between hour and minute). Take a look here.
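To make the difference concrete, here is a minimal sketch that uses a few values copied from the question. The flag original_format_failed is a made-up name used only to show that the first attempt raises an error.

```python
import pandas as pd

# Sample values in the same layout as the question's column.
df = pd.DataFrame({"date": ["14/01/18 12:47", "14/01/18 12:48", "02/01/18 08:41"]})

# The attempt from the question: %Y expects a four-digit year and the
# format puts a slash between hour and minute, so parsing fails.
try:
    pd.to_datetime(df["date"], format="%d/%m/%Y %H/%M")
    original_format_failed = False
except ValueError:
    original_format_failed = True

# Corrected format: %y for the two-digit year, ':' between hour and minute.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%y %H:%M")

print(original_format_failed)
print(df["date"])
```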
Then you can create the other columns:
df['month'] = df['date'].apply(lambda x: x.month)
df['day'] = df['date'].apply(lambda x: x.day)
df['year'] = df['date'].apply(lambda x: x.year)
df['hour'] = df['date'].apply(lambda x: x.hour)
df['minute'] = df['date'].apply(lambda x: x.minute)
As an alternative to grovina's answer: instead of using apply, you can use the dt accessor directly.
Here's a sample:
>>> data = [['2017-12-01'], ['2017-12-30'],['2018-01-01']]
>>> df = pd.DataFrame(data=data, columns=['date'])
>>> df
date
0 2017-12-01
1 2017-12-30
2 2018-01-01
>>> df.date
0 2017-12-01
1 2017-12-30
2 2018-01-01
Name: date, dtype: object
Note how df.date has dtype object? Let's turn it into a datetime like you want
>>> df.date = pd.to_datetime(df.date)
>>> df.date
0 2017-12-01
1 2017-12-30
2 2018-01-01
Name: date, dtype: datetime64[ns]
The format in your question is a string format. A datetime64 column stores the date values themselves rather than a display format, so I don't think you'll be able to make the actual datetime64 column look like that. For now, let's make a newly formatted string version of your date in a separate column
>>> df['new_formatted_date'] = df.date.dt.strftime('%d/%m/%y %H:%M')
>>> df.new_formatted_date
0 01/12/17 00:00
1 30/12/17 00:00
2 01/01/18 00:00
Name: new_formatted_date, dtype: object
Finally, since the df.date column now has dtype datetime64, you can use the dt accessor right on it. No need to use apply
>>> df['month'] = df.date.dt.month
>>> df['day'] = df.date.dt.day
>>> df['year'] = df.date.dt.year
>>> df['hour'] = df.date.dt.hour
>>> df['minute'] = df.date.dt.minute
>>> df
        date new_formatted_date  month  day  year  hour  minute
0 2017-12-01     01/12/17 00:00     12    1  2017     0       0
1 2017-12-30     30/12/17 00:00     12   30  2017     0       0
2 2018-01-01     01/01/18 00:00      1    1  2018     0       0
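The sample above uses ISO-style dates with no time part, so hour and minute come out as 0. As a sketch of the same dt accessor approach on strings shaped like the ones in the question (two values copied from it), something like this should fill in all five columns:

```python
import pandas as pd

# Two values copied from the question's column.
df = pd.DataFrame({"date": ["14/01/18 12:47", "14/01/18 12:58"]})

# Parse with an explicit format: day/month/two-digit year, hour:minute.
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%y %H:%M")

# Vectorised component extraction through the dt accessor, no apply needed.
df["month"] = df["date"].dt.month
df["day"] = df["date"].dt.day
df["year"] = df["date"].dt.year
df["hour"] = df["date"].dt.hour
df["minute"] = df["date"].dt.minute

print(df)
```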