I have a DF containing timestamps:
0 2005-08-31 16:39:40
1 2005-12-28 16:00:34
2 2005-10-21 17:52:10
3 2014-01-28 12:23:15
4 2014-01-28 12:23:15
5 2011-02-04 18:32:34
6 2011-02-04 18:32:34
7 2011-02-04 18:32:34
I would like to extract the year from each timestamp, creating additional column in the DF that would look like:
0 2005-08-31 16:39:40 2005
1 2005-12-28 16:00:34 2005
2 2005-10-21 17:52:10 2005
3 2014-01-28 12:23:15 2014
4 2014-01-28 12:23:15 2014
5 2011-02-04 18:32:34 2011
6 2011-02-04 18:32:34 2011
7 2011-02-04 18:32:34 2011
Obviously I can go over all DF entries stripping off the first 4 characters of the date. Which is very slow. I wonder if there is a fast python-way to do this. I saw that it's possible to convert the column into the datetime format by DF = pd.to_datetime(DF,'%Y-%m-%d %H:%M:%S') but when I try to then apply datetime.datetime.year(DF) it doesn't work. I will also need to parse the timestamps to months and combinations of years-months and so on... Help please. Thanks.
No need to apply a function for each row there is a new datetime accessor you can call to access the year property:
In [35]:
df1['year'] = df1['timestamp'].dt.year
df1
Out[35]:
timestamp year
0 2005-08-31 16:39:40 2005
1 2005-12-28 16:00:34 2005
2 2005-10-21 17:52:10 2005
3 2014-01-28 12:23:15 2014
4 2014-01-28 12:23:15 2014
5 2011-02-04 18:32:34 2011
6 2011-02-04 18:32:34 2011
7 2011-02-04 18:32:34 2011
If your timestamps are str then you can convert to datetime64 using pd.to_dateime
:
df['timestamp'] = pd.to_datetime(df['timestamp'])
You can access the months and other attributes using dt
like the above.
For version prior to 0.15.0
you can perform the following:
df1['year'] = df1['timestamp'].apply(lambda x: x.year)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With