So if I have a timestamp in pandas as such:
Timestamp('2014-11-07 00:05:00')
How can I create a new column that just has the 'time' component?
So I want
00:05:00
Currently, I'm using .apply as shown below, but this is slow (my dataframe is a couple million rows), and I'm looking for a faster way.
df['time'] = df['date_time'].apply(lambda x: x.time())
Instead of .apply, I tried using .astype(time), as I noticed .astype operations can be faster than .apply, but that apparently doesn't work on timestamps (AttributeError: 'Timestamp' object has no attribute 'astype')... any ideas?
itertuples iterates over the data frame's rows as namedtuples, which makes it comparatively faster than iterrows. Vectorization is usually the first and best choice. If you do have to iterate, you can convert the data frame to a NumPy array or a dictionary to speed up the loop.
The results show that apply massively outperforms iterrows. As mentioned previously, this is because apply avoids much of the per-row overhead that iterrows incurs (iterrows builds a new Series for every row). While slower than apply, itertuples is quicker than iterrows, so if looping is required, try itertuples instead.
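As a rough illustration of the three approaches mentioned above, here is a minimal sketch. The data frame and its column names (a, b) are made up for the example; it only shows that the approaches give the same result and makes no timing claims.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original post.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Vectorized: operates on whole columns at once, no Python-level loop.
vectorized = (df["a"] + df["b"]).tolist()

# itertuples: yields one lightweight namedtuple per row.
via_itertuples = [row.a + row.b for row in df.itertuples(index=False)]

# iterrows: yields (index, Series) pairs, building a Series for each row.
via_iterrows = [row["a"] + row["b"] for _, row in df.iterrows()]
```

All three lists hold the same sums; on a large data frame the vectorized form is normally the one to reach for first.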
Comparison between pandas Timestamp objects is carried out using the ordinary comparison operators: >, <, ==, <=, >=. The difference between two timestamps can be calculated using the '-' operator. A given time can be converted to a pandas timestamp using the pandas.Timestamp() constructor.
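A small sketch of those operations, using two made-up timestamps (the first is the one from the question, the second is hypothetical):

```python
import pandas as pd

# Build timestamps from strings with the pandas.Timestamp constructor.
earlier = pd.Timestamp("2014-11-07 00:05:00")
later = pd.Timestamp("2014-11-07 13:45:30")  # hypothetical second value

# Ordinary comparison operators work directly on Timestamp objects.
is_before = earlier < later
same_moment = earlier == later

# Subtracting two timestamps yields a pandas Timedelta.
elapsed = later - earlier
```

Here elapsed is a Timedelta of 13 hours, 40 minutes and 30 seconds.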
You want .dt.time; see the docs for some more examples of things under the .dt accessor.
df['date_time'].dt.time
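For completeness, a minimal runnable sketch of assigning the result to a new column. The sample rows are made up, and it assumes the date_time column already has a datetime64 dtype (the .dt accessor raises an AttributeError otherwise):

```python
import datetime

import pandas as pd

# Hypothetical sample data; the column name 'date_time' follows the question.
df = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2014-11-07 00:05:00",
        "2014-11-07 13:45:30",
    ])
})

# Vectorized accessor: gives an object-dtype Series of datetime.time values.
df["time"] = df["date_time"].dt.time

# The row-by-row version from the question, kept only for comparison.
time_via_apply = df["date_time"].apply(lambda x: x.time())
```

Both approaches produce the same datetime.time values; .dt.time just avoids calling a Python lambda once per row.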