Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to sort by timestamps in pandas?

So, I have timestamps that look like the following:

20140804:10:00:13.281486

20140804:10:00:13.400113

20140804:10:00:13.555512

20140804:10:00:13.435677

I have them in a DataFrame and I am trying to sort them in an ascending order. I have tried the following. But, it doesn't seem to work

df['yyyymmdd'] = pd.to_numeric(df['yyyymmdd'], errors='coerce')

df['hh'] = pd.to_numeric(df['hh'], errors='coerce')

df['mm'] = pd.to_numeric(df['mm'], errors='coerce')

df['ss'] = pd.to_numeric(df['ss'], errors='coerce')

df=df.sort(['yyyymmdd', 'hh','mm','ss'], ascending=[True, True,True,True])

Any help is appreciated.

like image 707
J Doe Avatar asked Feb 25 '17 23:02

J Doe


People also ask

How do you sort a timestamp in python?

To sort a Python date string list using the sort function, you'll have to convert the dates in objects and apply the sort on them. For this you can use the key named attribute of the sort function and provide it a lambda that creates a datetime object for each date and compares them based on this date object.

How do I sort datetime in pandas?

One thing to notice here is our DataFrame gets sorted in ascending order of dates, to sort the DataFrame in descending order we can pass an additional parameter inside the sort_values() function that will set ascending value to False and will return the DataFrame in descending order.

How do I compare timestamps in pandas?

Comparison between pandas timestamp objects is carried out using simple comparison operators: >, <,==,< = , >=. The difference can be calculated using a simple '–' operator. Given time can be converted to pandas timestamp using pandas. Timestamp() method.

Is timestamp the same as datetime pandas?

Timestamp is the pandas equivalent of python's Datetime and is interchangeable with it in most cases. It's the type used for the entries that make up a DatetimeIndex, and other timeseries oriented data structures in pandas.


1 Answers

You just have to ensure you denote the format specification properly, and you can use pd.to_datetime to convert them to actual datetimes before sort_values.

pd.to_datetime(stamps, format="%Y%m%d:%H:%M:%S.%f").sort_values()

This is much more direct than decomposing the timestamps in components and performing a multiple-criteria sort as you were attempting.

Demo

>>> stamps
0    20140804:10:00:13.281486
1    20140804:10:00:13.400113
2    20140804:10:00:13.555512
3    20140804:10:00:13.435677
dtype: object

>>> pd.to_datetime(stamps, format="%Y%m%d:%H:%M:%S.%f").sort_values()
0   2014-08-04 10:00:13.281486
1   2014-08-04 10:00:13.400113
3   2014-08-04 10:00:13.435677
2   2014-08-04 10:00:13.555512
dtype: datetime64[ns]
like image 74
miradulo Avatar answered Nov 12 '22 07:11

miradulo