
pandas dataframe sort by date

I created a DataFrame by importing a CSV file, converted the Date column to datetime, and made it the index. However, sorting the index doesn't produce the result I expected:

print(df.head())
df['Date'] = pd.to_datetime(df['Date'])
df.index = df['Date']
del df['Date']
df.sort_index()
print(df.head())

Here's the result:

         Date     Last
0  2016-12-30  1.05550
1  2016-12-29  1.05275
2  2016-12-28  1.04610
3  2016-12-27  1.05015
4  2016-12-23  1.05005
               Last
Date               
2016-12-30  1.05550
2016-12-29  1.05275
2016-12-28  1.04610
2016-12-27  1.05015
2016-12-23  1.05005

The dates actually go back to 1999, so if I sort by date, the data should be shown in ascending order, right?

ajax2000 asked Jan 02 '17 21:01



1 Answer

Just expanding on MaxU's correct answer: you used the correct method, but, as with many other pandas methods, sort_index() returns a new DataFrame rather than modifying the original, so you have to store the result for the change to take effect. As MaxU already suggested, you can do this by assigning the output of the method back to the same variable, e.g.:

df = df.sort_index()

or by passing the parameter inplace=True, which sorts the DataFrame in place, without the need to reassign the variable:

df.sort_index(inplace=True)

However, in my experience, I often feel "safer" using the first option. It also looks clearer and more consistent, since not all methods offer inplace. But it all comes down to scripting style, I guess...
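To make the difference concrete, here is a minimal sketch of both options. The sample rows are copied from the output in the question; the variable names (unchanged_first, sorted_df, inplace_df) are only illustrative and are not part of the original code.

```python
import pandas as pd

# Sample data mirroring the question's layout (newest row first).
df = pd.DataFrame(
    {
        "Date": ["2016-12-30", "2016-12-29", "2016-12-28", "2016-12-27", "2016-12-23"],
        "Last": [1.05550, 1.05275, 1.04610, 1.05015, 1.05005],
    }
)
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

# Calling sort_index() and discarding the result leaves df untouched.
df.sort_index()
unchanged_first = df.index[0]  # still the newest date

# Option 1: keep the sorted DataFrame that sort_index() returns.
sorted_df = df.sort_index()

# Option 2: sort in place (done on a copy here so both results can be compared).
inplace_df = df.copy()
inplace_df.sort_index(inplace=True)

print(sorted_df.head())
```

Both options should give the same ascending order; the only difference is whether the original variable is modified or a new one is created.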

Marjan Moderc answered Sep 18 '22 08:09