pandas groupby will by default sort. But I'd like to change the sort order. How can I do this?
I'm guessing that I can't apply a sort method to the returned groupby object.
To group Pandas dataframe, we use groupby(). To sort grouped dataframe in descending order, use sort_values(). The size() method is used to get the dataframe size.
To group Pandas dataframe, we use groupby(). To sort grouped dataframe in ascending or descending order, use sort_values(). The size() method is used to get the dataframe size.
Groupby preserves the order of rows within each group. When calling apply, add group keys to index to identify pieces. Reduce the dimensionality of the return type if possible, otherwise return a consistent type.
You can sort pandas DataFrame by one or multiple (one or more) columns using sort_values() method and by ascending or descending order. To specify the order, you have to use ascending boolean property; False for descending and True for ascending. By default, it is set to True.
Do your groupby, and use reset_index() to make it back into a DataFrame. Then sort.
grouped = df.groupby('mygroups').sum().reset_index()
grouped.sort_values('mygroups', ascending=False)
As of Pandas 0.18 one way to do this is to use the sort_index
method of the grouped data.
Here's an example:
np.random.seed(1)
n=10
df = pd.DataFrame({'mygroups' : np.random.choice(['dogs','cats','cows','chickens'], size=n),
'data' : np.random.randint(1000, size=n)})
grouped = df.groupby('mygroups', sort=False).sum()
grouped.sort_index(ascending=False)
print grouped
data
mygroups
dogs 1831
chickens 1446
cats 933
As you can see, the groupby column is sorted descending now, indstead of the default which is ascending.
Similar to one of the answers above, but try adding .sort_values()
to your .groupby()
will allow you to change the sort order. If you need to sort on a single column, it would look like this:
df.groupby('group')['id'].count().sort_values(ascending=False)
ascending=False
will sort from high to low, the default is to sort from low to high.
*Careful with some of these aggregations. For example .size() and .count() return different values since .size() counts NaNs.
What is the difference between size and count in pandas?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With