Say I have the following DataFrame:
df
A B C D E
z k s 7 d
z k s 6 l
x t r 2 e
x t r 1 x
u c r 8 f
u c r 9 h
y t s 5 l
y t s 2 o
I would like to sort it by column D within each group of rows that share the same values in columns A, B and C.
The expected output would be:
df
A B C D E
z k s 6 l
z k s 7 d
x t r 1 x
x t r 2 e
u c r 8 f
u c r 9 h
y t s 2 o
y t s 5 l
How can I do this kind of operation?
To sort a DataFrame by the values in a single column, use .sort_values(). By default, this returns a new DataFrame sorted in ascending order.
You can sort a pandas DataFrame by column values with the sort_values() method. To specify the order, use the boolean ascending parameter: False for descending and True for ascending. It defaults to True.
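As a minimal sketch of the ascending parameter described above, using a small made-up frame whose column names are borrowed from the question:

```python
import pandas as pd

# Illustrative data only; column names borrowed from the question.
df = pd.DataFrame({"D": [7, 6, 2, 1], "E": ["d", "l", "e", "x"]})

# ascending=True is the default, so it can be omitted.
asc = df.sort_values("D")

# Pass ascending=False for descending order.
desc = df.sort_values("D", ascending=False)

print(asc["D"].tolist())   # [1, 2, 6, 7]
print(desc["D"].tolist())  # [7, 6, 2, 1]
```

Both calls return a new DataFrame; df itself is left unchanged.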
To group a pandas DataFrame, use groupby(). To sort the grouped result in ascending or descending order, use sort_values(). The size() method of a groupby object returns the number of rows in each group.
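A rough illustration of combining groupby(), size() and sort_values(), using made-up data with unequal group sizes (the data is an assumption for the example, not taken from the question):

```python
import pandas as pd

# Illustrative data: group "z" has 3 rows, "u" has 2, "x" has 1.
df = pd.DataFrame({"A": ["z", "z", "z", "x", "u", "u"],
                   "D": [7, 6, 3, 2, 8, 9]})

# size() gives one row count per group, as a Series indexed by A.
sizes = df.groupby("A").size()

# Sort the groups from largest to smallest.
largest_first = sizes.sort_values(ascending=False)

print(largest_first.index.tolist())  # ['z', 'u', 'x']
print(largest_first.tolist())        # [3, 2, 1]
```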
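I think it should be as simple as this, provided the order of the groups themselves does not matter (the groups are also sorted by A, B and C):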
df = df.sort_values(["A", "B", "C", "D"])
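To see what this one-liner does on the question's data, here is a small self-contained check. The rows within each group come out sorted by D, but the groups themselves are reordered alphabetically (u, x, y, z) rather than kept in the original z, x, u, y order:

```python
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame({
    "A": list("zzxxuuyy"),
    "B": list("kkttcctt"),
    "C": list("ssrrrrss"),
    "D": [7, 6, 2, 1, 8, 9, 5, 2],
    "E": list("dlexfhlo"),
})

# Sort by the group keys first, then by D within each group.
result = df.sort_values(["A", "B", "C", "D"])

print(result["A"].tolist())  # ['u', 'u', 'x', 'x', 'y', 'y', 'z', 'z']
print(result["D"].tolist())  # [8, 9, 1, 2, 2, 5, 6, 7]
```

If the original group order has to be preserved, one of the groupby-based answers is the safer choice.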
You can use groupby and sort_values (credit also to @Henry Ecker for his comment):
df.groupby(['A','B','C'],group_keys=False,sort=False).apply(pd.DataFrame.sort_values,'D')
Output:
A B C D E
1 z k s 6 l
0 z k s 7 d
3 x t r 1 x
2 x t r 2 e
4 u c r 8 f
5 u c r 9 h
7 y t s 2 o
6 y t s 5 l
Let us try ngroup.
Create a helper column:
df['new1'] = df.groupby(['A','B','C'],sort=False).ngroup()
df = df.sort_values(['new1','D']).drop('new1',axis=1)
df
A B C D E
1 z k s 6 l
0 z k s 7 d
3 x t r 1 x
2 x t r 2 e
4 u c r 8 f
5 u c r 9 h
7 y t s 2 o
6 y t s 5 l
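For reference, a self-contained sketch of the same ngroup idea, showing the group numbers that the helper column holds. The names grp and group_ids are made up for this example, and ignore_index=True is an optional extra (not part of the answer above) that renumbers the rows 0..7:

```python
import pandas as pd

# The DataFrame from the question.
df = pd.DataFrame({
    "A": list("zzxxuuyy"),
    "B": list("kkttcctt"),
    "C": list("ssrrrrss"),
    "D": [7, 6, 2, 1, 8, 9, 5, 2],
    "E": list("dlexfhlo"),
})

# With sort=False, groups are numbered in order of first appearance.
group_ids = df.groupby(["A", "B", "C"], sort=False).ngroup()
print(group_ids.tolist())  # [0, 0, 1, 1, 2, 2, 3, 3]

# Sort by group number, then by D, and drop the helper column again.
result = (
    df.assign(grp=group_ids)                       # "grp" is a hypothetical helper name
      .sort_values(["grp", "D"], ignore_index=True)
      .drop(columns="grp")
)

print(result["A"].tolist())  # ['z', 'z', 'x', 'x', 'u', 'u', 'y', 'y']
print(result["D"].tolist())  # [6, 7, 1, 2, 8, 9, 2, 5]
```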