Sort pandas df subset of rows (within a group) by specific column

Say I have the following dataframe:


A B C D E
z k s 7 d
z k s 6 l
x t r 2 e
x t r 1 x
u c r 8 f
u c r 9 h
y t s 5 l
y t s 2 o
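
For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is a direct transcription of the table, assuming column D holds integers and the other columns hold single-character strings.

```python
import pandas as pd

# Sample frame transcribed from the table above
# (assumption: D is an integer column, the rest are strings).
df = pd.DataFrame({
    "A": list("zzxxuuyy"),
    "B": list("kkttcctt"),
    "C": list("ssrrrrss"),
    "D": [7, 6, 2, 1, 8, 9, 5, 2],
    "E": list("dlexfhlo"),
})

print(df)
```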

I would like to sort it by column D within each group of rows (in this case, rows that share the same values in columns A, B and C).

The expected output would be:


A B C D E
z k s 6 l
z k s 7 d
x t r 1 x
x t r 2 e
u c r 8 f
u c r 9 h
y t s 2 o
y t s 5 l

How can I do this kind of operation?

689

asked Jun 05 '21 00:06

Salvatore Nedia

3 Answers

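I think it should be as simple as this:

df = df.sort_values(["A", "B", "C", "D"])

Note that this also sorts the groups themselves by A, B and C, so the groups will not stay in their original order as in the expected output.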
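
A quick way to check this one-liner against the question's sample data is to rebuild the frame and inspect the resulting row order. The frame construction is a transcription of the question's table (assuming D is an integer column), and `result` is only an illustrative name.

```python
import pandas as pd

# Sample frame transcribed from the question (assumption: D is integer).
df = pd.DataFrame({
    "A": list("zzxxuuyy"),
    "B": list("kkttcctt"),
    "C": list("ssrrrrss"),
    "D": [7, 6, 2, 1, 8, 9, 5, 2],
    "E": list("dlexfhlo"),
})

# Sort on all four columns at once: A, B, C first, then D within ties.
result = df.sort_values(["A", "B", "C", "D"])

print(result["A"].tolist())  # ['u', 'u', 'x', 'x', 'y', 'y', 'z', 'z']
print(result["D"].tolist())  # [8, 9, 1, 2, 2, 5, 6, 7]
```

On this data, D is ascending inside every group, while the groups come out in the order u, x, y, z rather than z, x, u, y.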

198

answered Sep 27 '22 23:09

saedx1

You can use groupby and sort values (also credit to @Henry Ecker for his comment):

df.groupby(['A','B','C'],group_keys=False,sort=False).apply(pd.DataFrame.sort_values,'D')

output:

    A   B   C   D   E
1   z   k   s   6   l
0   z   k   s   7   d
3   x   t   r   1   x
2   x   t   r   2   e
4   u   c   r   8   f
5   u   c   r   9   h
7   y   t   s   2   o
6   y   t   s   5   l
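
As a hedged variant of the same idea (not the exact call above), the per-group sort can be applied to column D only, and the full frame then reordered by the resulting row labels. This sidesteps passing the grouping columns into apply, which pandas 2.2 started warning about. The sketch assumes the index labels are unique; the frame is transcribed from the question, and `keys`, `sorted_d` and `result` are illustrative names.

```python
import pandas as pd

# Sample frame transcribed from the question (assumption: D is integer).
df = pd.DataFrame({
    "A": list("zzxxuuyy"),
    "B": list("kkttcctt"),
    "C": list("ssrrrrss"),
    "D": [7, 6, 2, 1, 8, 9, 5, 2],
    "E": list("dlexfhlo"),
})

keys = ["A", "B", "C"]

# Sort D inside each group. sort=False keeps the groups in order of first
# appearance; group_keys=False keeps the original row labels as the index.
sorted_d = (
    df.groupby(keys, sort=False, group_keys=False)["D"]
      .apply(lambda s: s.sort_values())
)

# Reorder the full frame by those row labels (assumes a unique index).
result = df.loc[sorted_d.index]

print(result)
```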

answered Sep 27 '22 23:09

Ehsan

Let us try ngroup to create a helper column:

df['new1'] = df.groupby(['A','B','C'],sort=False).ngroup()
df = df.sort_values(['new1','D']).drop('new1',axis=1)
df
   A  B  C  D  E
1  z  k  s  6  l
0  z  k  s  7  d
3  x  t  r  1  x
2  x  t  r  2  e
4  u  c  r  8  f
5  u  c  r  9  h
7  y  t  s  2  o
6  y  t  s  5  l
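
For a self-contained run, here is a minimal sketch of the same ngroup idea that leaves the original frame untouched by using assign instead of adding a column in place. The frame is transcribed from the question (assuming D is an integer column), and `_gid`, `group_id` and `result` are illustrative names.

```python
import pandas as pd

# Sample frame transcribed from the question (assumption: D is integer).
df = pd.DataFrame({
    "A": list("zzxxuuyy"),
    "B": list("kkttcctt"),
    "C": list("ssrrrrss"),
    "D": [7, 6, 2, 1, 8, 9, 5, 2],
    "E": list("dlexfhlo"),
})

# Number each (A, B, C) group in order of first appearance:
# z/k/s -> 0, x/t/r -> 1, u/c/r -> 2, y/t/s -> 3.
group_id = df.groupby(["A", "B", "C"], sort=False).ngroup()

# Sort by group number, then by D, and drop the helper column again.
result = (
    df.assign(_gid=group_id)
      .sort_values(["_gid", "D"])
      .drop(columns="_gid")
)

print(result)
```

If the original row labels are not needed, appending `.reset_index(drop=True)` gives a fresh 0..n-1 index.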

answered Sep 27 '22 22:09

BENY

Tags: python, pandas, dataframe, numpy