I have a dataframe df like this:
ID NAME AGE
-----------------
M43 ab 32
M32 df 12
M54 gh 34
M43 ab 98
M43 ab 36
M43 cd 32
M32 cd 39
M43 ab 67
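I need to sort the rows based on the ID column, so that rows with the same ID end up together and the IDs keep the order in which they first appear.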
The output df_grouped should look like:
ID NAME AGE
-----------------
M43 ab 32
M43 ab 98
M43 ab 36
M43 cd 32
M43 ab 67
M32 df 12
M32 cd 39
M54 gh 34
I tried something like:
    df_grouped = df.groupby(df.ID)
    grouped_df_list = []
    # collect each group in the order its ID first appears
    for id in list(df.ID.unique()):
        grouped_df_list.append(df_grouped.get_group(id))
Is there a better way to do this?
You can sort by multiple columns using pd.DataFrame.sort_values:
    df = df.sort_values(['ID', 'NAME'])
By default, the argument ascending is set to True.
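As a rough check of what this does on the sample data from the question, here is a minimal sketch. The frame construction and the names sorted_df, sorted_ids and sorted_names are only for illustration. Note that sort_values orders the IDs alphabetically, so M32 comes before M43 here, which is not the same as the first-appearance order shown in the question's expected output.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":   ["M43", "M32", "M54", "M43", "M43", "M43", "M32", "M43"],
    "NAME": ["ab", "df", "gh", "ab", "ab", "cd", "cd", "ab"],
    "AGE":  [32, 12, 34, 98, 36, 32, 39, 67],
})

# Sort by ID first, then by NAME within each ID (ascending by default).
sorted_df = df.sort_values(["ID", "NAME"])

# Pull the sorted key columns out as plain lists for inspection.
sorted_ids = sorted_df["ID"].tolist()
sorted_names = sorted_df["NAME"].tolist()
print(sorted_df)
```

If descending order is wanted instead, passing ascending=False (or a list with one boolean per column) should do it.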
You can use pd.factorize to turn the key into a unique number that represents the order in which it first appeared, then argsort that to get the positions to index into your frame, e.g.:
Given:
0 1 2
0 M43 ab 32
1 M32 df 12
2 M54 gh 34
3 M43 ab 98
4 M43 ab 36
5 M43 cd 32
6 M32 cd 39
7 M43 ab 67
Then:
new_df = df.loc[pd.factorize(df[0])[0].argsort()]
# might want to consider df.reindex() instead depending...
You get:
0 1 2
0 M43 ab 32
3 M43 ab 98
4 M43 ab 36
5 M43 cd 32
7 M43 ab 67
1 M32 df 12
6 M32 cd 39
2 M54 gh 34
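For anyone who wants to run this end to end, here is a self-contained sketch of the same idea using the named columns from the question. Two things in it are my own choices rather than part of the answer above: it passes kind="stable" to argsort, since the default sort kind is not guaranteed to keep tied rows in their original order, and it indexes with iloc because argsort returns positions rather than labels. The names codes, uniques, order and df_grouped are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID":   ["M43", "M32", "M54", "M43", "M43", "M43", "M32", "M43"],
    "NAME": ["ab", "df", "gh", "ab", "ab", "cd", "cd", "ab"],
    "AGE":  [32, 12, 34, 98, 36, 32, 39, 67],
})

# factorize assigns each distinct ID an integer in order of first appearance:
# M43 -> 0, M32 -> 1, M54 -> 2.
codes, uniques = pd.factorize(df["ID"])

# A stable argsort keeps rows with the same code in their original order.
order = codes.argsort(kind="stable")

# argsort returns positions, so select rows positionally.
df_grouped = df.iloc[order]
print(df_grouped)
```

On a frame with the default integer index, df.loc[order] selects the same rows; iloc simply avoids depending on what the index labels happen to be.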