I have a DataFrame df like this:
ID    NAME    AGE
-----------------
M43   ab      32
M32   df      12
M54   gh      34
M43   ab      98
M43   ab      36
M43   cd      32
M32   cd      39
M43   ab      67
I need to group the rows by the ID column, keeping the IDs in the order they first appear and the rows within each ID in their original order.
The output df_grouped should look like:
ID    NAME    AGE
-----------------
M43   ab      32
M43   ab      98
M43   ab      36
M43   cd      32
M43   ab      67
M32   df      12
M32   cd      39
M54   gh      34
I tried something like:
df_grouped = df.groupby(df.ID)
grouped_df_list = []
for key in df.ID.unique():
   grouped_df_list.append(df_grouped.get_group(key))
Is there a better way to do this?
You can sort by multiple columns using pd.DataFrame.sort_values:
df = df.sort_values(['ID', 'NAME'])
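By default, the argument ascending is set to True, so the IDs come out in alphabetical order (M32, M43, M54) rather than in order of first appearance.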
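As a minimal sketch of that call on the question's data (the column names ID, NAME and AGE are taken from the question; sorted_df and mixed_df are illustrative names only), ascending can also be given as one flag per sort column:

```python
import pandas as pd

# The question's data, rebuilt by hand.
df = pd.DataFrame({
    "ID": ["M43", "M32", "M54", "M43", "M43", "M43", "M32", "M43"],
    "NAME": ["ab", "df", "gh", "ab", "ab", "cd", "cd", "ab"],
    "AGE": [32, 12, 34, 98, 36, 32, 39, 67],
})

# Sort by ID first, then by NAME within each ID (both ascending by default).
sorted_df = df.sort_values(["ID", "NAME"])

# One flag per column: ID descending, NAME ascending within each ID.
mixed_df = df.sort_values(["ID", "NAME"], ascending=[False, True])

print(sorted_df)
print(mixed_df)
```

Neither call reproduces the first-appearance order asked for in the question; both order the IDs by value.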
You can use pd.factorize to turn the key into integer codes that number each distinct value in the order it first appears, then argsort those codes to get the row positions to index into your frame, e.g.:
Given:
     0   1   2
0  M43  ab  32
1  M32  df  12
2  M54  gh  34
3  M43  ab  98
4  M43  ab  36
5  M43  cd  32
6  M32  cd  39
7  M43  ab  67
Then:
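# kind='stable' keeps rows with the same key in their original order;
# iloc is used because argsort returns positions, not index labels
new_df = df.iloc[pd.factorize(df[0])[0].argsort(kind='stable')]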
# might want to consider df.reindex() instead depending...
You get:
     0   1   2
0  M43  ab  32
3  M43  ab  98
4  M43  ab  36
5  M43  cd  32
7  M43  ab  67
1  M32  df  12
6  M32  cd  39
2  M54  gh  34
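For reference, here is a self-contained sketch of the same idea using the question's column names rather than the integer column labels above. The intermediate names codes, uniques and order are illustrative only, and a stable sort is requested explicitly so that rows sharing an ID keep their original relative order:

```python
import pandas as pd

# The question's data, rebuilt by hand.
df = pd.DataFrame({
    "ID": ["M43", "M32", "M54", "M43", "M43", "M43", "M32", "M43"],
    "NAME": ["ab", "df", "gh", "ab", "ab", "cd", "cd", "ab"],
    "AGE": [32, 12, 34, 98, 36, 32, 39, 67],
})

# codes[i] numbers each ID by first appearance: M43 -> 0, M32 -> 1, M54 -> 2.
codes, uniques = pd.factorize(df["ID"])

# A stable argsort of the codes gives row positions grouped by ID,
# with the original row order preserved inside each group.
order = codes.argsort(kind="stable")

# Index by position, since argsort returns positions rather than labels.
df_grouped = df.iloc[order]

print(df_grouped)
```

Because the result keeps the original index labels, df_grouped.reset_index(drop=True) can be applied afterwards if a fresh 0..n-1 index is wanted.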