Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any way to sort the dataframe column which contain lists?

I have a column in a dataframe which contains list data

for example:

enter code here
| Column 1 | Column 2                  |
|--------- |---------------------------|
|    1     | ['jan']                   |
|    2     | ['jan','feb']             |
|    3     | ['feb','jan']             |
|    4     | ['mar','april','jan']     |
|    5     | ['jan','mar','april']     |
|    6     | ['jan','feb','mar']       |
|    7     | ['jan','feb','mar']       |

As in the above table we can see column 2, row 2 and row 3 have same list with different order. same is the case with other values too. like (row 4 and row 5)

I have to sort in this order

enter code here ['oct' ,'nov' ,'dec' ,'jan' ,'feb' ,'march' ,'april']

The desired output should be like below, a new column 3 with sorted values

enter code here
| Column 1 | Column 2                  |      Column 3             |
|--------- |---------------------------|---------------------------|
|    1     | ['oct']                   | ['oct']                   |
|    2     | ['oct','nov']             | ['oct','nov']             |
|    3     | ['nov','oct']             | ['oct','nov']             |
|    4     | ['mar','april','jan']     | ['jan','mar','april']     |
|    5     | ['jan','mar','april']     | ['jan','mar','april']     |
|    6     | ['jan','feb','mar']       | ['jan','feb','mar']       |
|    7     | ['jan','feb','mar']       | ['jan','feb','mar']       |
like image 871
Mr_question Avatar asked Nov 19 '25 19:11

Mr_question


1 Answers

Use sorted by create dictionary with enumerate:

#import ast

#converted to lists if necessary
df['Column 2'] = df['Column 2'].apply(ast.literal_eval)

months = ['oct' ,'nov' ,'dec' ,'jan' ,'feb' ,'mar' ,'april']

d = {v: k for k, v in enumerate(months)}
   
df['Column 2'] = df['Column 2'].apply(lambda x: sorted(x, key=lambda y: d[y]))
print (df)
   Column 1           Column 2
0         1              [jan]
1         2         [jan, feb]
2         3         [jan, feb]
3         4  [jan, mar, april]
4         5  [jan, mar, april]
5         6    [jan, feb, mar]
6         7    [jan, feb, mar]

If values are first 3 letters is possible sorted by datetimes:

df['Column 2'] = df['Column 2'].apply(lambda x: sorted(x, key=lambda y: pd.to_datetime(y, format='%b')))
print (df)
   Column 1         Column 2
0         1            [jan]
1         2       [jan, feb]
2         3       [jan, feb]
3         4  [jan, mar, apr]
4         5  [jan, mar, apr]
5         6  [jan, feb, mar]
6         7  [jan, feb, mar]
like image 97
jezrael Avatar answered Nov 22 '25 09:11

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!