I have a column in a dataframe which contains list data
for example:
enter code here
| Column 1 | Column 2 |
|--------- |---------------------------|
| 1 | ['jan'] |
| 2 | ['jan','feb'] |
| 3 | ['feb','jan'] |
| 4 | ['mar','april','jan'] |
| 5 | ['jan','mar','april'] |
| 6 | ['jan','feb','mar'] |
| 7 | ['jan','feb','mar'] |
As in the above table we can see column 2, row 2 and row 3 have same list with different order. same is the case with other values too. like (row 4 and row 5)
I have to sort in this order
enter code here ['oct' ,'nov' ,'dec' ,'jan' ,'feb' ,'march' ,'april']
The desired output should be like below, a new column 3 with sorted values
enter code here
| Column 1 | Column 2 | Column 3 |
|--------- |---------------------------|---------------------------|
| 1 | ['oct'] | ['oct'] |
| 2 | ['oct','nov'] | ['oct','nov'] |
| 3 | ['nov','oct'] | ['oct','nov'] |
| 4 | ['mar','april','jan'] | ['jan','mar','april'] |
| 5 | ['jan','mar','april'] | ['jan','mar','april'] |
| 6 | ['jan','feb','mar'] | ['jan','feb','mar'] |
| 7 | ['jan','feb','mar'] | ['jan','feb','mar'] |
Use sorted by create dictionary with enumerate:
#import ast
#converted to lists if necessary
df['Column 2'] = df['Column 2'].apply(ast.literal_eval)
months = ['oct' ,'nov' ,'dec' ,'jan' ,'feb' ,'mar' ,'april']
d = {v: k for k, v in enumerate(months)}
df['Column 2'] = df['Column 2'].apply(lambda x: sorted(x, key=lambda y: d[y]))
print (df)
Column 1 Column 2
0 1 [jan]
1 2 [jan, feb]
2 3 [jan, feb]
3 4 [jan, mar, april]
4 5 [jan, mar, april]
5 6 [jan, feb, mar]
6 7 [jan, feb, mar]
If values are first 3 letters is possible sorted by datetimes:
df['Column 2'] = df['Column 2'].apply(lambda x: sorted(x, key=lambda y: pd.to_datetime(y, format='%b')))
print (df)
Column 1 Column 2
0 1 [jan]
1 2 [jan, feb]
2 3 [jan, feb]
3 4 [jan, mar, apr]
4 5 [jan, mar, apr]
5 6 [jan, feb, mar]
6 7 [jan, feb, mar]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With