
Python pandas sort dataframe by enum class values

Suppose I have an enum class:

from enum import Enum
class Colors(Enum):
    RED = 1
    ORANGE = 2
    GREEN = 3

And I have a dataframe in which one column is a color (the values can be in lowercase too):

>>> import pandas as pd
>>> df = pd.DataFrame({'X':['A', 'B', 'C', 'A'], 'color' : ['GREEN', 'RED', 'ORANGE', 'ORANGE']})
>>> df
   X   color
0  A   GREEN
1  B     RED
2  C  ORANGE
3  A  ORANGE

How can I convert the color column to a categorical type that respects the order of the Colors class values, and then sort the dataframe by "color" and "X" (ascending)?

For example, the dataframe above should be sorted as:

X, color
--------
B, RED
A, ORANGE
C, ORANGE
A, GREEN
user3225309 asked Dec 14 '22 08:12



1 Answer

Combining this answer and this one: convert the column to an ordered pd.Categorical whose categories follow the Colors class (with a small addition of __str__, so that str(member) returns the member's name):

from enum import Enum
import pandas as pd

df = pd.DataFrame({'X':['A', 'B', 'C', 'A'], 'color' : ['GREEN', 'RED', 'ORANGE', 'ORANGE']})

class Colors(Enum):
    RED = 1
    ORANGE = 2
    GREEN = 3
    def __str__(self):
        return self.name

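# Iterating Colors yields members in definition order: RED, ORANGE, GREEN.
df['color'] = pd.Categorical(df['color'], categories=[str(i) for i in Colors], ordered=True)
# Sort by the ordered categorical first, then alphabetically by X.
df = df.sort_values(['color','X'])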

Result:

   X   color
1  B     RED
3  A  ORANGE
2  C  ORANGE
0  A   GREEN
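
The question notes that the color strings may also be lowercase, which the code above does not cover: values that do not match a category exactly become NaN in the categorical. A minimal sketch of one way to handle that, assuming it is acceptable to upper-case the column before conversion (the mixed-case sample data and the `order` variable are made up for illustration). It also orders the categories by the enum's numeric values rather than by definition order, in case the two ever differ:

```python
from enum import Enum
import pandas as pd

class Colors(Enum):
    RED = 1
    ORANGE = 2
    GREEN = 3

# Hypothetical sample data with mixed-case color strings.
df = pd.DataFrame({'X': ['A', 'B', 'C', 'A'],
                   'color': ['green', 'RED', 'Orange', 'orange']})

# Build the category order from the enum's numeric values,
# so it does not depend on the order the members were defined in.
order = [c.name for c in sorted(Colors, key=lambda c: c.value)]

# Upper-case the strings so 'green' and 'GREEN' map to the same category.
df['color'] = pd.Categorical(df['color'].str.upper(), categories=order, ordered=True)

# Sort by color (in enum order), then by X.
df = df.sort_values(['color', 'X'])
print(df)
```

With this approach the column ends up holding the upper-case names; if the original casing must be kept, the normalized values could go into a separate helper column that is used only for sorting.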
Tom answered Dec 28 '22 02:12
