Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rollup of multiple string columns by id - python

Tags:

python

pandas

I got following dataframe in Python:

d = pd.DataFrame({'id': [1, 1, 1, 2, 2, 3],
              'col1': ['normal', 'well', 'normal', 'normal', 'well', 'normal'],
              'col2': ['bad', 'normal','normal', 'normal', 'normal', 'bad']})

I would like to rollup by id but keep in columns strings that are other than 'normal' or 'normal' if there is nothing else ('well' or 'bad'). Something like following:

result = pd.DataFrame({'id': [1, 2, 3],
                'col1': ['well', 'well', 'normal'],
                'col2': ['bad', 'normal', 'bad']})

I was thinking about sorting and then using groupby and .first but not sure how to get desired levels on the top in each column.

like image 396
John Snow Avatar asked Feb 10 '26 08:02

John Snow


2 Answers

Use Categorical to define order

cats = ['well', 'bad', 'normal']
d = d.assign(
    col1=pd.Categorical(d.col1, cats, ordered=True),
    col2=pd.Categorical(d.col2, cats, ordered=True)
)

d.groupby('id', as_index=False).min()

   id    col1    col2
0   1    well     bad
1   2    well  normal
2   3  normal     bad
like image 120
piRSquared Avatar answered Feb 13 '26 00:02

piRSquared


Use replace by NaNs first if no NaNs values before GroupBy.first:

d = d.replace('normal', np.nan).groupby('id').first().fillna('normal')
#alternative solution
d = d.mask(d == 'normal').groupby('id').first().fillna('normal')

print (d)
      col1    col2
id                
1     well     bad
2     well  normal
3   normal     bad
like image 38
jezrael Avatar answered Feb 13 '26 00:02

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!