I'm having a hard time using pandas groupby. Say I have the following:
df2 = pd.DataFrame({'X' : ['B', 'B', 'A', 'A', 'C'], 'Y' : [1, 2, 3, 4, 5]})
I want to do a groupby operation that groups all A's together and all non-A's together, so something like this:
df2.groupby(<something>).groups
Out[1]: {'A': [2, 3], 'not A': [0, 1, 4]}
I've tried things like sending in a function but couldn't get anything to work. Is this possible?
Thanks a lot.
In [3]: df2.groupby(df2['X'] == 'A').groups
Out[3]: {False: [0, 1, 4], True: [2, 3]}
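The boolean Series works as a key here because groupby accepts an array-like of the same length as the frame. As a rough sketch of using that key for an aggregation (summing Y is only an illustrative choice, and the names is_a and totals are made up for this example):

```python
import pandas as pd

df2 = pd.DataFrame({'X': ['B', 'B', 'A', 'A', 'C'], 'Y': [1, 2, 3, 4, 5]})

# Boolean key: True where X is 'A', False otherwise
is_a = df2['X'] == 'A'

# Sum Y within each of the two groups (illustrative aggregation)
totals = df2.groupby(is_a)['Y'].sum()
print(totals)
```

The result is indexed by False and True, so the group keys are the boolean values themselves rather than readable names.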
To expand on @Dan Allan's answer a bit: if you want to name your groups, you can use numpy.where() to create a mapping array:
>>> import numpy as np
>>> import pandas as pd
>>> df2 = pd.DataFrame({'X' : ['B', 'B', 'A', 'A', 'C'], 'Y' : [1, 2, 3, 4, 5]})
>>> m = np.where(df2['X'] == 'A', 'A', 'not A')
>>> df2.groupby(m).groups
{'A': [2, 3], 'not A': [0, 1, 4]}
To check whether df2['X'] is either A or B, you can use df2['X'].isin(['A', 'B']) instead of df2['X'] == 'A', or the clumsier np.logical_or(df2['X'] == 'A', df2['X'] == 'B').
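For what it's worth, here is a minimal sketch of the isin() variant combined with the numpy.where() naming trick. The group labels 'A or B' and 'other' are just names picked for this example, as are the variables labels and groups:

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame({'X': ['B', 'B', 'A', 'A', 'C'], 'Y': [1, 2, 3, 4, 5]})

# Label each row by whether X is one of the chosen values
# ('A or B' and 'other' are arbitrary labels for this sketch)
labels = np.where(df2['X'].isin(['A', 'B']), 'A or B', 'other')

# Convert each group's index to a plain list for readability
groups = {key: list(idx) for key, idx in df2.groupby(labels).groups.items()}
print(groups)
```

Rows 0 through 3 land in the 'A or B' group and row 4 in 'other'.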