Group by index + column in pandas

Tags: python, pandas

I have a dataframe with two columns, user_id and item_bought, where user_id is the index of the dataframe. I want to group by both user_id and item_bought and get the item-wise count for each user.

How do I do that?

882

asked Jun 18 '15 20:06

vumaasha

2 Answers

From version 0.20.1 it is simpler:

Strings passed to DataFrame.groupby() as the by parameter may now reference either column names or index level names

    import numpy as np
    import pandas as pd

    arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
              ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]

    index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])

    df = pd.DataFrame({'A': [1, 1, 1, 1, 2, 2, 3, 3],
                       'B': np.arange(8)}, index=index)

    print(df)
                  A  B
    first second
    bar   one     1  0
          two     1  1
    baz   one     1  2
          two     1  3
    foo   one     2  4
          two     2  5
    qux   one     3  6
          two     3  7

    # 'second' is an index level name, 'A' is a column name
    print(df.groupby(['second', 'A']).sum())
              B
    second A
    one    1  2
           2  4
           3  6
    two    1  4
           2  5
           3  7
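Applied to the frame described in the question, the same idea might look like the sketch below. Only the names user_id and item_bought come from the question; the sample rows are made up for illustration, and using size() to get the counts is one reasonable choice, not necessarily what the asker ended up with.

```python
import pandas as pd

# Hypothetical sample data; only the names user_id and item_bought
# are taken from the question.
df = pd.DataFrame(
    {"item_bought": ["apple", "apple", "pear", "pear", "apple"]},
    index=pd.Index([1, 1, 1, 2, 2], name="user_id"),
)

# "user_id" is resolved as an index level name and "item_bought" as a
# column name (needs a pandas version that supports mixing the two).
counts = df.groupby(["user_id", "item_bought"]).size()
print(counts)
```

size() counts rows per group and returns a Series indexed by (user_id, item_bought); calling .reset_index(name="count") on it should give a flat table if that is easier to work with.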

149

answered Sep 25 '22 09:09

jezrael

This should work:

    >>> df = pd.DataFrame(np.random.randint(0,5,(6, 2)), columns=['col1','col2'])
    >>> df['ind1'] = list('AAABCC')
    >>> df['ind2'] = range(6)
    >>> df.set_index(['ind1','ind2'], inplace=True)
    >>> df
               col1  col2
    ind1 ind2
    A    0        3     2
         1        2     0
         2        2     3
    B    3        2     4
    C    4        3     1
         5        0     0

    >>> df.groupby([df.index.get_level_values(0),'col1']).count()
               col2
    ind1 col1
    A    2        2
         3        1
    B    2        1
    C    0        1
         3        1

I had the same problem when using one level of a MultiIndex as a group key. With a MultiIndex, you cannot use df.index.levels[0], since it holds only the distinct values of that index level and will most likely have a different length than the whole dataframe.

See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.get_level_values.html: get_level_values will "Return vector of label values for requested level, equal to the length of the index".
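To make the length difference concrete, here is a small sketch with fixed values instead of random ones. The numbers are chosen for illustration only and the variable names distinct and per_row are mine.

```python
import pandas as pd

# Hypothetical frame with the same shape as the example above,
# but with fixed values so the result is reproducible.
df = pd.DataFrame(
    {"col1": [3, 2, 2, 2, 3, 0], "col2": [2, 0, 3, 4, 1, 0]},
    index=pd.MultiIndex.from_arrays(
        [list("AAABCC"), list(range(6))], names=["ind1", "ind2"]
    ),
)

distinct = df.index.levels[0]           # unique labels of the level only
per_row = df.index.get_level_values(0)  # one label per row of the frame

print(len(distinct), len(per_row), len(df))

# per_row lines up with the rows, so it can be used as a group key
# alongside a column name.
result = df.groupby([per_row, "col1"]).count()
print(result)
```

Because per_row has one entry per row, groupby can align it with the col1 column, which is not the case for the shorter distinct.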

answered Sep 25 '22 09:09

kekert
