
How to reset a DataFrame's indexes for all groups in one step?

I split my DataFrame into groups:

    import pandas as pd

    df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                             'foo', 'bar', 'foo', 'foo'],
                       'B': ['1', '2', '3', '4',
                             '5', '6', '7', '8'],
                       })

    grouped = df.groupby('A')

I get 2 groups

         A  B
    0  foo  1
    2  foo  3
    4  foo  5
    6  foo  7
    7  foo  8

         A  B
    1  bar  2
    3  bar  4
    5  bar  6

Now I want to reset the index of each group separately:

    # drop=True discards the old index instead of adding it as a column
    print(grouped.get_group('foo').reset_index(drop=True))
    print(grouped.get_group('bar').reset_index(drop=True))

Finally I get the result

         A  B
    0  foo  1
    1  foo  3
    2  foo  5
    3  foo  7
    4  foo  8

         A  B
    0  bar  2
    1  bar  4
    2  bar  6

Is there a better way to do this? (For example, automatically calling some method on each group.)

Meloun asked Mar 14 '14 14:03



2 Answers

Pass as_index=False to groupby; then you don't need reset_index to turn the grouped-by columns back into columns:

    In [11]: grouped = df.groupby('A', as_index=False)

    In [12]: grouped.get_group('foo')
    Out[12]:
         A  B
    0  foo  1
    2  foo  3
    4  foo  5
    6  foo  7
    7  foo  8

Note: As pointed out (and as the example above shows), the index is not [0, 1, 2, ...]. I claim that this will never matter in practice. If it does, you're going to have to jump through some strange hoops, and the result will be more verbose, less readable and less efficient.
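If the per-group index does matter, one possible hoop is to iterate over the groups and reset each one in a dict comprehension. This is only a sketch, not part of the answer above; the name `groups` is hypothetical, and it assumes the same `df` as in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['1', '2', '3', '4', '5', '6', '7', '8'],
})

# Iterating over a groupby yields (key, sub-DataFrame) pairs.
# `groups` is a hypothetical name chosen for this sketch.
groups = {
    key: group.reset_index(drop=True)  # drop=True discards the old index
    for key, group in df.groupby('A')
}

print(groups['foo'])
print(groups['bar'])
```

Each value in `groups` is an independent DataFrame with its own 0-based index, so this trades the lazy groupby object for materialized copies.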

Andy Hayden answered Oct 02 '22 13:10



    df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
                             'foo', 'bar', 'foo', 'foo'],
                       'B': ['1', '2', '3', '4',
                             '5', '6', '7', '8'],
                       })
    grouped = df.groupby('A', as_index=False)

We get two groups. Reset the index inside each group, then reset the index of the combined result:

    grouped_index = grouped.apply(lambda x: x.reset_index(drop=True)).reset_index()

This adds two new columns, level_0 and level_1, and resets the index:

       level_0  level_1    A  B
    0        0        0  bar  2
    1        0        1  bar  4
    2        0        2  bar  6
    3        1        0  foo  1
    4        1        1  foo  3
    5        1        2  foo  5
    6        1        3  foo  7
    7        1        4  foo  8

    result = grouped_index.drop('level_0', axis=1).set_index('level_1')

This creates an index that restarts within each group of "A":

               A  B
    level_1
    0        bar  2
    1        bar  4
    2        bar  6
    0        foo  1
    1        foo  3
    2        foo  5
    3        foo  7
    4        foo  8
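A similar per-group numbering can probably be reached more directly with `GroupBy.cumcount()`, which numbers the rows of each group from 0. The sketch below is an alternative to the apply-based steps above, not the same method; the column name `pos` is hypothetical, and a stable sort is assumed so that rows keep their original order inside each group.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'B': ['1', '2', '3', '4', '5', '6', '7', '8'],
})

result = (
    # cumcount() gives each row its 0-based position within its group;
    # assign() aligns that Series with df on the original index.
    df.assign(pos=df.groupby('A').cumcount())
      # mergesort is stable, so within-group row order is preserved
      .sort_values('A', kind='mergesort')
      # `pos` (hypothetical name) becomes the index, restarting per group
      .set_index('pos')
)

print(result)
```

Unlike the apply-based version, this never builds the intermediate level_0 column, so there is nothing to drop afterwards.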
yogitha jaya reddy gari answered Oct 02 '22 11:10
