
How to move pandas data from index to column after multiple groupby

I have the following pandas dataframe:

         token  year  uses  books
    386  xanthos  1830     3      3
    387  xanthos  1840     1      1
    388  xanthos  1840     2      2
    389  xanthos  1868     2      2
    390  xanthos  1875     1      1

I aggregate the rows that share the same token and year like so:

    dfalph = dfalph[['token','year','uses','books']].groupby(['token', 'year']).agg([np.sum])
    dfalph.columns = dfalph.columns.droplevel(1)

                   uses  books
    token    year
    xanthos  1830     3      3
             1840     3      3
             1867     2      2
             1868     2      2
             1875     1      1

Instead of having the 'token' and 'year' fields in the index, I would like to return them to columns and have an integer index.

prooffreader asked Feb 13 '14 23:02


People also ask

How do I change the index after Groupby pandas?

To reset the index after a group-by, first group by one or more columns with groupby() and apply an aggregation, then call reset_index() on the result.

How do you get index after Groupby pandas?

Overview: Create a dataframe using a dictionary. Group by item_id and find the max value. Enumerate over the grouped dataframe and use the key, which is a numeric value, to return the alpha index value. Create a result_df dataframe if you desire.

How do I convert multiple index to columns in pandas?

Use the pandas DataFrame.reset_index() function to convert a MultiIndex (multi-level index) to columns. The default setting, drop=False, keeps the index values as columns and sets the new index of the DataFrame to integers starting from zero.
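As a rough illustration of the drop parameter described above, the sketch below builds a small frame (the data and the variable names kept, dropped and partial are made up for this example) and resets its two-level index in three ways.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's columns.
df = pd.DataFrame({
    "token": ["xanthos", "xanthos", "xanthos"],
    "year": [1830, 1840, 1840],
    "uses": [3, 1, 2],
})

# Grouping by two keys produces a two-level MultiIndex (token, year).
grouped = df.groupby(["token", "year"]).sum()

# drop=False (the default): both index levels become ordinary columns
# and the frame gets a fresh integer index starting at 0.
kept = grouped.reset_index()

# drop=True: the index levels are discarded instead of becoming columns.
dropped = grouped.reset_index(drop=True)

# level=...: move only the named level out of the index.
partial = grouped.reset_index(level="year")

print(kept)
print(dropped)
print(partial)
```

The level argument is handy when only one of the grouping keys should go back to being a column while the other stays as the index.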

Does pandas Groupby preserve index?

The Groupby Rolling function does not preserve the original index and so when dates are the same within the Group, it is impossible to know which index value it pertains to from the original dataframe.


1 Answer

Method #1: reset_index()

    >>> g
                  uses  books
                   sum    sum
    token   year
    xanthos 1830     3      3
            1840     3      3
            1868     2      2
            1875     1      1

    [4 rows x 2 columns]
    >>> g = g.reset_index()
    >>> g
         token  year  uses  books
                       sum    sum
    0  xanthos  1830     3      3
    1  xanthos  1840     3      3
    2  xanthos  1868     2      2
    3  xanthos  1875     1      1

    [4 rows x 4 columns]
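For anyone who wants to run Method #1 end to end, here is a minimal sketch. It assumes the five rows shown in the question are the whole input, and it passes the string "sum" to agg instead of np.sum, which is a substitution on my part rather than what the question used. The name flat is only for illustration.

```python
import pandas as pd

# The five sample rows shown in the question, with their original index labels.
dfalph = pd.DataFrame(
    {
        "token": ["xanthos"] * 5,
        "year": [1830, 1840, 1840, 1868, 1875],
        "uses": [3, 1, 2, 2, 1],
        "books": [3, 1, 2, 2, 1],
    },
    index=[386, 387, 388, 389, 390],
)

# Passing a list to agg() adds a second column level ("sum") under each column.
g = dfalph.groupby(["token", "year"]).agg(["sum"])

# Drop that extra level so the columns are plain "uses" and "books" again.
g.columns = g.columns.droplevel(1)

# Move "token" and "year" from the index back into columns.
flat = g.reset_index()
print(flat)
```

Dropping the extra column level before calling reset_index() avoids the two-row header seen in the output above.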

Method #2: don't make the index in the first place, using as_index=False

    >>> g = dfalph[['token', 'year', 'uses', 'books']].groupby(['token', 'year'], as_index=False).sum()
    >>> g
         token  year  uses  books
    0  xanthos  1830     3      3
    1  xanthos  1840     3      3
    2  xanthos  1868     2      2
    3  xanthos  1875     1      1

    [4 rows x 4 columns]
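Method #2 can be tried the same way. This is a small self-contained sketch, again assuming the question's five sample rows are the full input.

```python
import pandas as pd

# The five sample rows shown in the question.
dfalph = pd.DataFrame({
    "token": ["xanthos"] * 5,
    "year": [1830, 1840, 1840, 1868, 1875],
    "uses": [3, 1, 2, 2, 1],
    "books": [3, 1, 2, 2, 1],
})

# as_index=False keeps the grouping keys as regular columns, so the
# result already has a plain integer index and no reset_index() is needed.
g = dfalph.groupby(["token", "year"], as_index=False).sum()
print(g)
```

Calling .sum() directly, rather than .agg() with a list, also keeps the columns single-level, so there is nothing to drop afterwards.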
DSM answered Oct 02 '22 11:10