
Reindex specific level of pandas MultiIndex

I have a DataFrame with a three-level MultiIndex, and I need to reindex the third level without changing the first and second levels.

I have a DataFrame like this:

import numpy as np
import pandas as pd

tuples = [('A', 'a', 1), ('A', 'a', 3), ('A', 'b', 3), ('B', 'c', 1), ('B', 'c', 2), ('B', 'c', 3), ('C', 'd', 2)]

idx = pd.MultiIndex.from_tuples(tuples, names=['first', 'second', 'third'])

df = pd.DataFrame(np.random.randn(7, 2), index=idx, columns=['col1', 'col2'])

                        col1      col2
first second third                    
A     a      1     -0.999816 -0.599815
             3     -0.277794 -0.453870
      b      3      1.116561  0.760010
B     c      1      1.018475 -0.667625
             2      0.695997  0.641531
             3      0.593724  0.265256
C     d      2      1.133767  0.716083

And I would like a DataFrame like this:

                        col1      col2
first second third                    
A     a      1     -0.999816 -0.599815
             2      0         0
             3     -0.277794 -0.453870
      b      1      0         0
             2      0         0
             3      1.116561  0.760010
B     c      1      1.018475 -0.667625
             2      0.695997  0.641531
             3      0.593724  0.265256
C     d      1      0         0
             2      1.133767  0.716083
             3      0         0

I want the third level to contain the same values (1, 2, 3) under every (first, second) pair, with the missing rows filled with 0.

Jazz asked Jul 09 '19 13:07




1 Answer

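Use DataFrame.unstack, which by default pivots the last level of the MultiIndex into columns, with fill_value=0 to fill the missing combinations, then DataFrame.stack to move that level back into the index:

df1 = df.unstack(fill_value=0).stack()
print(df1)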
                        col1      col2
first second third                    
A     a      1     -1.549363 -1.206828
             2      0.000000  0.000000
             3      0.445008 -0.173086
      b      1      0.000000  0.000000
             2      0.000000  0.000000
             3      1.488947 -0.792520
B     c      1      1.838997 -0.439362
             2      1.160003 -0.577093
             3     -1.031044 -0.838885
C     d      1      0.000000  0.000000
             2      0.316934  0.353254
             3      0.000000  0.000000
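
If the unstack/stack round trip feels indirect, a possible alternative is to build the target index explicitly and pass it to DataFrame.reindex. The following is a minimal sketch under the question's setup; the names pairs, thirds, new_idx and out are illustrative and do not come from the original post. It assumes the index has no duplicate entries and that every value the third level should take already appears somewhere in the data.

```python
import numpy as np
import pandas as pd

# Same setup as in the question.
tuples = [('A', 'a', 1), ('A', 'a', 3), ('A', 'b', 3), ('B', 'c', 1),
          ('B', 'c', 2), ('B', 'c', 3), ('C', 'd', 2)]
idx = pd.MultiIndex.from_tuples(tuples, names=['first', 'second', 'third'])
df = pd.DataFrame(np.random.randn(7, 2), index=idx, columns=['col1', 'col2'])

# Existing (first, second) pairs, in order of first appearance.
pairs = df.index.droplevel('third').unique()

# Every value seen in the third level, sorted (assumed to be the full set wanted).
thirds = df.index.get_level_values('third').unique().sort_values()

# Cross each existing pair with every third-level value. A full
# MultiIndex.from_product over all three levels would also create pairs
# such as ('A', 'c') that are not in the data, so it is not used here.
new_idx = pd.MultiIndex.from_tuples(
    [(f, s, t) for (f, s) in pairs for t in thirds],
    names=df.index.names,
)

# Rows missing from df are created and filled with 0.
out = df.reindex(new_idx, fill_value=0)
print(out.shape)  # (12, 2)
```

Compared with unstack/stack, this keeps the set of target rows explicit, which may help when the third level should follow a fixed list (replace thirds with that list) rather than whatever values happen to be present.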
jezrael answered Oct 03 '22 07:10