
How to add rows for all missing values of one multi-index's level?

Tags: python, pandas

Suppose that I have the following dataframe df, indexed by a 3-level multi-index:

In [52]: df
Out[52]: 
          C
L0 L1 L2   
0  w  P   1
   y  P   2
      R   3
1  x  Q   4
      R   5
   z  S   6

Code to create the DataFrame:

import pandas as pd

# Note: the `codes` argument was named `labels` before pandas 0.24.
idx = pd.MultiIndex(levels=[[0, 1], ['w', 'x', 'y', 'z'], ['P', 'Q', 'R', 'S']],
                    codes=[[0, 0, 0, 1, 1, 1], [0, 2, 2, 1, 1, 3], [0, 0, 2, 1, 2, 3]],
                    names=['L0', 'L1', 'L2'])

df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

The possible values for the L2 level are 'P', 'Q', 'R', and 'S', but some of these values are missing for particular combinations of values for the remaining levels. For example, the combination (L0=0, L1='w', L2='Q') is not present in df.

I would like to add enough rows to df so that, for each combination of values for the levels other than L2, there is exactly one row for each of the L2 level's possible values. For the added rows, the value of the C column should be 0.

In other words, I want to expand df so that it looks like this:

          C
L0 L1 L2     
0  w  P   1
      Q   0
      R   0
      S   0
   y  P   2
      Q   0
      R   3
      S   0
1  x  P   0
      Q   4
      R   5
      S   0
   z  P   0
      Q   0
      R   0
      S   6

REQUIREMENTS:

  • the operation should leave the types of the columns unchanged;
  • the operation should add the smallest number of rows needed to complete only the specified level (L2).

Is there a simple way to perform this expansion?

kjo asked Jan 30 '23 02:01


1 Answer

Provided L2 already contains all the possible values you need, you can use the unstack/stack trick: unstack L2 into columns, filling the gaps with 0, then stack it back into the index:

df.unstack('L2', fill_value=0).stack(level=1)
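
If some L2 value might not occur anywhere in the data, one alternative is to spell out the allowed values and reindex. The following is only a sketch, not part of the original answer: the names l2_values, pairs, full_idx and expanded are made up for illustration, and the frame is rebuilt with MultiIndex.from_tuples so the block runs on its own.

```python
import pandas as pd

# Rebuild the example frame from the question.
idx = pd.MultiIndex.from_tuples(
    [(0, 'w', 'P'), (0, 'y', 'P'), (0, 'y', 'R'),
     (1, 'x', 'Q'), (1, 'x', 'R'), (1, 'z', 'S')],
    names=['L0', 'L1', 'L2'],
)
df = pd.DataFrame({'C': [1, 2, 3, 4, 5, 6]}, index=idx)

# Assumption of this sketch: the allowed L2 values are listed explicitly.
l2_values = ['P', 'Q', 'R', 'S']

# Keep only the (L0, L1) pairs that already occur, so no new pairs are invented.
pairs = df.index.droplevel('L2').unique()

# Cross every existing pair with every allowed L2 value.
full_idx = pd.MultiIndex.from_tuples(
    [(l0, l1, l2) for (l0, l1) in pairs for l2 in l2_values],
    names=df.index.names,
)

# fill_value=0 fills the added rows directly, so the integer column stays integer.
expanded = df.reindex(full_idx, fill_value=0)
print(expanded)
```

For the example frame this gives 16 rows (4 existing pairs times 4 L2 values), and expanded['C'].dtype should match df['C'].dtype because the new rows are filled with 0 directly rather than passing through NaN.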


Psidom answered Feb 02 '23 08:02