How can I return multiple levels/groups of values from a multi-index dataframe?

Here is my multi-index dataframe:

import numpy as np
import pandas as pd

# Index levels
outside = ['G1', 'G1', 'G1', 'G2', 'G2', 'G2']
inside = [1, 2, 3, 1, 2, 3]
hier_index = pd.MultiIndex.from_tuples(list(zip(outside, inside)))
df = pd.DataFrame(np.random.randn(6, 2), index=hier_index, columns=['A', 'B'])
df.index.names = ['Group', 'Num']
df

The dataframe looks like this:

                  A           B
Group   Num     
G1      1     0.147027  -0.479448
        2     0.558769   1.024810
        3    -0.925874   1.862864
G2      1    -1.133817   0.610478
        2     0.386030   2.084019
        3    -0.376519   0.230336

I want to return the rows for Groups G1 and G2 where Num is 1 or 3, like this:

G1     1     0.147027   -0.479448
       3    -0.925874    1.862864
G2     1    -1.133817    0.610478
       3    -0.376519    0.230336

I've tried

df.loc[['G1','G2']].loc[[1,3]]

but it shows nothing.

Then I tried

df.xs([['G1','G2'],[1,3]]) 

but it returns

TypeError: '(['G1', 'G2'], [1, 3])' is an invalid key.

Is there any way I can just make it return the values in Group G1 and G2, Num 1 and 3?

Novus asked Dec 17 '22 15:12


1 Answer

Use DataFrame.loc with a tuple of lists, one list per index level, and : to keep all columns:

df1 = df.loc[(['G1','G2'], [1,3]), :]
print(df1)
                  A         B
Group Num                    
G1    1    2.165594  0.466762
      3    0.451996  0.125071
G2    1    2.783947  0.176145
      3    0.169508  0.071441
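
As a reproducible check of the selection above, here is a minimal sketch that swaps the random values for np.arange so the result is predictable (the frame built with MultiIndex.from_product is an assumption for illustration, not the asker's data). The first list in the tuple is matched against the Group level and the second against the Num level. The chained df.loc[['G1','G2']].loc[[1,3]] from the question most likely fails because the first .loc keeps both index levels, so the second .loc looks for 1 and 3 in the Group level.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the same index layout as the question,
# but with fixed values so the selected rows are easy to verify.
index = pd.MultiIndex.from_product(
    [['G1', 'G2'], [1, 2, 3]], names=['Group', 'Num']
)
df = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=['A', 'B'])

# Tuple of lists: first list -> Group level, second list -> Num level.
# The trailing ":" selects every column.
selected = df.loc[(['G1', 'G2'], [1, 3]), :]
print(selected)
```
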

Or use slicers via pd.IndexSlice:

idx = pd.IndexSlice
df1 = df.loc[idx[['G1','G2'], [1,3]], :]
print(df1)
                  A         B
Group Num                    
G1    1    0.617367 -1.010116
      3   -0.990257 -1.262942
G2    1    1.336134 -0.198787
      3   -0.310426  1.063520
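
One reason to prefer the slicer form is that it accepts slice syntax per level. The sketch below, again on an illustrative frame with fixed values (an assumption, not the asker's data), uses idx[:, [1, 3]] to take every Group while restricting Num to 1 and 3. Slicing a MultiIndex level this way generally expects a sorted index, which holds here.

```python
import numpy as np
import pandas as pd

# Illustrative, sorted MultiIndex frame with fixed values.
index = pd.MultiIndex.from_product(
    [['G1', 'G2'], [1, 2, 3]], names=['Group', 'Num']
)
df = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=['A', 'B'])

idx = pd.IndexSlice

# Explicit lists on both levels.
explicit = df.loc[idx[['G1', 'G2'], [1, 3]], :]

# ":" on the Group level keeps every group; only Num is restricted.
all_groups = df.loc[idx[:, [1, 3]], :]

print(all_groups)
```

Since G1 and G2 are the only groups in this frame, both selections return the same four rows.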
jezrael answered Dec 28 '22 10:12