Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Filtering a mutli-index

C1 C2 C3 C4
A 12 True 89
9 False 77
5 True 23
B 9 True 45
5 True 45
2 False 78
C 11 True 10
8 False 08
12 False 09

C1 & C2 are the multi index. I'm hoping to get a result which gives me only values in C1 which have values both lower than 10 and greater than or equal to 10 in C2.

So in the table above C1 - B should go, with the final result should look like this:

C1 C2 C3 C4
A 12 True 89
9 False 77
5 True 23
C 11 True 10
8 False 08
12 False 09

I tried df.loc[(df.C2 < 10 ) & (df.C2 >= 10)] but this didn't work.

I also tried:

filter1 = df.index.get_level_values('C2') < 10 filter2 = df.index.get_level_values('C2') >= 10

df.iloc[filter1 & filter2]

Which I saw suggested on another post that also didn't work. Any one know how to solve this? Thanks

like image 643
Wagsforever Avatar asked Dec 14 '21 10:12

Wagsforever


People also ask

How do you slice multiple index in pandas?

You can slice a MultiIndex by providing multiple indexers. You can provide any of the selectors as if you are indexing by label, see Selection by Label, including slices, lists of labels, labels, and boolean indexers. You can use slice(None) to select all the contents of that level.


1 Answers

Use GroupBy.transform with GroupBy.any for test at least one condition match per groups, so possible last filter by m DataFrame:

filter1 = df.index.get_level_values('C2') < 10 
filter2 = df.index.get_level_values('C2') >= 10

m = (df.assign(filter1= filter1, filter2=filter2)
       .groupby(level=0)[['filter1','filter2']]
       .transform('any'))

print (m)
       filter1  filter2
C1 C2                  
A  12     True     True
   9      True     True
   5      True     True
B  9      True    False
   5      True    False
   2      True    False
C  11     True     True
   8      True     True
   12     True     True

df = df[m.filter1 & m.filter2]
print (df)
          C3  C4
C1 C2           
A  12   True  89
   9   False  77
   5    True  23
C  11   True  10
   8   False   8
   12  False   9

Alternative solution:

filter1 = df.index.get_level_values('C2') < 10 
filter2 = df.index.get_level_values('C2') >= 10

lvl1 = df.index[filter1].remove_unused_levels().levels[0]
lvl2 = df.index[filter2].remove_unused_levels().levels[0]

df1 = df.loc[set(lvl1).intersection(lvl2)]
print (df1)
          C3  C4
C1 C2           
A  12   True  89
   9   False  77
   5    True  23
C  11   True  10
   8   False   8
   12  False   9
like image 153
jezrael Avatar answered Nov 03 '22 14:11

jezrael