Let's say I have the following multi-index DataFrame:
import pandas as pd
df = pd.DataFrame({'Index0':[0,1,2,3,4,5],'Index1':[100,200,300,400,500,600],'A':[5,2,5,8,1,2]}).set_index(['Index0','Index1'])
Now I want to select all the rows where Index1 is less than 400. Everybody knows how that works if Index1 were a regular column:
df[df['Index1'] < 400]
So one method would be to reset_index, perform the selection, then set the index again. This seems quite redundant.
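For reference, the round trip described above might look like the following minimal sketch. The variable names (`level_names`, `flat`, `selected`, `roundtrip`) are made up for illustration, and the frame is assumed to be indexed by Index0 and Index1 as stated.

```python
import pandas as pd

df = pd.DataFrame(
    {'Index0': [0, 1, 2, 3, 4, 5],
     'Index1': [100, 200, 300, 400, 500, 600],
     'A': [5, 2, 5, 8, 1, 2]}
).set_index(['Index0', 'Index1'])

# Remember the level names so the index can be rebuilt afterwards.
level_names = list(df.index.names)

flat = df.reset_index()                      # index levels become ordinary columns
selected = flat[flat['Index1'] < 400]        # usual boolean selection on a column
roundtrip = selected.set_index(level_names)  # restore the MultiIndex
```

It works, but it builds two intermediate frames just to filter on one level.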
My question is: Is there a way to do this directly? And how to do this when the DataFrame has a row multiindex?
The simplest way here is to use query, which can refer to named index levels directly:
df1 = df.query('Index1 < 400')
print (df1)
               A
Index0 Index1
0      100     5
1      200     2
2      300     5
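As a small extension, query can also take the threshold from a Python variable via the `@` prefix and combine conditions on several index levels. This is only a sketch; `limit`, `by_variable` and `both_levels` are hypothetical names, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {'Index0': [0, 1, 2, 3, 4, 5],
     'Index1': [100, 200, 300, 400, 500, 600],
     'A': [5, 2, 5, 8, 1, 2]}
).set_index(['Index0', 'Index1'])

limit = 400  # hypothetical threshold, referenced inside query with @

# Same selection as 'Index1 < 400', but parameterised.
by_variable = df.query('Index1 < @limit')

# Conditions on both index levels in one expression.
both_levels = df.query('Index0 >= 1 and Index1 < @limit')
```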
Or use get_level_values to select one level of the MultiIndex, combined with boolean indexing:
df1 = df[df.index.get_level_values('Index1') < 400]
Detail:
print (df.index.get_level_values('Index1'))
Int64Index([100, 200, 300, 400, 500, 600], dtype='int64', name='Index1')
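Because get_level_values returns the level as a plain Index, comparisons on it produce boolean arrays that can be combined with `&` and `|` like any other mask. A minimal sketch, with `idx0`, `idx1`, `mask` and `result` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame(
    {'Index0': [0, 1, 2, 3, 4, 5],
     'Index1': [100, 200, 300, 400, 500, 600],
     'A': [5, 2, 5, 8, 1, 2]}
).set_index(['Index0', 'Index1'])

idx0 = df.index.get_level_values('Index0')
idx1 = df.index.get_level_values('Index1')

# Each comparison yields a boolean array; parentheses are required around them.
mask = (idx1 >= 200) & (idx0 < 4)

result = df.loc[mask]
```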
If the levels have no names, select them by position instead. For query, use the special keyword ilevel_ followed by the position of the level:
df.index.names = [None, None]
print (df)
       A
0 100  5
1 200  2
2 300  5
3 400  8
4 500  1
5 600  2
df1 = df.query('ilevel_1 < 400')
df1 = df[df.index.get_level_values(1) < 400]
print (df1)
       A
0 100  5
1 200  2
2 300  5
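To check that the two positional approaches agree, here is a self-contained sketch that builds the frame with unnamed levels from the start (via MultiIndex.from_arrays, which is a choice made for this example only) and compares the results. `by_query` and `by_mask` are illustrative names.

```python
import pandas as pd

# MultiIndex built without names, so levels can only be addressed by position.
index = pd.MultiIndex.from_arrays(
    [[0, 1, 2, 3, 4, 5], [100, 200, 300, 400, 500, 600]]
)
df = pd.DataFrame({'A': [5, 2, 5, 8, 1, 2]}, index=index)

by_query = df.query('ilevel_1 < 400')            # second level via ilevel_1
by_mask = df[df.index.get_level_values(1) < 400]  # second level via position 1

same = by_query.equals(by_mask)
```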