I have a multi-indexed pandas dataframe like the following one.
import numpy as np
import pandas as pd
arrays = [np.array(['bar', 'bar', 'bar', 'bar', 'foo', 'foo', 'qux', 'qux']),
np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
np.array(['blo', 'bla', 'bla', 'blo', 'blo', 'blu', 'blo', 'bla'])]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df.sort_index(inplace=True)
which returns:
                    0         1         2         3
bar one bla  0.478461  1.030308  0.012688  0.137495
        blo  0.476041 -1.679848  1.346798  0.143225
    two bla  1.148882 -2.074197 -2.567959  1.258016
        blo  1.062280  3.846096 -0.346636  1.170822
foo one blo -0.761327  0.262105  0.151554  1.066616
    two blu  1.431951  0.043307 -0.326498  2.402536
qux one blo -0.622017 -0.566930  0.417977 -0.345238
    two bla  0.129273 -0.181396 -0.758381  0.995827
Now I want to select a subset by using a slice object:
idx = pd.IndexSlice
subset = df.loc[idx[['bar'], :, :], :]
This returns:
                    0         1         2         3
bar one bla  0.478461  1.030308  0.012688  0.137495
        blo  0.476041 -1.679848  1.346798  0.143225
    two bla  1.148882 -2.074197 -2.567959  1.258016
        blo  1.062280  3.846096 -0.346636  1.170822
Now I want to exclude all rows that have "blo" as the value of the third index level. I know that I could explicitly select every value except 'blo', but my real dataframe is very big and I only know the level values that should not appear in the subset.
What's the easiest way to exclude certain level values from the subset?
Thanks in advance!
IIUC, maybe you can mask your subset with:
subset = subset.iloc[subset.index.get_level_values(2) != 'blo']
You can do it this way:
In [263]:
subset.loc[subset.index.get_level_values(2) != 'blo']
Out[263]:
                    0         1         2         3
bar one bla -1.039335 -1.124656  0.057114 -0.284754
    two bla  0.007208 -0.403559 -1.317075 -0.340171
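Since the question asks about excluding several level values, the same boolean-mask idea can probably be extended with `Index.isin` and a negation. This is only a sketch built on the question's example data; the `excluded` list and the `mask` and `filtered` names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
arrays = [
    np.array(['bar', 'bar', 'bar', 'bar', 'foo', 'foo', 'qux', 'qux']),
    np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']),
    np.array(['blo', 'bla', 'bla', 'blo', 'blo', 'blu', 'blo', 'bla']),
]
df = pd.DataFrame(np.random.randn(8, 4), index=arrays)
df.sort_index(inplace=True)

idx = pd.IndexSlice
subset = df.loc[idx[['bar'], :, :], :]

# Hypothetical list of level values that must not appear in the result
excluded = ['blo', 'blu']

# True for every row whose third index level is NOT in `excluded`
mask = ~subset.index.get_level_values(2).isin(excluded)
filtered = subset.loc[mask]

print(filtered)
```

With the example data this should leave only the two 'bla' rows under 'bar'. Values in `excluded` that do not occur in the subset (such as 'blu' here) are simply ignored by the mask.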