I have the following DataFrame that uses a three-level MultiIndex:
In [1]: iterables = [[1, 2], ['foo', 'bar'], ['one', 'two']]
...: midx = pd.MultiIndex.from_product(iterables)
...: df = pd.DataFrame(np.random.randn(8), index=midx)
...: df
Out[1]:
0
1 foo one -0.217594
two -1.361612
bar one 2.477790
two 0.874409
2 foo one 0.403577
two 0.076111
bar one 1.423512
two 0.047898
I'd like to slice the index such that I keep all of the first level while only keeping the following combinations of the second two levels: ('foo', 'one')
and ('bar', 'two')
. That is, I'd like my output to look something like this:
0
1 foo one -0.217594
bar two 0.874409
2 foo one 0.403577
bar two 0.047898
Is it possible to do this in one line, using an attribute such as .loc
, for example?
I know that I can take cross-sections of the desired combinations separately using the .xs
function, but I'd prefer a shorter, more slice-like syntax. Specifically, having a one-liner is important for my use case.
It seems like the following should work:
df.loc[[(slice(None), 'foo', 'one'), (slice(None), 'bar', 'two')]]
But this results in a TypeError: unhashable type: 'slice'
.
2. pandas Get Unique Values in Column. Unique is also referred to as distinct, you can get unique values in the column using pandas Series. unique() function, since this function needs to call on the Series object, use df['column_name'] to get the unique values as a Series.
You can construct a Boolean mask by first dropping the first index level and then using pd.Index.isin
with a list of tuples:
df_masked = df[df.index.droplevel(0).isin([('foo', 'one'), ('bar', 'two')])]
print(df_masked)
0
1 foo one 1.510316
bar two 0.260862
2 foo one 0.813745
bar two 0.023386
Here's a solution based on DataFrame.query
. Your MultiIndex has unnamed levels, but level k
can be accessed with the special name ilevel_k
, according to the docs: "If the levels of the MultiIndex
are unnamed, you can refer to them using special names."
query_string = ('(ilevel_1 == "foo" & ilevel_2 == "one") | '
'(ilevel_1 == "bar" & ilevel_2 == "two")')
df.query(query_string)
0
1 foo one -0.217594
bar two 0.874409
2 foo one 0.403577
bar two 0.047898
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With