Pandas dataframe with MultiIndex: check if string is contained in index level

Let's say I have a multi-indexed pandas dataframe that looks like the following one, taken from the documentation.

import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]

df = pd.DataFrame(np.random.randn(8, 4), index=arrays)

Which looks like this:

                0         1         2         3
bar one -0.096648 -0.080298  0.859359 -0.030288
    two  0.043107 -0.431791  1.923893 -1.544845
baz one  0.639951 -0.008833 -0.227000  0.042315
    two  0.705281  0.446257 -1.108522  0.471676
foo one -0.579483 -2.261138 -0.826789  1.543524
    two -0.358526  1.416211  1.589617  0.284130
qux one  0.498149 -0.296404  0.127512 -0.224526
    two -0.286687 -0.040473  1.443701  1.025008

Now I only want the rows where "ne" is contained in the second level of the MultiIndex.

Is there any way to slice the MultiIndex for (partly) contained strings?

How do you check if a value is in a pandas index?

To check whether a value exists in the index of a pandas DataFrame, use the in operator on the DataFrame's index attribute, e.g. value in df.index. This tests for an exact label, not for a substring.
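As a minimal sketch of the membership test described above (the small index and the variable names are made up for illustration), the following shows that in matches whole labels only, which is why it cannot answer the substring question on its own:

```python
import pandas as pd

# Hypothetical two-level index, a shortened version of the one in the question.
index = pd.MultiIndex.from_arrays(
    [["bar", "bar", "baz", "baz"], ["one", "two", "one", "two"]]
)

# A full key (one label per level) is tested as a tuple.
full_key_present = ("bar", "one") in index
full_key_missing = ("bar", "three") in index

# A single level can be tested after extracting its labels.
second_level = index.get_level_values(1)
exact_label = "one" in second_level   # exact label match
substring = "ne" in second_level      # no label equals "ne"

print(full_key_present, full_key_missing, exact_label, substring)
```

Here only the exact labels are found; "ne" is not reported as present even though it is part of "one".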

What does the pandas function MultiIndex.from_tuples do?

MultiIndex.from_tuples() converts a list of tuples into a MultiIndex. It is one of several ways to construct a MultiIndex.
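For illustration, a short sketch of from_tuples follows. The tuples and the level names "first" and "second" are assumptions chosen to resemble the data in the question; they are not part of the original example.

```python
import pandas as pd

# Each tuple holds one label per level: (first level, second level).
tuples = [("bar", "one"), ("bar", "two"), ("baz", "one"), ("baz", "two")]

# The level names are optional and purely illustrative here.
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])

print(index.nlevels)
print(list(index.get_level_values("second")))
```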

How do you check if a string contains a substring in pandas?

The Series.str.contains method returns a boolean Series that is True where the original value contains the substring and False where it does not. A basic call looks like Series.str.contains("substring").
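A small sketch of str.contains on made-up data may help. Note that the pattern is interpreted as a regular expression by default; passing regex=False requests a literal substring match instead.

```python
import pandas as pd

# Illustrative labels, not taken from the question.
labels = pd.Series(["one", "two", "three"])

# True where the value contains the substring "ne".
mask = labels.str.contains("ne")

# With regex=False the dot is matched literally instead of as "any character".
dotted = pd.Series(["a.b", "acb"])
literal = dotted.str.contains(".", regex=False)

print(mask.tolist())
print(literal.tolist())
```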

You can build a boolean mask from the second index level and select rows with it:

df = df.iloc[df.index.get_level_values(1).str.contains('ne')]

which returns:

bar one -0.143200  0.523617  0.376458 -2.091154
baz one -0.198220  1.234587 -0.232862 -0.510039
foo one -0.426127  0.594426  0.457331 -0.459682
qux one -0.875160 -0.157073 -0.540459 -1.792235
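The same idea can be written as a self-contained sketch that keeps the original frame intact. The seeded random generator and the names mask and filtered are assumptions added for reproducibility and readability; plain boolean indexing (df[mask]) is used here in place of iloc.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question.
index = pd.MultiIndex.from_arrays([
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
])
rng = np.random.default_rng(0)  # seed chosen arbitrarily
df = pd.DataFrame(rng.standard_normal((8, 4)), index=index)

# Boolean mask: True where the second-level label contains "ne".
mask = df.index.get_level_values(1).str.contains("ne", regex=False)

# Select into a new variable so that df itself is not overwritten.
filtered = df[mask]
print(filtered)
```

Assigning the result to a new name avoids losing the unfiltered rows, which matters if further masks are to be applied to the full frame afterwards.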

EDIT: It is also possible to combine logical masks on multiple levels. Applied to the original, unfiltered df, e.g.:

df = df.iloc[(df.index.get_level_values(0).str.contains('ba')) | (df.index.get_level_values(1).str.contains('ne'))]

returns:

bar one  0.620279  1.525277  0.379649 -0.032608
    two  0.465240 -0.190038  0.795730  1.720368
baz one  0.986828 -0.080394 -0.303319  0.747483
    two  0.487534  1.597006  0.114551  0.299502
foo one -0.085700  0.112433  0.704043  0.264280
qux one -0.291758 -1.071669  0.794354 -1.805530
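As a sketch of combining masks across levels, the following builds one mask per level and joins them with | (either condition) or & (both conditions). The variable names and the seed are illustrative assumptions, and np.asarray is used only as a precaution so that the operators act element-wise on plain boolean arrays.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_arrays([
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
])
rng = np.random.default_rng(0)  # seed chosen arbitrarily
df = pd.DataFrame(rng.standard_normal((8, 4)), index=index)

# One boolean mask per index level.
first = np.asarray(df.index.get_level_values(0).str.contains("ba", regex=False))
second = np.asarray(df.index.get_level_values(1).str.contains("ne", regex=False))

either = df[first | second]  # "ba" in level 0 OR "ne" in level 1
both = df[first & second]    # "ba" in level 0 AND "ne" in level 1

print(len(either), len(both))
```

Each mask needs its own parentheses when written inline, because | and & bind more tightly than comparison operators in Python.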

Tags: python, pandas

Cord Kaldemeyer

1 Answer

Fabio Lamanna
