
Pandas - filter and regex search the index of DataFrame

I have a DataFrame whose columns are a MultiIndex and whose index is a list of names, i.e. index=['Andrew', 'Bob', 'Calvin', ...].

I would like to write a function that returns all rows of the DataFrame whose index label is 'Bob', or starts with the letter 'A', or starts with a lowercase letter. How can this be done?

I looked into df.filter() with the regex argument, but it fails and I get:

df.filter(regex='a')
TypeError: expected string or buffer

or:

df.filter(regex=('a',1))
TypeError: first argument must be string or compiled pattern

I've tried other things, such as passing re.compile('a'), to no avail.

Shatnerz asked Feb 25 '16 21:02



1 Answer

Part of my problem with filter turned out to be an outdated version of pandas. After updating, I no longer get the TypeError, and after some experimenting it looks like filter fits my needs. Here is what I found.

Calling df.filter(regex='string') returns the columns whose labels match the regex. This appears to be the same as df.filter(regex='string', axis=1).
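
As a quick check of the column behaviour, here is a minimal sketch. The frame, its column names, and the pattern are made up for illustration, and the columns are a flat index rather than the MultiIndex from the question.

```python
import pandas as pd

# Hypothetical example data; none of these names come from the question.
df = pd.DataFrame(
    {"alpha": [1, 2, 3], "beta": [4, 5, 6], "gamma": [7, 8, 9]},
    index=["Andrew", "Bob", "calvin"],
)

# With no axis given, DataFrame.filter matches the regex against column labels.
cols_default = df.filter(regex=r"^[ab]")

# Passing axis=1 explicitly should select the same columns.
cols_explicit = df.filter(regex=r"^[ab]", axis=1)

print(cols_default.columns.tolist())
print(cols_default.equals(cols_explicit))
```

The rows are untouched in both calls; only the set of columns changes.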

To search the index instead, use df.filter(regex='string', axis=0).
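
Putting that together for the three cases in the question, a rough sketch might look like the following. The helper name rows_matching and the sample data are hypothetical. It assumes a recent pandas where filter applies re.search to each label, so the patterns are anchored with ^ and $ where an exact or prefix match is wanted.

```python
import pandas as pd

# Hypothetical example data with string labels in the row index.
df = pd.DataFrame(
    {"score": [90, 85, 70, 60]},
    index=["Andrew", "Bob", "Calvin", "dave"],
)


def rows_matching(frame, pattern):
    """Return the rows whose index label matches the regex (re.search semantics)."""
    return frame.filter(regex=pattern, axis=0)


exact_bob = rows_matching(df, r"^Bob$")      # label is exactly 'Bob'
starts_with_a = rows_matching(df, r"^A")     # label starts with capital 'A'
starts_lower = rows_matching(df, r"^[a-z]")  # label starts with a lowercase letter

print(exact_bob.index.tolist())
print(starts_with_a.index.tolist())
print(starts_lower.index.tolist())
```

Matching is case-sensitive, so r"^A" will not pick up a label such as 'andrew'. A boolean mask such as df[df.index.str.contains(pattern)] is another way to get the same rows.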

Shatnerz answered Oct 21 '22 03:10
