
Slice pandas DataFrame where column's value exists in another array

I have a pandas.DataFrame with a large amount of data. One column contains randomly repeating keys. In a separate array I have a list of these keys, and I would like to select the rows whose key appears in that list, along with the data from the other columns in those rows.

keys:

keys = numpy.array([1,5,7])

data:

 indx   a      b     c   d
    0   5   25.0  42.1  13
    1   2   31.7  13.2   1
    2   9   16.5   0.2   9
    3   7   43.1  11.0  10
    4   1   11.2  31.6  10
    5   5   15.6   2.8  11
    6   7   14.2  19.0   4

I would like to select all rows from the DataFrame where the value in column a matches a value in keys.

Desired result:

 indx   a      b     c   d
    0   5   25.0  42.1  13
    3   7   43.1  11.0  10
    4   1   11.2  31.6  10
    5   5   15.6   2.8  11
    6   7   14.2  19.0   4

ryanjdillon asked Mar 20 '14 18:03

1 Answer

You can use isin:

>>> df[df.a.isin(keys)]
      a     b     c   d
indx                   
0     5  25.0  42.1  13
3     7  43.1  11.0  10
4     1  11.2  31.6  10
5     5  15.6   2.8  11
6     7  14.2  19.0   4

[5 rows x 4 columns]
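For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the question by hand (the construction code and the names mask, selected and excluded are my own assumptions, not from the original post) and shows that isin simply returns a boolean Series that is then used as a row mask:

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question (hand-typed, for illustration only).
keys = np.array([1, 5, 7])
df = pd.DataFrame(
    {
        "a": [5, 2, 9, 7, 1, 5, 7],
        "b": [25.0, 31.7, 16.5, 43.1, 11.2, 15.6, 14.2],
        "c": [42.1, 13.2, 0.2, 11.0, 31.6, 2.8, 19.0],
        "d": [13, 1, 9, 10, 10, 11, 4],
    },
    index=pd.Index(range(7), name="indx"),
)

# isin returns a boolean Series: True where the value in column a is in keys.
mask = df["a"].isin(keys)

# Indexing with the mask keeps only the rows where it is True.
selected = df[mask]

# Inverting the mask with ~ gives the rows whose key is NOT in keys.
excluded = df[~mask]

print(selected)
print(excluded)
```

Keeping the mask in its own variable is optional, but it makes it easy to reuse or invert the same condition.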

or query:

>>> df.query("a in @keys")
      a     b     c   d
indx                   
0     5  25.0  42.1  13
3     7  43.1  11.0  10
4     1  11.2  31.6  10
5     5  15.6   2.8  11
6     7  14.2  19.0   4

[5 rows x 4 columns]
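A small sketch of the query form, again rebuilding the question's data by hand (the construction code and variable names are assumptions for illustration). Inside the query string, the @ prefix refers to a Python variable in the calling scope, while a bare name like a refers to a column:

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question (hand-typed, for illustration only).
keys = np.array([1, 5, 7])
df = pd.DataFrame(
    {
        "a": [5, 2, 9, 7, 1, 5, 7],
        "b": [25.0, 31.7, 16.5, 43.1, 11.2, 15.6, 14.2],
        "c": [42.1, 13.2, 0.2, 11.0, 31.6, 2.8, 19.0],
        "d": [13, 1, 9, 10, 10, 11, 4],
    },
    index=pd.Index(range(7), name="indx"),
)

# "a" is the column; "@keys" is the local variable keys.
via_query = df.query("a in @keys")

# The same selection written with isin, for comparison.
via_isin = df[df["a"].isin(keys)]

# "not in" selects the complement.
not_in_keys = df.query("a not in @keys")

print(via_query.equals(via_isin))
print(not_in_keys)
```

Both forms should give the same rows here; which one to use is mostly a matter of taste, though the string form can read more naturally when several conditions are combined.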

DSM answered Sep 22 '22 11:09