
Slicing a pandas DataFrame with an array of indices

Tags: python, pandas

I have a pandas DataFrame like this:

df = pd.DataFrame({'A': [5, 6, 3, 4, 4, 5, 6, 7, 12, 13],
                   'B': [1, 2, 3, 5, 5, 6, 7, 8, 9, 10]})

df

    A   B
0   5   1
1   6   2
2   3   3
3   4   5
4   4   5
5   5   6
6   6   7  
7   7   8
8  12   9
9  13  10

and I have an array of indices:

array = np.array([0,1,2,4,7,8])

Now I can subset the DataFrame with the array of indices like this:

df.iloc[array]

This gives me a DataFrame containing only the rows whose indices are in the array:

    A  B
0   5  1
1   6  2
2   3  3
4   4  5
7   7  8
8  12  9

Now I want all the rows whose indices are not in the array, so the row indices I want are [3, 5, 6, 9]. I am trying something like this, but it gives me an error:

df.iloc[~loc]

Neil asked Jan 31 '16 04:01


1 Answer

You can use df.index.isin to build a boolean mask of the index labels that appear in the array, then invert that mask with ~:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [5, 6, 3, 4, 4, 5, 6, 7, 12, 13],
                   'B': [1, 2, 3, 5, 5, 6, 7, 8, 9, 10]})

print(df)
    A   B
0   5   1
1   6   2
2   3   3
3   4   5
4   4   5
5   5   6
6   6   7
7   7   8
8  12   9
9  13  10

array = np.array([0,1,2,4,7,8])
print(array)
[0 1 2 4 7 8]

# True where the index label is present in array
print(df.index.isin(array))
[ True  True  True False  True False False  True  True False]

# ~ flips the mask: True where the index label is NOT in array
print(~df.index.isin(array))
[False False False  True False  True  True False False  True]

# boolean indexing keeps only the rows where the mask is True
print(df[~df.index.isin(array)])
    A   B
3   4   5
5   5   6
6   6   7
9  13  10
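
Note that isin compares index labels, while iloc in the question selects by position. The two agree here only because the DataFrame has the default 0..n-1 index. If the index were something else, a positional complement may be closer to what was asked. A minimal sketch, assuming array holds valid row positions (the names mask, other_positions, rest_by_mask and rest_by_setdiff are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4, 4, 5, 6, 7, 12, 13],
                   'B': [1, 2, 3, 5, 5, 6, 7, 8, 9, 10]})
array = np.array([0, 1, 2, 4, 7, 8])

# Option 1: boolean mask over positions.
# Start with every row selected, then switch off the positions in array.
mask = np.ones(len(df), dtype=bool)
mask[array] = False
rest_by_mask = df.iloc[mask]

# Option 2: set difference between all positions and the ones in array.
other_positions = np.setdiff1d(np.arange(len(df)), array)
rest_by_setdiff = df.iloc[other_positions]

print(rest_by_mask)
```

Both options should return the rows at positions 3, 5, 6 and 9 regardless of what the index labels are.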
jezrael answered Oct 18 '22 10:10