numpy - ndarray - how to remove rows based on another array

Tags: python, numpy

I want to remove rows from an ndarray based on the values in another array. For example:

k = [1,3,99]

n = [
  [1,'a'],
  [2,'b'],
  [3,'c'],
  [4,'c'],
  [.....],
  [99, 'a'],
  [100,'e']
]

Expected result:

out = [
  [2,'b'],
  [4,'c'],
  [.....],
  [100,'e']
]

Every row whose first-column value appears in k should be removed.

Wenhui asked Jun 06 '26 17:06

1 Answer

You can use np.in1d to build a boolean mask marking which first-column values of n appear in k, then index n with the inverted mask to keep only the non-matching rows, like so -

n[~np.in1d(n[:,0].astype(int), k)]

If the first column is already of int dtype, skip the .astype(int) conversion step.

Sample run -

In [41]: n
Out[41]: 
array([['1', 'a'],
       ['2', 'b'],
       ['3', 'c'],
       ['4', 'c'],
       ['99', 'a'],
       ['100', 'e']], 
      dtype='|S21')

In [42]: k
Out[42]: [1, 3, 99]

In [43]: n[~np.in1d(n[:,0].astype(int), k)]
Out[43]: 
array([['2', 'b'],
       ['4', 'c'],
       ['100', 'e']], 
      dtype='|S21')
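The same masking idea can be sketched as a small self-contained script. This is only an illustration: the helper name drop_rows_by_key is hypothetical, the sample array is assumed to hold strings as in the run above, and np.isin is swapped in for np.in1d because recent NumPy releases deprecate np.in1d in favour of it.

```python
import numpy as np

# Sample data mirroring the question; the string dtype is an assumption.
n = np.array([['1', 'a'], ['2', 'b'], ['3', 'c'],
              ['4', 'c'], ['99', 'a'], ['100', 'e']])
k = [1, 3, 99]


def drop_rows_by_key(arr, keys):
    """Return the rows of arr whose first column (read as int) is not in keys.

    Hypothetical helper for illustration only.
    """
    first_col = arr[:, 0].astype(int)             # convert '1', '2', ... to integers
    keep = np.isin(first_col, keys, invert=True)  # True where the key is NOT in keys
    return arr[keep]


out = drop_rows_by_key(n, k)
print(out)
```

Passing invert=True is equivalent to negating the mask with ~, so the selection matches n[~np.in1d(n[:,0].astype(int), k)].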

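For performance, if the first column is numeric and sorted in ascending order, and every value in k is actually present in it, we can use np.searchsorted (a value of k that is missing from the column would make it mask the wrong row or index past the end) -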

mask = np.ones(n.shape[0],dtype=bool)
mask[np.searchsorted(n[:,0], k)] = 0
out = n[mask]
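
np.searchsorted returns insertion positions rather than exact matches, so a more defensive variant may be useful when some values of k might not occur in the first column. The sketch below is one possible way to guard against that, not part of the original answer: drop_rows_sorted is a hypothetical helper, and it assumes a non-empty numeric array whose first column is sorted ascending with unique values.

```python
import numpy as np


def drop_rows_sorted(arr, keys):
    """Drop rows whose first column is in keys.

    Hypothetical helper; assumes arr is non-empty and arr[:, 0] is numeric,
    sorted ascending, and free of duplicates.
    """
    col = arr[:, 0]
    keys = np.asarray(keys)
    idx = np.searchsorted(col, keys)          # insertion points, not guaranteed matches
    idx = np.minimum(idx, len(col) - 1)       # keep positions past the end in range
    found = col[idx] == keys                  # True only where the key really exists
    mask = np.ones(len(col), dtype=bool)
    mask[idx[found]] = False                  # mask out only the confirmed matches
    return arr[mask]


# Assumed numeric sample data; 7 and 500 are deliberately absent from the first column.
n_num = np.array([[1, 10], [2, 20], [3, 30], [4, 40], [99, 50], [100, 60]])
out_sorted = drop_rows_sorted(n_num, [1, 3, 99, 7, 500])
print(out_sorted)
```

The extra equality check costs one more pass over k but keeps absent keys from removing a neighbouring row.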
Divakar answered Jun 09 '26 07:06
