
Boolean indexing from a subset of a list in Python

I have an array of names, along with a corresponding array of data. I also have a second array holding a smaller subset of those names:

data = np.array([75., 49., 80., 87., 99.])
arr1 = np.array(['Bob', 'Joe', 'Mary', 'Ellen', 'Dick'], dtype='|S5')
arr2 = np.array(['Mary', 'Dick'], dtype='|S5')

I am trying to make a new array of data corresponding only to the names that appear in arr2. This is what I have been able to come up with on my own:

TF = []
for i in arr1: 
    if i in arr2:
        TF.append(True)
    else:
        TF.append(False)
new_data = data[TF]

Is there a more efficient way of doing this that doesn't involve a for loop? I should mention that the arrays themselves are being input from an external file, and there are actually multiple arrays of data, so I can't really change anything about that.

user3195970 asked Oct 02 '22 21:10



1 Answer

You can use numpy.in1d, which tests whether each element of the first array is also present in the second array.

Demo

>>> new_data = data[np.in1d(arr1, arr2)]
>>> new_data
array([ 80.,  99.])

in1d returns an ndarray of bools, which is analogous to the list you constructed in your original code:

>>> np.in1d(arr1, arr2)
array([False, False,  True, False,  True], dtype=bool)
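
Since you mention having multiple arrays of data, it may help to build the mask once and reuse it. Here is a minimal sketch of that idea, assuming NumPy 1.13 or later, where np.isin is available (np.in1d is deprecated as of NumPy 2.0 in favor of np.isin). The name other_data and its values are made up for illustration and are not from your file:

```python
import numpy as np

data = np.array([75., 49., 80., 87., 99.])
arr1 = np.array(['Bob', 'Joe', 'Mary', 'Ellen', 'Dick'], dtype='|S5')
arr2 = np.array(['Mary', 'Dick'], dtype='|S5')

# Build the boolean mask once: True where the name in arr1 also appears in arr2.
mask = np.isin(arr1, arr2)

# The same mask can index any data array that is aligned with arr1.
new_data = data[mask]

# Hypothetical second data array, for illustration only.
other_data = np.array([1., 2., 3., 4., 5.])
new_other = other_data[mask]
```

For 1-D inputs like these, np.isin(arr1, arr2) should give the same mask as np.in1d(arr1, arr2), so either call works with the indexing shown above.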
mdml answered Oct 05 '22 12:10
