Numpy: For every element in one array, find the index in another array

Question

I have two 1D arrays, x & y, one smaller than the other. I'm trying to find the index of every element of y in x.

I've found two naive ways to do this: the first is slow, and the second is memory-intensive.

The slow way

indices = []
for iy in y:
    # np.where returns a tuple of index arrays; take the first match in x
    indices.append(np.where(x == iy)[0][0])

The memory hog

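# Tile x along rows and y along columns, both with shape (len(y), len(x))
xe = np.outer([1,]*len(y), x)
ye = np.outer(y, [1,]*len(x))
# Rows index y (discarded); columns are the matching positions in x
junk, indices = np.where(np.equal(xe, ye))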

Is there a faster or less memory-intensive approach? Ideally the search would take advantage of the fact that we are searching for not one thing in a list but many things, and is thus slightly more amenable to parallelization. Bonus points if you don't assume that every element of y is actually in x.

RomanS · Accepted Answer

I want to suggest a one-line solution:

indices = np.where(np.in1d(x, y))[0]

The result is an array of the indices in x whose elements also occur in y. Note that they come out in the order they appear in x, not in the order of y.

If a boolean mask is enough, you can drop numpy.where and use the result of np.in1d directly.
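
To make the difference between the mask and the indices concrete, here is a small sketch. The sample arrays are chosen only for illustration, and it uses np.isin, which recent NumPy releases recommend over np.in1d; for 1D input the two behave the same way.

```python
import numpy as np

# Illustrative data, not from the question
x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])

mask = np.isin(x, y)         # boolean mask over x: True where x[i] occurs in y
indices = np.where(mask)[0]  # positions in x, in x order (not aligned with y)

print(mask)
print(indices)               # [1 3 6 7]
```

Both positions of the duplicated 6 are reported, and nothing in the output says which element of y each index belongs to, so this fits best when you only need membership.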

HYRY · Answer

As Joe Kington said, searchsorted() can find elements very quickly. To deal with elements that are not in x, you can compare the looked-up values against the original y and create a masked array:

import numpy as np
x = np.array([3,5,7,1,9,8,6,6])
y = np.array([2,1,5,10,100,6])

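index = np.argsort(x)                        # permutation that sorts x
sorted_x = x[index]
sorted_index = np.searchsorted(sorted_x, y)  # insertion points of y in sorted x

# Map back to positions in x; mode="clip" guards insertion points past the end
yindex = np.take(index, sorted_index, mode="clip")
mask = x[yindex] != y                        # True where y is not actually in x

result = np.ma.array(yindex, mask=mask)
print(result)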

The result is:

[-- 3 1 -- -- 6]
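
If a plain integer array is more convenient than a masked array, the same idea can be wrapped in a helper. This is only a sketch: find_indices and the missing=-1 sentinel are names made up here, not part of NumPy, and a stable sort is assumed so that duplicates in x resolve to their first occurrence.

```python
import numpy as np

def find_indices(x, y, missing=-1):
    """For each element of y, return its index in x, or `missing` if absent.

    Hypothetical helper for illustration; not a NumPy function.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.size == 0:
        # Nothing can be found in an empty array
        return np.full(y.shape, missing)

    order = np.argsort(x, kind="stable")   # stable: ties keep their x order
    pos = np.searchsorted(x[order], y)     # insertion points in sorted x
    pos = np.clip(pos, 0, x.size - 1)      # keep points past the end in range
    candidates = order[pos]                # map back to positions in x
    found = x[candidates] == y             # did the lookup really match?
    return np.where(found, candidates, missing)

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])
print(find_indices(x, y))                  # [-1  3  1 -1 -1  6]
```

A sentinel of -1 is itself a valid NumPy index (the last element), so callers should test for it before indexing; the masked array above avoids that trap.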

Joe Kington · Answer

How about this?

It does assume that every element of y is in x (and will return results even for elements that aren't!), but it is much faster.

import numpy as np

# Generate some example data...
x = np.arange(1000)
np.random.shuffle(x)
y = np.arange(100)

# Actually perform the operation...
xsorted = np.argsort(x)
ypos = np.searchsorted(x[xsorted], y)
indices = xsorted[ypos]
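
Since this version returns an index even for values that are not in x, it can be worth checking the result afterwards. A minimal sketch, with made-up data in which 7 is deliberately missing from x:

```python
import numpy as np

rng = np.random.default_rng(0)         # seeded only to make the sketch repeatable
x = rng.permutation(1000) * 2          # the even numbers 0..1998 in random order
y = np.array([4, 7, 10])               # 7 is deliberately absent from x

xsorted = np.argsort(x)
ypos = np.searchsorted(x[xsorted], y)
indices = xsorted[ypos]                # raises IndexError if any y > x.max()

found = x[indices] == y                # True only where the lookup really matched
print(found)                           # [ True False  True]
```

For the missing value, searchsorted returns the insertion point, so the index silently points at the next larger element of x (8 here); comparing x[indices] with y is what exposes it.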
