
How to find indices of a reordered numpy array?

Say I have a sorted numpy array:

arr = np.array([[0.0, 0.0],
                [0.5, 0.0],
                [1.0, 0.0],
                [0.0, 0.5],
                [0.5, 0.5],
                [1.0, 0.5],
                [0.0, 1.0],
                [0.5, 1.0],
                [1.0, 1.0]])

and suppose I perform a nontrivial operation on it that yields a new array with the same rows in a different order:

arr2 = np.array([[0.5, 0.0],
                 [0.0, 0.0],
                 [0.0, 0.5],
                 [1.0, 0.0],
                 [0.5, 0.5],
                 [1.0, 0.5],
                 [0.0, 1.0],
                 [1.0, 1.0],
                 [0.5, 1.0]])

The question is: how do you get the index in arr of each element of arr2? In other words, I want a function that takes both arrays and returns an array of the same length as arr2 holding, for each element of arr2, its index in arr. For example, the first element of the returned array would be the index in arr of the first element of arr2.

where_things_are(arr2, arr)
# returns: array([1, 0, 3, 2, 4, 5, 6, 8, 7])

Does a function like this already exist in numpy?

EDIT:

I tried:

np.array([np.where((arr == x).all(axis=1)) for x in arr2])

which returns what I want, but my question still holds: is there a more efficient way of doing this using numpy methods?

EDIT2:

It should also work if arr2 is shorter than the original array (for example, if I removed some elements from it). So the task is not to find and invert a permutation, but to find where each element is located.

fgoudra asked Feb 14 '17 17:02


3 Answers

The key is inverting permutations. The code below works even if the original array is not sorted. If it is sorted, find_map_sorted can be used instead; when no elements have been lost it is faster, because only the reordered array has to be sorted.

UPDATE: Adapting to the OP's ever-changing requirements, I've added a branch that handles lost elements.

import numpy as np

def invperm(p):
    q = np.empty_like(p)
    q[p] = np.arange(len(p))
    return q

def find_map(arr1, arr2):
    o1 = np.argsort(arr1)
    o2 = np.argsort(arr2)
    return o2[invperm(o1)]

def find_map_2d(arr1, arr2):
    o1 = np.lexsort(arr1.T)
    o2 = np.lexsort(arr2.T)
    return o2[invperm(o1)]

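def find_map_sorted(arr1, arrs=None):
    # arr1: the reordered rows; arrs: the lexsorted original with unique rows
    # (only needed when rows may have been lost from arr1)
    if arrs is None:
        # no rows lost: the inverse of arr1's sort permutation is the answer
        o1 = np.lexsort(arr1.T)
        return invperm(o1)
    # make unique-able: view each row as a single record so np.unique compares
    # whole rows; columns are reversed so the record order matches lexsort order
    rdtype = np.rec.fromrecords(arrs[:1, ::-1]).dtype
    recstack = np.r_[arrs[:,::-1], arr1[:,::-1]].view(rdtype).view(np.recarray)
    uniq, inverse = np.unique(recstack, return_inverse=True)
    # ravel() guards against inverse coming back with the (n, 1) shape of the view
    return inverse.ravel()[len(arrs):]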

x1 = np.random.permutation(100000)
x2 = np.random.permutation(100000)
print(np.all(x2[find_map(x1, x2)] == x1))

rows = np.random.random((100000, 8))
r1 = rows[x1, :]
r2 = rows[x2, :]
print(np.all(r2[find_map_2d(r1, r2)] == r1))

rs = r1[np.lexsort(r1.T), :]
print(np.all(rs[find_map_sorted(r2), :] == r2))

# lose ten elements
print(np.all(rs[find_map_sorted(r2[:-10], rs), :] == r2[:-10]))
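
As a quick check against the small arrays from the question, the same permutation-inversion idea can be sketched as follows. This is an illustration only, not part of the original answer: the names o_arr, o_arr2 and idx are chosen here for the example, and the rows are assumed to be unique and present in both arrays.

```python
import numpy as np

def invperm(p):
    # q[p[i]] = i, so q undoes the permutation p
    q = np.empty_like(p)
    q[p] = np.arange(len(p))
    return q

arr = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
                [0.0, 0.5], [0.5, 0.5], [1.0, 0.5],
                [0.0, 1.0], [0.5, 1.0], [1.0, 1.0]])
arr2 = np.array([[0.5, 0.0], [0.0, 0.0], [0.0, 0.5],
                 [1.0, 0.0], [0.5, 0.5], [1.0, 0.5],
                 [0.0, 1.0], [1.0, 1.0], [0.5, 1.0]])

# Sort both arrays by row (lexsort treats the last column as the primary key).
o_arr = np.lexsort(arr.T)
o_arr2 = np.lexsort(arr2.T)

# invperm(o_arr2)[i] is the sorted rank of arr2[i];
# o_arr[rank] is the index in arr of the row with that rank.
idx = o_arr[invperm(o_arr2)]

print(idx)                            # [1 0 3 2 4 5 6 8 7]
print(np.array_equal(arr[idx], arr2)) # True
```

Note the argument order: indexing o_arr with the inverse of o_arr2 gives positions in arr for each row of arr2, which is the direction the question asks for.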
Paul Panzer answered Oct 20 '22 08:10


Here is a way using NumPy broadcasting:

In [10]: ind = np.where(arr[:, None] == arr2[None, :])[1]

In [11]: ind[np.where(np.diff(ind)==0)]
Out[11]: array([1, 0, 3, 2, 4, 5, 6, 8, 7])

The idea is to add a dimension to each array so that comparing them broadcasts to a 3D boolean array of shape (len(arr), len(arr2), 2). Because each row has length 2, a pair of rows that is equal in both components shows up in the np.where output as two consecutive entries with the same index along the second axis. For a better demonstration, here is the full result of np.where, without selecting the second axis:

In [96]: np.where(arr[:, None] == arr2[None, :])
Out[96]: 
(array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3,
        3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7,
        7, 7, 8, 8, 8, 8, 8, 8]),
 array([0, 1, 1, 2, 3, 6, 0, 0, 1, 3, 4, 8, 0, 1, 3, 3, 5, 7, 1, 2, 2, 4, 5,
        6, 0, 2, 4, 4, 5, 8, 2, 3, 4, 5, 5, 7, 1, 2, 6, 6, 7, 8, 0, 4, 6, 7,
        8, 8, 3, 5, 6, 7, 7, 8]),
 array([1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1,
        0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1,
        0, 1, 0, 0, 1, 0, 1, 1]))

To find those matches, we then only need the positions where the diff of the second index array is 0.
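
A variant of the same broadcasting idea that reduces over the column axis directly, instead of looking for consecutive indices, might look like the sketch below. It is not the code from this answer: the function name match_rows_broadcast is made up for the example, rows are compared for exact floating-point equality, and the intermediate boolean array has len(arr2) * len(arr) * ncols entries, so this is only reasonable for small inputs.

```python
import numpy as np

def match_rows_broadcast(arr, arr2):
    """For each row of arr2, return the index of the first equal row in arr."""
    # eq[i, j] is True when arr2[i] equals arr[j] in every column
    eq = (arr2[:, None, :] == arr[None, :, :]).all(axis=2)
    if not eq.any(axis=1).all():
        raise ValueError("some rows of arr2 do not occur in arr")
    # argmax returns the position of the first True in each row of eq
    return eq.argmax(axis=1)

arr = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
                [0.0, 0.5], [0.5, 0.5], [1.0, 0.5],
                [0.0, 1.0], [0.5, 1.0], [1.0, 1.0]])
arr2 = np.array([[0.5, 0.0], [0.0, 0.0], [0.0, 0.5],
                 [1.0, 0.0], [0.5, 0.5], [1.0, 0.5],
                 [0.0, 1.0], [1.0, 1.0], [0.5, 1.0]])

print(match_rows_broadcast(arr, arr2))       # [1 0 3 2 4 5 6 8 7]
print(match_rows_broadcast(arr, arr2[:-2]))  # [1 0 3 2 4 5 6]
```

Because the comparison is reduced with all(axis=2), this form works for any number of columns and also when arr2 is shorter than arr.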

Mazdak answered Oct 20 '22 08:10


The numpy_indexed package (disclaimer: I am its author) contains efficient functionality for exactly this type of problem; npi.indices is the ndarray equivalent of list.index.

import numpy_indexed as npi
idx = npi.indices(arr, arr2)

This returns a list of indices such that arr[idx] == arr2. If arr2 contains elements not present in arr, a ValueError is raised; but you can control that with the 'missing' kwarg.

As for whether this functionality is included in numpy: yes, in the sense that numpy is a Turing-complete ecosystem. But not really, if you count the number of lines of code required to implement it in an efficient, correct, and general manner.
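
To give a rough sense of what a plain-NumPy version involves, here is a sort-based sketch. It is not how numpy_indexed is implemented; the function name row_indices and its fill parameter are hypothetical, rows are compared for exact equality, and rows of arr2 that do not occur in arr are reported as fill rather than raising.

```python
import numpy as np

def row_indices(arr, arr2, fill=-1):
    """Index in arr of each row of arr2, or `fill` where the row is absent."""
    # Give every distinct row across both arrays an integer label.
    stacked = np.concatenate([arr, arr2], axis=0)
    _, labels = np.unique(stacked, axis=0, return_inverse=True)
    labels = labels.ravel()  # some NumPy versions return a 2D inverse here
    lab_arr, lab_arr2 = labels[:len(arr)], labels[len(arr):]

    # Sort arr's labels so arr2's labels can be located by binary search.
    order = np.argsort(lab_arr, kind="stable")
    sorted_lab = lab_arr[order]
    pos = np.searchsorted(sorted_lab, lab_arr2)
    pos = np.minimum(pos, len(arr) - 1)  # keep indices in range

    # A hit is only valid if the label found there is the one searched for.
    found = sorted_lab[pos] == lab_arr2
    return np.where(found, order[pos], fill)

arr = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0],
                [0.0, 0.5], [0.5, 0.5], [1.0, 0.5],
                [0.0, 1.0], [0.5, 1.0], [1.0, 1.0]])
arr2 = np.array([[0.5, 0.0], [0.0, 0.0], [0.0, 0.5],
                 [1.0, 0.0], [0.5, 0.5], [1.0, 0.5],
                 [0.0, 1.0], [1.0, 1.0], [0.5, 1.0]])

print(row_indices(arr, arr2))  # [1 0 3 2 4 5 6 8 7]
```

The work is dominated by sorting, so it scales roughly as O(n log n) in the total number of rows, at the cost of noticeably more code than the one-line library call.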

Eelco Hoogendoorn answered Oct 20 '22 06:10