Find the row indexes of several values in a numpy array

Tags: python, arrays, numpy

I have an array X:

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])

And I want to find the row index of each of several rows in this array:

searched_values = np.array([[4, 2],
                            [3, 3],
                            [5, 6]])

For this example I would like a result like:

[0,3,4]

I have code that does this, but I think it is overly complicated:

import numpy as np

X = np.array([[4, 2],
              [9, 3],
              [8, 5],
              [3, 3],
              [5, 6]])

searched_values = np.array([[4, 2],
                            [3, 3],
                            [5, 6]])

result = []
for s in searched_values:
    # Row of X whose every element equals the searched row
    idx = np.argwhere([np.all((X-s)==0, axis=1)])[0][1]
    result.append(idx)

print(result)

I found this answer for a similar question but it works only for 1d arrays.

Is there a way to do what I want in a simpler way?

asked Jul 30 '16 12:07

Octoplus

1 Answer

Approach #1

One approach would be to use NumPy broadcasting, like so -

np.where((X==searched_values[:,None]).all(-1))[1]
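To make the broadcasting step concrete, here is a minimal sketch using the arrays from the question. The intermediate names `matches` and `row_matches` are only illustrative and do not appear in the original one-liner.

```python
import numpy as np

X = np.array([[4, 2], [9, 3], [8, 5], [3, 3], [5, 6]])
searched_values = np.array([[4, 2], [3, 3], [5, 6]])

# searched_values[:, None] has shape (3, 1, 2); comparing it with X (5, 2)
# broadcasts to an elementwise boolean array of shape (3, 5, 2).
matches = X == searched_values[:, None]

# Collapse the last axis: entry (i, j) is True when searched_values[i] equals X[j].
row_matches = matches.all(-1)

# np.where returns (indices into searched_values, indices into X);
# the second array holds the row indexes we are after.
result = np.where(row_matches)[1]
print(result.tolist())  # [0, 3, 4]
```

The intermediate boolean array holds len(searched_values) * len(X) * X.shape[1] elements, which is the memory cost that the other approaches avoid.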

Approach #2

A memory-efficient approach is to convert each row to its linear index equivalent and then use np.in1d, like so -

dims = X.max(0)+1
out = np.where(np.in1d(np.ravel_multi_index(X.T,dims),\
                    np.ravel_multi_index(searched_values.T,dims)))[0]
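One property of this approach is worth seeing in a small example: the membership test scans X, so the returned indexes come out in the row order of X, not in the order of searched_values. The sketch below uses np.isin in place of np.in1d, on the assumption that your NumPy version provides it (newer releases deprecate np.in1d); the names `X1D`, `searched1D` and `mask` are illustrative.

```python
import numpy as np

X = np.array([[4, 2], [9, 3], [8, 5], [3, 3], [5, 6]])
# Deliberately listed in a different order than the rows appear in X.
searched_values = np.array([[5, 6], [4, 2]])

dims = X.max(0) + 1
X1D = np.ravel_multi_index(X.T, dims)                       # one scalar per row of X
searched1D = np.ravel_multi_index(searched_values.T, dims)  # one scalar per searched row

# True for every row of X whose scalar appears among the searched scalars.
mask = np.isin(X1D, searched1D)
out = np.where(mask)[0]
print(out.tolist())  # [0, 4], i.e. X's order, not the order of searched_values
```

If the result has to follow the order of searched_values, the broadcasting or np.searchsorted approach is the better fit.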

Approach #3

Another memory-efficient approach uses np.searchsorted, with the same idea of converting rows to linear index equivalents, like so -

dims = X.max(0)+1
X1D = np.ravel_multi_index(X.T,dims)
searched_valuesID = np.ravel_multi_index(searched_values.T,dims)
sidx = X1D.argsort()
out = sidx[np.searchsorted(X1D,searched_valuesID,sorter=sidx)]

Please note that this np.searchsorted method assumes there is a match for each row from searched_values in X.
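When a match is not guaranteed, np.searchsorted only returns an insertion position, so the result needs to be verified. The following is a minimal sketch of one way to do that, not part of the original approach: the -1 sentinel for "not found", the clipping step, and the choice to size dims from both arrays are all assumptions made for illustration.

```python
import numpy as np

X = np.array([[4, 2], [9, 3], [8, 5], [3, 3], [5, 6]])
# The last searched row does not occur in X.
searched_values = np.array([[4, 2], [3, 3], [5, 6], [9, 0]])

# Assumption: size the grid from both arrays so every row can be encoded.
dims = np.maximum(X.max(0), searched_values.max(0)) + 1
X1D = np.ravel_multi_index(X.T, dims)
searched1D = np.ravel_multi_index(searched_values.T, dims)

sidx = X1D.argsort()
pos = np.searchsorted(X1D, searched1D, sorter=sidx)
# An insertion position can equal len(X1D); clip so it stays a valid index.
pos = np.clip(pos, 0, len(X1D) - 1)
candidates = sidx[pos]

# Keep a candidate only if the row found there really equals the searched row.
found = X1D[candidates] == searched1D
out = np.where(found, candidates, -1)
print(out.tolist())  # [0, 3, 4, -1]
```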

How does `np.ravel_multi_index` work?

This function gives us linear index equivalents. It accepts a 2D array of n-dimensional indices, arranged as columns, along with the shape of the n-dimensional grid onto which those indices are mapped, and returns the equivalent linear indices.

Let's use the inputs from the problem at hand. Take input X and note its first row. Since we are trying to convert each row of X into its linear index equivalent, and since np.ravel_multi_index treats each column as one indexing tuple, we need to transpose X before feeding it into the function. Since each row of X has 2 elements in this case, the n-dimensional grid to be mapped onto is 2D. With 3 elements per row in X, it would have been a 3D grid, and so on.

To see how this function would compute linear indices, consider the first row of X -

In [77]: X
Out[77]:
array([[4, 2],
       [9, 3],
       [8, 5],
       [3, 3],
       [5, 6]])

We have the shape of the n-dimensional grid as dims -

In [78]: dims
Out[78]: array([10,  7])

Let's create the 2-dimensional grid to see how that mapping works and linear indices get computed with np.ravel_multi_index -

In [79]: out = np.zeros(dims,dtype=int)

In [80]: out
Out[80]:
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

Let's set the first indexing tuple from X, i.e. the first row from X into the grid -

In [81]: out[4,2] = 1

In [82]: out
Out[82]:
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0]])

Now, to see the linear index equivalent of the element just set, let's flatten and use np.where to detect that 1.

In [83]: np.where(out.ravel())[0]
Out[83]: array([30])

This value can also be computed directly from row-major ordering: the element at row 4, column 2 of a grid with 7 columns sits at linear index 4*7 + 2 = 30.

Let's use np.ravel_multi_index and verify those linear indices -

In [84]: np.ravel_multi_index(X.T,dims)
Out[84]: array([30, 66, 61, 24, 41])

Thus, we would have linear indices corresponding to each indexing tuple from X, i.e. each row from X.
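The row-major arithmetic behind these numbers can be reproduced by hand. The sketch below computes the per-axis strides (in elements) for the grid shape and compares the result with np.ravel_multi_index; the names `strides` and `manual` are illustrative, and the stride formula assumes the default C (row-major) order.

```python
import numpy as np

X = np.array([[4, 2], [9, 3], [8, 5], [3, 3], [5, 6]])
dims = X.max(0) + 1  # array([10, 7])

# Row-major strides in elements: the last axis has stride 1 and each earlier
# axis has a stride equal to the product of all later dimensions.
strides = np.append(np.cumprod(dims[::-1])[:-1][::-1], 1)  # array([7, 1])

# Linear index of a row = sum over axes of (coordinate * stride).
manual = X @ strides
reference = np.ravel_multi_index(X.T, dims)

print(manual.tolist())     # [30, 66, 61, 24, 41]
print(reference.tolist())  # [30, 66, 61, 24, 41]
```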

Choosing dimensions for np.ravel_multi_index to form unique linear indices

Now, the idea behind treating each row of X as an indexing tuple of an n-dimensional grid and converting each such tuple to a scalar is to get unique scalars for unique tuples, i.e. unique rows in X.

Let's take another look at X -

In [77]: X
Out[77]:
array([[4, 2],
       [9, 3],
       [8, 5],
       [3, 3],
       [5, 6]])

Now, as discussed in the previous section, we treat each row as an indexing tuple. Within each tuple, the first element indexes the first axis of the n-dim grid, the second element indexes the second axis, and so on through the last element of each row in X. In essence, each column of X corresponds to one dimension or axis of the grid. To map all rows of X onto the same n-dim grid, each axis must be long enough to hold the largest value in its column. Assuming X contains only non-negative numbers, that length is the maximum of each column in X, plus 1. The + 1 is there because indexing is 0-based. So, for example, X[1,0] == 9 maps to the 10th row of the proposed grid. Similarly, X[4,1] == 6 goes to the 7th column of that grid.

So, for our sample case, we had -

In [7]: dims = X.max(axis=0) + 1 # Or simply X.max(0) + 1

In [8]: dims
Out[8]: array([10,  7])

Thus, we need a grid of shape at least (10,7) for our sample case. Larger lengths along any dimension won't hurt and would still give us unique linear indices.
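One practical consequence of that freedom: if searched_values can contain numbers larger than anything in X, a grid sized from X alone is too small and np.ravel_multi_index raises a ValueError in its default mode. The sketch below shows the failure and one possible remedy, sizing dims from both arrays; this remedy is a suggestion for illustration, not something the approaches above do.

```python
import numpy as np

X = np.array([[4, 2], [9, 3], [8, 5], [3, 3], [5, 6]])
# 12 exceeds the maximum of X's first column (9).
searched_values = np.array([[4, 2], [3, 3], [12, 1]])

dims_from_X = X.max(0) + 1  # array([10, 7]), too small for the row [12, 1]
try:
    np.ravel_multi_index(searched_values.T, dims_from_X)
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Size the grid from both arrays so every row of either one fits.
dims = np.maximum(X.max(0), searched_values.max(0)) + 1  # array([13, 7])
X1D = np.ravel_multi_index(X.T, dims)
searched1D = np.ravel_multi_index(searched_values.T, dims)
out = np.where(np.isin(X1D, searched1D))[0]
print(out.tolist())  # [0, 3]
```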

Concluding remarks: One important thing to note is that if X contains negative numbers, we need to add suitable offsets along each column of X so that all entries in the indexing tuples are non-negative before using np.ravel_multi_index.
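A minimal sketch of that offsetting, assuming the same shift is applied to both X and searched_values so that equal rows stay equal. The sample data with negative entries and the names `offset`, `X_shifted` and `searched_shifted` are made up for illustration.

```python
import numpy as np

# Hypothetical data with negative entries in both columns.
X = np.array([[-4, 2], [9, -3], [8, 5], [3, 3], [-5, 6]])
searched_values = np.array([[-4, 2], [3, 3], [-5, 6]])

# Per-column minimum over both arrays; subtracting it makes every entry >= 0.
offset = np.minimum(X.min(0), searched_values.min(0))  # array([-5, -3])
X_shifted = X - offset
searched_shifted = searched_values - offset

dims = np.maximum(X_shifted.max(0), searched_shifted.max(0)) + 1  # array([15, 10])
X1D = np.ravel_multi_index(X_shifted.T, dims)
searched1D = np.ravel_multi_index(searched_shifted.T, dims)

out = np.where(np.isin(X1D, searched1D))[0]
print(out.tolist())  # [0, 3, 4]
```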

answered Sep 22 '22 19:09

Divakar
