Consider two numpy arrays
a = np.array(['john', 'bill', 'greg', 'bill', 'bill', 'greg', 'bill'])
b = np.array(['john', 'bill', 'greg'])
How would I produce a third array
c = np.array([0, 1, 2, 1, 1, 2, 1])
of the same length as a, where each entry is the index in b of the corresponding entry of a?
I can see a way by looping over the elements of b as b[i] and checking np.where(a == b[i]), but I was wondering if numpy could accomplish this in a quicker, cleaner way with fewer lines of code.
Here is one option:
import numpy as np
a = np.array(['john', 'bill', 'greg', 'bill', 'bill', 'greg', 'bill'])
b = np.array(['john', 'bill', 'greg'])
my_dict = dict(zip(b, range(len(b))))
result = np.vectorize(my_dict.get)(a)
Result:
>>> result
array([0, 1, 2, 1, 1, 2, 1])
Sorting is a good option for vectorization with numpy:
>>> s = np.argsort(b)
>>> s[np.searchsorted(b, a, sorter=s)]
array([0, 1, 2, 1, 1, 2, 1], dtype=int64)
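The sorter argument matters here because np.searchsorted requires its first array to be sorted, and b in this example is not. A commented sketch of what each step does:

```python
import numpy as np

a = np.array(['john', 'bill', 'greg', 'bill', 'bill', 'greg', 'bill'])
b = np.array(['john', 'bill', 'greg'])  # note: b is not in sorted order

s = np.argsort(b)  # s = [1, 2, 0]: the permutation that sorts b
# searchsorted finds where each element of a lands in the sorted view of b;
# indexing with s translates those positions back to indices into the original b
c = s[np.searchsorted(b, a, sorter=s)]
print(c.tolist())  # [0, 1, 2, 1, 1, 2, 1]
```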
If your array a has m elements and b has n, the sorting is going to be O(n log n) and the searching O(m log n), which is not bad. Dictionary-based solutions should be amortized linear, but if the arrays are not huge the Python looping may make them slower than this. And broadcasting-based solutions have quadratic complexity; they will only be faster for very small arrays.
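The broadcasting-based approach mentioned above compares every element of a against every element of b, which is where the O(m·n) cost comes from. One common form of it (an argmax variant, shown here as a sketch rather than the timed one-liner below) is:

```python
import numpy as np

a = np.array(['john', 'bill', 'greg', 'bill', 'bill', 'greg', 'bill'])
b = np.array(['john', 'bill', 'greg'])

# a[:, None] == b builds an (m, n) boolean matrix: row i marks which entry
# of b matches a[i].  argmax along axis 1 returns the column of the first
# True in each row, i.e. the index into b -- O(m*n) work and memory.
c = (a[:, None] == b).argmax(axis=1)
print(c.tolist())  # [0, 1, 2, 1, 1, 2, 1]
```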
Some timings with your sample:
In [3]: %%timeit
...: s = np.argsort(b)
...: np.take(s, np.searchsorted(b, a, sorter=s))
...:
100000 loops, best of 3: 4.16 µs per loop
In [5]: %%timeit
...: my_dict = dict(zip(b, range(len(b))))
...: np.vectorize(my_dict.get)(a)
...:
10000 loops, best of 3: 29.9 µs per loop
In [7]: %timeit (np.arange(b.size)*(a==b[:,np.newaxis]).T).sum(axis=-1)
100000 loops, best of 3: 18.5 µs per loop