Numpy: Replacing values in a 2D array efficiently using a dictionary as a map

I have a 2D Numpy array of integers like so:

import numpy as np

a = np.array([[  3,   0,   2,  -1],
              [  1, 255,   1,   2],
              [  0,   3,   2,   2]])

and I have a dictionary with integer keys and values that I would like to use to replace the values of a with new values. The dict might look like this:

d = {0: 1, 1: 2, 2: 3, 3: 4, -1: 0, 255: 0}

I want to replace the values of a that match a key in d with the corresponding value in d. In other words, d defines a map between old (current) and new (desired) values in a. The outcome for the toy example above would be this:

a_new = np.array([[  4,   1,   3,   0],
                  [  2,   0,   2,   3],
                  [  1,   4,   3,   3]])

What would be an efficient way to implement this?

This is a toy example. In practice the array will be large, with a shape of e.g. (1024, 2048), and the dictionary will have on the order of dozens of elements (34 in my case). While the keys are integers, they are not necessarily consecutive and they can be negative (like in the example above).

I need to perform this replacement on hundreds of thousands of such arrays, so it needs to be fast. However, the dictionary is known in advance and remains constant, so asymptotically, any time used to modify the dictionary or transform it into a more appropriate data structure doesn't matter.
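Since the dictionary is fixed, one possible "more appropriate data structure" is a pair of parallel arrays: the sorted keys and their values. The following is only a sketch under that assumption, not a benchmarked solution. The names keys, vals and remap_sorted are made up for illustration, and entries of the array that are not keys of d are left unchanged.

```python
import numpy as np

d = {0: 1, 1: 2, 2: 3, 3: 4, -1: 0, 255: 0}

# One-time preprocessing: sorted keys and their values as parallel arrays.
sorted_keys = sorted(d)
keys = np.array(sorted_keys)
vals = np.array([d[k] for k in sorted_keys])

def remap_sorted(a):
    """Map entries of a through d; entries that are not keys stay as they are."""
    # Position at which each entry would be inserted into the sorted keys.
    idx = np.searchsorted(keys, a)
    # Entries larger than every key get idx == len(keys); clip to a valid index.
    idx = np.clip(idx, 0, len(keys) - 1)
    # An entry is a key only if the key at that position equals it exactly.
    hit = keys[idx] == a
    return np.where(hit, vals[idx], a)

a = np.array([[3, 0, 2, -1],
              [1, 255, 1, 2],
              [0, 3, 2, 2]])
a_new = remap_sorted(a)
```

This works for sparse or negative keys because only the number of keys matters, not their range. The cost is a binary search per array element.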

I'm currently looping over the array entries in two nested for loops (over the rows and columns of a), but there has got to be a better way.

If the map didn't contain negative keys (e.g. -1 like in the example), I would just create a list or an array from the dictionary once, with the keys as the array indices, and then use that for efficient NumPy fancy indexing. But since there are negative keys too, this won't work.
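For reference, the non-negative case described above could be sketched as follows. The dictionary d_nonneg and the array a_nonneg are hypothetical examples with the -1 key dropped, and the sketch assumes every array entry lies between 0 and the largest key.

```python
import numpy as np

# Hypothetical map with non-negative keys only.
d_nonneg = {0: 1, 1: 2, 2: 3, 3: 4, 255: 0}

# Build the table once. Starting from the identity keeps unmapped values unchanged.
lut = np.arange(max(d_nonneg) + 1)
for k, v in d_nonneg.items():
    lut[k] = v

a_nonneg = np.array([[3, 0, 2, 1],
                     [1, 255, 1, 2]])

# Fancy indexing: each entry of a_nonneg is used as an index into lut.
result = lut[a_nonneg]
```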

asked Oct 21 '17 22:10

Alex

1 Answer

Here's one way. Provided the dictionary is small and the gap between the array's min and max values is small, this may be more efficient. You work around the negative index by subtracting the array min:

In [11]: indexer = np.array([d.get(i, -1) for i in range(a.min(), a.max() + 1)])

In [12]: indexer[(a - a.min())]
Out[12]:
array([[4, 1, 3, 0],
       [2, 0, 2, 3],
       [1, 4, 3, 3]])

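Note: This moves the Python-level for loop to the construction of the lookup table, so if the table is significantly smaller than the actual array this could be a lot faster. Values in that range that are not keys of d are mapped to -1 by the d.get(i, -1) default; use d.get(i, i) instead to leave them unchanged.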
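Because the dictionary is constant across all arrays, a variation on the above is to build the table once from the key range of d instead of from each array's min and max. This is a sketch under the assumption that every array entry lies between min(d) and max(d); the names offset, lut and remap are illustrative only.

```python
import numpy as np

d = {0: 1, 1: 2, 2: 3, 3: 4, -1: 0, 255: 0}

# Built once: the table spans the key range of d, not the range of one array.
offset = min(d)
lut = np.arange(offset, max(d) + 1)  # identity, so unmapped values stay unchanged
for k, v in d.items():
    lut[k - offset] = v

def remap(a):
    """Shift entries so the smallest key maps to index 0, then look them up."""
    # Cast to a signed index type first so the shift cannot wrap for unsigned input.
    # Assumption: every entry of a lies in [min(d), max(d)].
    return lut[np.asarray(a, dtype=np.intp) - offset]

a = np.array([[3, 0, 2, -1],
              [1, 255, 1, 2],
              [0, 3, 2, 2]])
a_new = remap(a)
```

With this layout there is no per-array call to a.min() or a.max() and no Python loop per array, only one subtraction and one fancy-indexing pass.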

answered Sep 27 '22 18:09

Andy Hayden

Tags: python, arrays, dictionary, numpy
