Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Numpy: Replacing values in a 2D array efficiently using a dictionary as a map

I have a 2D Numpy array of integers like so:

a = np.array([[  3,   0,   2,  -1],
              [  1, 255,   1,   2],
              [  0,   3,   2,   2]])

and I have a dictionary with integer keys and values that I would like to use to replace the values of a with new values. The dict might look like this:

d = {0: 1, 1: 2, 2: 3, 3: 4, -1: 0, 255: 0}

I want to replace the values of a that match a key in d with the corresponding value in d. In other words, d defines a map between old (current) and new (desired) values in a. The outcome for the toy example above would be this:

a_new = np.array([[  4,   1,   3,   0],
                  [  2,   0,   2,   3],
                  [  1,   4,   3,   3]])

What would be an efficient way to implement this?

This is a toy example, but in practice the array will be large, its shape will be e.g. (1024, 2048), and the dictionary will have on the order of dozens of elements (34 in my case), and while the keys are integers, they are not necessarily all consecutive and they can be negative (like in the example above).

I need to perform this replacement on hundreds of thousands of such arrays, so it needs to be fast. However, the dictionary is known in advance and remains constant, so asymptotically, any time used to modify the dictionary or transform it into a more appropriate data structure doesn't matter.

I'm currently looping over the array entries in two nested for loops (over the rows and columns of a), but there has got to be a better way.

If the map didn't contain negative values (e.g. -1 like in the example), I would just create a list or an array from the dictionary once where the keys are the array indices and then use that for an efficient Numpy fancy indexing routine. But since there are negative values, too, this won't work.

like image 949
Alex Avatar asked Oct 21 '17 22:10

Alex


People also ask

Is NumPy array faster than dictionary?

Also as expected, the Numpy array performed faster than the dictionary.

Does map work on NumPy array?

Method 1: numpy.vectorize() function maps functions on data structures that contain a sequence of objects like NumPy arrays. The nested sequence of objects or NumPy arrays as inputs and returns a single NumPy array or a tuple of NumPy arrays.

Are NumPy arrays memory efficient?

1. NumPy uses much less memory to store data. The NumPy arrays takes significantly less amount of memory as compared to python lists. It also provides a mechanism of specifying the data types of the contents, which allows further optimisation of the code.

Can a NumPy array be a dictionary key?

You may use the data in numpy array to create a hash which could be used as a key for dictionary.


1 Answers

Here's one way, provided you have a small dictionary/min and max values, this may be more efficient, you work around the negative index by adding the array min:

In [11]: indexer = np.array([d.get(i, -1) for i in range(a.min(), a.max() + 1)])

In [12]: indexer[(a - a.min())]
Out[12]:
array([[4, 1, 3, 0],
       [2, 0, 2, 3],
       [1, 4, 3, 3]])

Note: This moves the for loop to the lookup table, but if this is significantly smaller than the actual array this could be a lot faster.

like image 63
Andy Hayden Avatar answered Sep 27 '22 18:09

Andy Hayden