
Indexing with Masked Arrays in numpy

I have a bit of code that looks up the contents of one array at the indices given by another array, and some of those indices may be out of range for the first array.

input = np.arange(0, 5)
indices = np.array([0, 1, 2, 99])

What I want to do is this: print input[indices] and get [0 1 2]

But this yields an exception (as expected):

IndexError: index 99 out of bounds 0<=index<5

So I thought I could use masked arrays to hide the out of bounds indices:

indices = np.ma.masked_greater_equal(indices, 5)

But still:

>print input[indices]
IndexError: index 99 out of bounds 0<=index<5

Even though:

>np.max(indices)
2

So I have to fill the masked array first, which is annoying, because I don't know of any fill value that would skip the out-of-range entries instead of selecting something:

print input[np.ma.filled(indices, 0)]

[0 1 2 0]

So my question is: how can you use numpy efficiently to select indices safely from an array without overstepping the bounds of the input array?

Widjet asked Oct 04 '10 11:10



2 Answers

Without using masked arrays, you could remove the indices greater than or equal to 5 like this:

print input[indices[indices<5]]

Edit: note that if you also wanted to discard negative indices, you could write:

print input[indices[(0 <= indices) & (indices < 5)]]
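
A minimal, self-contained sketch of the boolean-mask approach above, written for Python 3 (so `print` is a function). The name `data` is an assumption used in place of `input`, which shadows a Python builtin, and the bound is taken from `len(data)` rather than hard-coding 5:

```python
import numpy as np

data = np.arange(0, 5)                 # stands in for `input` from the question
indices = np.array([0, 1, 2, 99, -1])  # 99 is out of range; -1 is negative

# Keep only the indices that fall inside [0, len(data)).
in_bounds = (indices >= 0) & (indices < len(data))
safe = data[indices[in_bounds]]

print(safe)  # [0 1 2]
```

NumPy treats negative indices as counting from the end, so dropping them is a choice rather than a requirement; keep only the `indices < len(data)` test if wrap-around indexing is what you want.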

François answered Sep 28 '22 06:09


It is a VERY BAD idea to index with masked arrays. There was a (very short) time when using MaskedArrays for indexing would have thrown an exception, but that was a bit too harsh...

In your test, you're filtering indices to find the entries matching a condition. What should you do with the missing entries of your MaskedArray? Is the condition False? True? Should you use a default? It's up to you, the user, to decide what to do.

Using indices.filled(0) means that when an item of indices is masked (as in, undefined), you want to take the first index (0) as default. Probably not what you wanted.
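
To make the point about fill values concrete, here is a small sketch (Python 3; `data` is an assumed name standing in for `input`). It shows that masking does not remove the out-of-range value from the underlying data, and that `filled(0)` turns the masked entry into a valid but probably unwanted index:

```python
import numpy as np

data = np.arange(0, 5)  # stands in for `input` from the question
indices = np.ma.masked_greater_equal(np.array([0, 1, 2, 99]), 5)

print(indices.mask)  # [False False False  True]
print(indices.data)  # [ 0  1  2 99]  (the masked value is still stored)

# Filling with 0 silently selects element 0 for the masked entry.
picked = data[indices.filled(0)]
print(picked)        # [0 1 2 0]
```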

Here, I would have simply used input[indices.compressed()]: the compressed method flattens your MaskedArray, keeping only the unmasked entries.
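
As a rough illustration of the `compressed()` route (Python 3; `data` is an assumed name replacing `input`, and the threshold uses `len(data)` instead of a literal 5):

```python
import numpy as np

data = np.arange(0, 5)  # stands in for `input` from the question
indices = np.ma.masked_greater_equal(np.array([0, 1, 2, 99]), len(data))

# compressed() returns a plain 1-D array holding only the unmasked entries.
valid = indices.compressed()

print(valid)        # [0 1 2]
print(data[valid])  # [0 1 2]
```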

But as you realized, you probably didn't need MaskedArrays in the first place.

Pierre GM answered Sep 28 '22 06:09