
Bug or meant to be: numpy raises "ValueError: too many boolean indices" for repeated boolean indices

I am doing some simulations in experimental cosmology, and encountered this problem while working with numpy arrays. I'm new to numpy, so I am not sure if I'm doing this wrong or if it's a bug. I run:

Enthought Python Distribution -- www.enthought.com
Version: 7.3-1 (32-bit)

Python 2.7.3 |EPD 7.3-1 (32-bit)| (default, Apr 12 2012, 11:28:34) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "credits", "demo" or "enthought" for more information.
>>> import numpy as np
>>> t = np.arange(10)
>>> t[t < 8][t < 5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many boolean indices
>>> 

I expected it to return:

array([0, 1, 2, 3, 4])

since t[t < 8] should presumably be treated as just another ndarray?

The numpy documentation (http://docs.scipy.org/doc/numpy/user/basics.indexing.html) says about boolean arrays as indices:

As with index arrays, what is returned is a copy of the data, not a view as one gets with slices.

Running type(t[t < 8]) also gives ndarray, which I guess should have all the properties of a numpy array. Would it be better to do this with list comprehensions? I have not done a timed comparison yet, but I would imagine this to be a problem for large 2D arrays.

asked Apr 17 '13 22:04 by plan


2 Answers

t[t < 8] does indeed give you an array; however, it doesn't give you an array of the same size that you started with. t < 8 gives a boolean array with the same shape as t. When you use that to index t, you pull out only the elements where the boolean array is True, leaving you with a shorter array. When you do that again:

result = t[t<8]
result[t<5]

Then the boolean index array t<5 once again has the same shape as t (10 elements), but you're using it to index the smaller array result (8 elements), which is what causes the error.
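To make the mismatch concrete, here is a small sketch that prints the shapes involved. The names mask8 and mask5 are just illustrative. Note that the exact exception depends on the NumPy version: the question shows a ValueError, while more recent releases, as far as I know, raise an IndexError for a mismatched boolean index, so the sketch catches both.

```python
import numpy as np

t = np.arange(10)
mask8 = t < 8          # boolean array, same shape as t
mask5 = t < 5          # boolean array, same shape as t
result = t[mask8]      # only 8 elements survive

print(t.shape, mask8.shape, result.shape)  # (10,) (10,) (8,)

error = None
try:
    result[mask5]      # 10-element mask applied to an 8-element array
except (IndexError, ValueError) as exc:
    # The exception type and message vary between NumPy versions.
    error = exc
    print(type(exc).__name__, exc)
```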

The documentation is completely correct. Your new array isn't a view into the original ... It is a copy of the data, but that doesn't imply that the new array is the same shape or size as the original.
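If it helps, the copy-versus-view distinction can be checked directly. This is only a quick sketch; picked and sliced are names I made up, and np.shares_memory assumes a reasonably recent NumPy.

```python
import numpy as np

t = np.arange(10)
picked = t[t < 8]      # boolean indexing returns a new array (a copy)
sliced = t[:8]         # slicing returns a view onto t

picked[0] = 99         # does not touch t
sliced[1] = -1         # writes through to t

print(t[0], t[1])                     # 0 -1
print(np.shares_memory(t, picked))    # False
print(np.shares_memory(t, sliced))    # True
```

Either way, the result has 8 elements, not 10, which is the part that matters for the error.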

answered Nov 14 '22 20:11 by mgilson


This is meant to be. The reference to 't' no longer matches the array being indexed by the time you reach the second boolean statement. In the first statement you're filtering t by values less than 8. In the second you're still building the mask from 't', but applying it to a temporary array (call it 's'). A mask built from 't' can't in general be mapped onto 's' correctly. Thus it throws an exception.

If you want to apply multiple boolean conditions, build each mask from the array you are actually indexing, so that it reads:

s = t[t < 8]
s[s < 5]

Or alternatively, from @mgilson, combine the conditions into a single mask on t:

t[np.logical_and(t < 8, t < 5)]
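For comparison, here is a short sketch of the variants side by side, including the & operator as a shorthand for np.logical_and on boolean arrays. The variable names are only for illustration.

```python
import numpy as np

t = np.arange(10)

# Combine both conditions into one mask with the same shape as t.
combined = t[np.logical_and(t < 8, t < 5)]

# Same thing with the & operator; the parentheses are required
# because & binds more tightly than the comparison operators.
with_operator = t[(t < 8) & (t < 5)]

# Two steps, building the second mask from the intermediate array.
s = t[t < 8]
chained = s[s < 5]

print(combined)       # [0 1 2 3 4]
print(with_operator)  # [0 1 2 3 4]
print(chained)        # [0 1 2 3 4]
```

The single-mask forms avoid materialising the intermediate array s, which may matter for large arrays, though I have not timed it.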

answered Nov 14 '22 21:11 by Pyrce