I have a 2D array of arrays defined as follows:
traces = [['x1',11026,0,0,0,0],
['x0',11087,0,0,0,1],
['x0',11088,0,0,1,3],
['x0',11088,0,0,0,3],
['x0',11088,0,1,0,1]]
I want to find the index of the row that satisfies several conditions on selected columns. For example, I want to find the row in this array where
row[0] == 'x0' and row[1] == 11088 and row[3] == 1 and row[5] == 1
Searching on these criteria should return 4.
I attempted to use numpy.where but can't seem to make it work with multiple conditions:
print np.where((traces[:,0] == 'x0') & (traces[:,1] == 11088) & (traces[:,3] == 1) & (traces[:,5] == 1))
The above creates the warning
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  print np.where((traces[:,0] == 'x0') & (traces[:,1] == 11088) & (traces[:,3] == 1) & (traces[:,5] == 1))
(array([], dtype=int32),)
I've attempted to use numpy.logical_and as well and that doesn't seem to work either, creating similar warnings.
Any way I can do this using numpy.where without iterating over the whole 2D array?
Thanks
I assume you converted the list to a NumPy array, something like this:
import numpy as np

traces = [['x1',11026,0,0,0,0],
['x0',11087,0,0,0,1],
['x0',11088,0,0,1,3],
['x0',11088,0,0,0,3],
['x0',11088,0,1,0,1]]
traces = np.array(traces)
This exhibits the described error. The reason can be seen by printing the resulting array:
print(repr(traces))
# array([['x1', '11026', '0', '0', '0', '0'],
# ['x0', '11087', '0', '0', '0', '1'],
# ['x0', '11088', '0', '0', '1', '3'],
# ['x0', '11088', '0', '0', '0', '3'],
# ['x0', '11088', '0', '1', '0', '1']],
# dtype='<U5')
Numbers were converted to strings!
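When you build an array from values of different types, numpy looks for a single dtype that can hold all of them, and falls back to dtype=object only if you request it or no common type exists. Object arrays work in most cases but are slow, because each element is an ordinary Python object.
In this case numpy settled on a string type, which is more specific than object but general enough to hold the numbers, as strings. Comparing a string column with the integer 11088 therefore cannot match elementwise, which is what the warning reports.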
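To see the conversion on a small scale, here is a minimal sketch. The names mixed and as_obj are made up for illustration, and the exact string width numpy picks may differ between versions, so only the dtype kind is checked:

```python
import numpy as np

# Two rows mixing a string label with an integer, as in the question.
rows = [['x0', 11088], ['x1', 11026]]

mixed = np.array(rows)                 # numpy infers a unicode string dtype
as_obj = np.array(rows, dtype=object)  # elements keep their Python types

print(mixed.dtype.kind)    # 'U' means unicode string
print(as_obj.dtype.kind)   # 'O' means Python object

# In the string array the number is now text, so only a string comparison matches.
print(mixed[:, 1] == '11088')
# In the object array the number is still an int.
print(as_obj[:, 1] == 11088)
```

In the string array the second column only matches '11088' as text, while the object array still holds the integer 11088.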
As a solution, construct the array explicitly as an "object array", starting from the original list rather than from the already converted string array:
traces = np.array(traces, dtype='object')
print(np.where((traces[:,0] == 'x0') & (traces[:,1] == 11088) & (traces[:,3] == 1) & (traces[:,5] == 1)))
# (array([4], dtype=int32),)
Note that although this works, object arrays are often not a good idea. Consider replacing the strings in the first column with numeric values instead.
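If you would rather not recode the labels, another way to avoid an object array is to keep the string column and the numeric columns in two separate arrays. This is only a sketch of that idea: labels, values, mask and matches are names I made up, and the numeric column indices are shifted by one because the first column was split off.

```python
import numpy as np

traces = [['x1', 11026, 0, 0, 0, 0],
          ['x0', 11087, 0, 0, 0, 1],
          ['x0', 11088, 0, 0, 1, 3],
          ['x0', 11088, 0, 0, 0, 3],
          ['x0', 11088, 0, 1, 0, 1]]

# Split the data: one string array for column 0, one integer array for the rest.
labels = np.array([row[0] for row in traces])
values = np.array([row[1:] for row in traces])

# Original columns 1, 3 and 5 are columns 0, 2 and 4 of values.
mask = ((labels == 'x0') & (values[:, 0] == 11088)
        & (values[:, 2] == 1) & (values[:, 4] == 1))

matches = np.flatnonzero(mask)  # indices of the rows where every condition holds
print(matches)  # [4]
```

Both arrays have a proper dtype here, so the comparisons run elementwise without warnings.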