Given three numpy arrays a, b, and c (EDIT: of the same shape/size), it seems that for non-complex numbers
a * b * c != 0 # test element-wise whether all are non-zero
gives the same result as:
np.logical_and(a, np.logical_and(b, c))
Is there a hidden pitfall in the first version? Is there even a simpler way to test this?
Given b and c holding real numbers, np.logical_and(b, c) would essentially involve an under-the-hood conversion to booleans.
Can we do the conversion upfront? If so, would that help?
Now, checking that ALL corresponding elements are non-zero is equivalent to negating the check that ANY of the corresponding elements is zero, i.e.
~((a == 0) + (b==0) + (c==0))
OR
~((a == 0) | (b==0) | (c==0))
Also, the comparison with zero produces boolean arrays upfront, so that might help with performance. Here are the runtime numbers involved -
Case #1:
In [10]: # Setup inputs
...: M, N = 100, 100
...: a = np.random.randint(0,5,(M,N))
...: b = np.random.randint(0,5,(M,N))
...: c = np.random.randint(0,5,(M,N))
...:
In [11]: %timeit np.logical_and(a, np.logical_and(b, c))
...: %timeit a * b * c != 0
...: %timeit ~((a == 0) + (b==0) + (c==0))
...: %timeit ~((a == 0) | (b==0) | (c==0))
...:
10000 loops, best of 3: 96.6 µs per loop
10000 loops, best of 3: 78.2 µs per loop
10000 loops, best of 3: 51.6 µs per loop
10000 loops, best of 3: 51.5 µs per loop
Case #2:
In [12]: # Setup inputs
...: M, N = 1000, 1000
...: a = np.random.randint(0,5,(M,N))
...: b = np.random.randint(0,5,(M,N))
...: c = np.random.randint(0,5,(M,N))
...:
In [13]: %timeit np.logical_and(a, np.logical_and(b, c))
...: %timeit a * b * c != 0
...: %timeit ~((a == 0) + (b==0) + (c==0))
...: %timeit ~((a == 0) | (b==0) | (c==0))
...:
100 loops, best of 3: 11.4 ms per loop
10 loops, best of 3: 24.1 ms per loop
100 loops, best of 3: 9.29 ms per loop
100 loops, best of 3: 9.2 ms per loop
Case #3:
In [14]: # Setup inputs
...: M, N = 5000, 5000
...: a = np.random.randint(0,5,(M,N))
...: b = np.random.randint(0,5,(M,N))
...: c = np.random.randint(0,5,(M,N))
...:
In [15]: %timeit np.logical_and(a, np.logical_and(b, c))
...: %timeit a * b * c != 0
...: %timeit ~((a == 0) + (b==0) + (c==0))
...: %timeit ~((a == 0) | (b==0) | (c==0))
...:
1 loops, best of 3: 294 ms per loop
1 loops, best of 3: 694 ms per loop
1 loops, best of 3: 268 ms per loop
1 loops, best of 3: 268 ms per loop
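Seems like the comparison-to-zero approach is the fastest in all three cases: it takes roughly 9% to 47% less time than np.logical_and depending on the array size, and well under half the time of the multiplication for the two larger cases!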
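As a sanity check on the equivalence argued above, here is a minimal sketch that compares the expressions on random integer inputs. The seed and shape are arbitrary choices, and the last two variants (`&` on `!= 0` comparisons, and `np.logical_and.reduce`) are not from the answer itself; they are just other spellings of the same test that might count as "simpler".

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
shape = (100, 100)
a = rng.integers(0, 5, shape)
b = rng.integers(0, 5, shape)
c = rng.integers(0, 5, shape)

# Baseline from the question.
reference = np.logical_and(a, np.logical_and(b, c))

# De Morgan: "all non-zero" is the negation of "any is zero".
via_or = ~((a == 0) | (b == 0) | (c == 0))
# On boolean arrays, + acts as a logical OR.
via_add = ~((a == 0) + (b == 0) + (c == 0))
# The same test without the negation.
via_and = (a != 0) & (b != 0) & (c != 0)
# A ufunc reduction over the stacked inputs works for any number of arrays.
via_reduce = np.logical_and.reduce([a, b, c])

all_agree = all(
    np.array_equal(reference, other)
    for other in (via_or, via_add, via_and, via_reduce)
)
print(all_agree)
```

The reduction form is convenient when the number of arrays is not fixed, although it first stacks the inputs into one array, so it is not necessarily the fastest option.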
Some observations:
import numpy as np
import timeit
a = np.random.randint(0, 5, 100000)
b = np.random.randint(0, 5, 100000)
c = np.random.randint(0, 5, 100000)
method_one = np.logical_and(np.logical_and(a, b), c)
%timeit np.logical_and(np.logical_and(a, b), c)
method_two = a*b*c != 0
%timeit a*b*c != 0
method_three = np.logical_and(np.logical_and(a.astype('bool'), b.astype('bool')), c.astype('bool'))
%timeit np.logical_and(np.logical_and(a.astype('bool'), b.astype('bool')), c.astype('bool'))
method_four = a.astype('bool') * b.astype('bool') * c.astype('bool') != 0
%timeit a.astype('bool') * b.astype('bool') * c.astype('bool') != 0
# verify all methods give equivalent results
all([
    np.all(method_one == method_two),
    np.all(method_one == method_three),
    np.all(method_one == method_four),
])
1000 loops, best of 3: 713 µs per loop
1000 loops, best of 3: 341 µs per loop
1000 loops, best of 3: 252 µs per loop
1000 loops, best of 3: 388 µs per loop
True
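The `%timeit` lines above only work inside IPython. For a plain Python script, a rough equivalent can be sketched with the standard-library `timeit` module. The `best_time` helper and the `candidates` dictionary are names made up for this sketch, and the repeat counts are arbitrary; absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.random.randint(0, 5, 100000)
b = np.random.randint(0, 5, 100000)
c = np.random.randint(0, 5, 100000)

# The four methods from the session above, wrapped as zero-argument callables.
candidates = {
    "logical_and": lambda: np.logical_and(np.logical_and(a, b), c),
    "product": lambda: a * b * c != 0,
    "astype + logical_and": lambda: np.logical_and(
        np.logical_and(a.astype("bool"), b.astype("bool")), c.astype("bool")
    ),
    "astype + product": lambda: a.astype("bool") * b.astype("bool") * c.astype("bool") != 0,
}


def best_time(func, repeat=3, number=100):
    """Return the best per-call time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


timings = {name: best_time(func) for name, func in candidates.items()}
for name, seconds in timings.items():
    print(f"{name:>22}: {seconds * 1e6:8.1f} µs per loop")
```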
Some interpretations:
The speed of the a*b*c != 0 method will depend on the dtype of the vectors, since multiplication is done first. So if you've got floats or bigints or some other larger dtype, this step will take longer than for vectors of the same length that are boolean or small integers. Coercing to a bool dtype speeds up this method. If the vectors have different dtypes, things will be even slower. Multiplying an integer array by a float array requires converting integers to floats, then coercing to boolean. Not optimal.
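Because the multiplication happens in the arrays' own dtype, it can affect correctness as well as speed. The sketch below uses hand-picked values (not from the original benchmarks) to show three cases where `a * b * c != 0` and `np.logical_and` disagree: integer overflow that wraps to zero, floating-point underflow to zero, and a NaN that masks a zero.

```python
import numpy as np

# Integer overflow: 16 * 16 = 256 does not fit in int8 and wraps to 0.
a = np.array([16, 1, 0], dtype=np.int8)
b = np.array([16, 2, 5], dtype=np.int8)
c = np.array([1, 3, 7], dtype=np.int8)

by_product = a * b * c != 0
by_logic = np.logical_and(a, np.logical_and(b, c))
print(by_product)  # [False  True False]
print(by_logic)    # [ True  True False]

# Floating-point underflow: 1e-200 * 1e-200 is too small for float64 and becomes 0.0.
x = np.array([1e-200])
y = np.array([1e-200])
z = np.array([1.0])
underflow_product = x * y * z != 0
underflow_logic = np.logical_and(x, np.logical_and(y, z))
print(underflow_product)  # [False]
print(underflow_logic)    # [ True]

# NaN: 0.0 * nan is nan, and nan != 0 is True, so the zero goes unnoticed.
p = np.array([0.0])
q = np.array([np.nan])
r = np.array([1.0])
nan_product = p * q * r != 0
nan_logic = np.logical_and(p, np.logical_and(q, r))
print(nan_product)  # [ True]
print(nan_logic)    # [False]
```

For small non-negative integers like the 0 to 4 range used in the timings, none of these cases arise, which is presumably why the two versions appeared to agree.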
For reasons I don't understand, the statement in Prune's answer that "the logical test is faster" seems to be correct only when the input vectors are already boolean. Perhaps the way in which coercion to boolean happens in the straight-up logical_and() method is slower than using .astype('bool').
The fastest way to go seems to be (1) coerce inputs to boolean ahead of time and then (2) use np.logical_and().
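That two-step recipe could be wrapped up roughly as follows. `all_nonzero` is a hypothetical helper name invented for this sketch, not a NumPy function, and whether it is actually the fastest option on a given machine would need to be timed.

```python
import numpy as np


def all_nonzero(*arrays):
    """Element-wise test that every input is non-zero at each position.

    Hypothetical helper: (1) coerce each input to bool, (2) combine
    the results with np.logical_and.
    """
    if not arrays:
        raise ValueError("at least one array is required")
    # astype(bool) returns a new array, so the inputs are never modified.
    result = np.asarray(arrays[0]).astype(bool)
    for arr in arrays[1:]:
        result = np.logical_and(result, np.asarray(arr).astype(bool))
    return result


a = np.array([1, 0, 2])
b = np.array([3.5, 4.0, 0.0])  # mixed dtypes are fine: each is coerced separately
c = np.array([5, 6, 7])
print(all_nonzero(a, b, c))  # [ True False False]
```

Coercing each array on its own also avoids the integer-to-float conversion that a mixed-dtype product would need.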