Is there a better way to count how many times a given row appears in a 2D NumPy array than this?
import numpy as np

def get_count(array_2d, row):
    count = 0
    # iterate over rows and compare each one with the target row
    for r in array_2d:
        if np.equal(r, row).all():
            count += 1
    return count

# let's make sure it works
array_2d = np.array([[1,2], [3,4]])
row = np.array([1,2])
count = get_count(array_2d, row)
assert count == 1
One simple way would be with broadcasting:
(array_2d == row).all(-1).sum()
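As a minimal sketch, the broadcasting one-liner can be wrapped in a function. The name count_row_broadcast and the sample data are made up for illustration; np.count_nonzero is used in place of .sum() to count the boolean matches.

```python
import numpy as np

def count_row_broadcast(array_2d, row):
    # Hypothetical helper name, not part of NumPy.
    # `row` has shape (n,), so it is broadcast against every row of the
    # (m, n) array, giving an (m, n) boolean array of elementwise matches.
    elementwise = array_2d == row
    # A row matches only if all of its elements match.
    row_matches = elementwise.all(axis=-1)
    # Count the True entries.
    return int(np.count_nonzero(row_matches))

array_2d = np.array([[1, 2], [3, 4], [1, 2]])
row = np.array([1, 2])
print(count_row_broadcast(array_2d, row))
```

Note that the intermediate boolean array has the same shape as array_2d, which is the memory cost of this approach.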
For better memory efficiency, here's an approach that treats each row of array_2d as an indexing tuple on an n-dimensional grid. It assumes the inputs contain only non-negative integers:
dims = np.maximum(array_2d.max(0),row) + 1
array_1d = np.ravel_multi_index(array_2d.T,dims)
row_scalar = np.ravel_multi_index(row,dims)
count = (array_1d==row_scalar).sum()
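The same steps can be sketched as a self-contained function. The name count_row_linear_index is hypothetical, and the sketch assumes non-negative integer inputs and that the product of the grid dimensions fits in a platform integer.

```python
import numpy as np

def count_row_linear_index(array_2d, row):
    # Hypothetical helper name, not part of NumPy.
    # Assumption: all entries are non-negative integers.
    # Grid extents large enough to hold every row of array_2d and `row`.
    dims = np.maximum(array_2d.max(axis=0), row) + 1
    # Collapse each row (one multi-index per row) into one linear index.
    array_1d = np.ravel_multi_index(tuple(array_2d.T), tuple(dims))
    # Collapse the target row the same way.
    row_scalar = np.ravel_multi_index(tuple(row), tuple(dims))
    # Two rows are equal exactly when their linear indices are equal.
    return int(np.count_nonzero(array_1d == row_scalar))

array_2d = np.array([[1, 2], [3, 4], [1, 2]])
row = np.array([1, 2])
print(count_row_linear_index(array_2d, row))
```

Because each row is reduced to a single integer before the comparison, the boolean intermediate has one entry per row instead of one per element.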
Here's a post discussing the various aspects related to it.
Note: np.count_nonzero can be much faster at counting booleans than summing with .sum(), so consider using it in both of the approaches above.
Here's a quick runtime test:
In [74]: arr = np.random.rand(10000)>0.5
In [75]: %timeit arr.sum()
10000 loops, best of 3: 29.6 µs per loop
In [76]: %timeit np.count_nonzero(arr)
1000000 loops, best of 3: 1.21 µs per loop