I am trying to get the frequency count (without the zeros) per sub-array in a numpy 3d-array. However, the scipy.stats.itemfreq tool returns the frequency count in a 2d array.
What I get is:
array_3d= array([[[1, 0, 0],
[1, 0, 0],
[0, 2, 0]],
[[0, 0, 0],
[0, 0, 3],
[3, 3, 3]],
[[0, 0, 4],
[0, 0, 4],
[0, 0, 4]]])
>>> itemfreq(array_3d)[1:,]
# outputs
array([[ 1, 2],
[ 2, 1],
[ 3, 4],
[ 4, 3]], dtype=int64)
While I would like the output:
array([[ 1, 2, 2, 1],
[ 3, 4],
[ 4, 3]], dtype=object)
The idea is that, within each row, the odd positions (1st, 3rd, ...) hold the unique values and the even positions (2nd, 4th, ...) hold their frequencies.
Another output could be:
array([[ 1, 2, 0],
[ 2, 1, 0],
[ 3, 4, 1],
[ 4, 3, 2]], dtype=int64)
where the third column is the index of the sub-array within the 3d array.
I am also open to other outputs/solutions!
Thanks in advance!
The numpy_indexed package (disclaimer: I am its author) can be used to solve this problem in an elegant and vectorized manner:
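import numpy as np
import numpy_indexed as npi
# label every element with the index of the sub-array it belongs to
index = np.arange(array_3d.size) // array_3d[0].size
# count each unique (value, sub-array index) pair
(value, index), count = npi.count((array_3d.flatten(), index))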
This then gives:
index = [0 0 0 1 1 2 2]
value = [0 1 2 0 3 0 4]
count = [6 2 1 5 4 6 3]
This can be post-processed by indexing with the mask value > 0 to drop the zeros, if so desired.
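For readers who would rather not add a dependency, roughly the same result can be sketched in plain NumPy. This is not the method from the answer above, only an alternative under the assumption that np.unique with the axis argument (NumPy 1.13 or later) is available. The names pairs, mask and result are illustrative. It builds the second output format from the question (value, count, sub-array index) and filters out the zeros with the value > 0 mask.

```python
import numpy as np

array_3d = np.array([[[1, 0, 0],
                      [1, 0, 0],
                      [0, 2, 0]],
                     [[0, 0, 0],
                      [0, 0, 3],
                      [3, 3, 3]],
                     [[0, 0, 4],
                      [0, 0, 4],
                      [0, 0, 4]]])

# Label every element with the index of the sub-array it belongs to.
index = np.repeat(np.arange(array_3d.shape[0]), array_3d[0].size)

# Pair each element with its sub-array index, then count the unique pairs.
# With axis=0, np.unique treats each row as one item and sorts the rows.
pairs = np.column_stack((index, array_3d.ravel()))
unique_pairs, count = np.unique(pairs, axis=0, return_counts=True)

# Drop the zeros and arrange the columns as: value, count, sub-array index.
mask = unique_pairs[:, 1] > 0
result = np.column_stack((unique_pairs[mask, 1], count[mask], unique_pairs[mask, 0]))
print(result)
```

Each row of result is one (value, count, sub-array index) triple, so grouping by the last column recovers the per-sub-array frequencies.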