
Group and summarize matrix by value

Tags: python, numpy

I have two matrices, prob and totalHigh, both of shape a x b x c x d. The first two axes (a and b) are coordinates. Here is a sample of each:

In [77]: prob[1,1,:]
Out[77]: 
array([[ 0.09,  0.01,  0.  ,  0.  ,  0.  ],
       [ 0.81,  0.09,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ]])

In [78]: totalHigh[1,1,:]
Out[78]: 
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6]])

totalHigh contains the outcomes, unfortunately spread over two dimensions, and prob contains the probabilities of those outcomes at the same positions. For example, the total probability of outcome 1 at coordinates 1,1 is 0.01 + 0.81 = 0.82.
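To make the grouping concrete, here is a minimal check of that one number using a boolean mask. The names prob_11 and high_11 are only illustrative stand-ins for prob[1,1] and totalHigh[1,1] from the samples above.

```python
import numpy as np

# Stand-ins for prob[1,1] and totalHigh[1,1] (values copied from the samples above).
prob_11 = np.array([[0.09, 0.01, 0.0, 0.0, 0.0],
                    [0.81, 0.09, 0.0, 0.0, 0.0],
                    [0.0,  0.0,  0.0, 0.0, 0.0]])
high_11 = np.array([[0, 1, 2, 3, 4],
                    [1, 2, 3, 4, 5],
                    [2, 3, 4, 5, 6]])

# Select every cell whose outcome is 1 and add up the matching probabilities.
total_p1 = prob_11[high_11 == 1].sum()
print(total_p1)  # approximately 0.82, up to floating-point rounding
```

Doing this once per outcome and per coordinate works but is slow, which is why a grouped sum is what is really being asked for.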

How can I remove the redundant dimension?

Expected Outcome

simplifiedHigh[1,1,:]
array([0, 1, 2, 3, 4, 5, 6])
simplifiedProb[1,1,:]
array([0.09, 0.82, 0.09, 0, 0, 0, 0])

How do I get that in the most efficient way?

asked Dec 18 '25 06:12 by FooBar


1 Answer

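You can use np.bincount, passing the probabilities as weights, together with np.unique -

import numpy as np

# totalHigh_sliced and prob_sliced are the 2D slices for one coordinate, e.g. totalHigh[1,1] and prob[1,1]
IDs = np.unique(totalHigh_sliced)
counts = np.bincount(totalHigh_sliced.ravel(), weights=prob_sliced.ravel())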

Sample run -

In [215]: prob_sliced
Out[215]: 
array([[ 0.09,  0.01,  0.  ,  0.  ,  0.  ],
       [ 0.81,  0.09,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ]])

In [216]: totalHigh_sliced
Out[216]: 
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6]])

In [217]: IDs = np.unique(totalHigh_sliced)
     ...: counts = np.bincount(totalHigh_sliced.ravel(),prob_sliced.ravel())
     ...: 

In [218]: IDs
Out[218]: array([0, 1, 2, 3, 4, 5, 6])

In [219]: counts
Out[219]: array([ 0.09,  0.82,  0.09,  0.  ,  0.  ,  0.  ,  0.  ])
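The sample run covers a single coordinate. As an extension that is not part of the original answer, one possible way to handle the whole a x b x c x d array in a single np.bincount call is to shift the outcomes of each coordinate into their own block of bins. This is a sketch that assumes the outcomes are non-negative integers; the helper name group_last_two_axes and the tiled toy data are made up for illustration.

```python
import numpy as np

def group_last_two_axes(prob, total_high):
    """Sum prob over the last two axes, grouped by the integer outcome in total_high."""
    a, b = prob.shape[:2]
    n_outcomes = int(total_high.max()) + 1  # assumes non-negative integer outcomes
    # Give every (i, j) coordinate its own block of n_outcomes bins.
    cell = np.arange(a * b).reshape(a, b, 1, 1)
    flat_ids = (cell * n_outcomes + total_high).ravel()
    sums = np.bincount(flat_ids, weights=prob.ravel(),
                       minlength=a * b * n_outcomes)
    return np.arange(n_outcomes), sums.reshape(a, b, n_outcomes)

# Toy data: the 3x5 sample from the question repeated on a 2x2 grid of coordinates.
sample_prob = np.array([[0.09, 0.01, 0.0, 0.0, 0.0],
                        [0.81, 0.09, 0.0, 0.0, 0.0],
                        [0.0,  0.0,  0.0, 0.0, 0.0]])
sample_high = np.arange(3)[:, None] + np.arange(5)[None, :]
prob = np.tile(sample_prob, (2, 2, 1, 1))
totalHigh = np.tile(sample_high, (2, 2, 1, 1))

simplifiedHigh, simplifiedProb = group_last_two_axes(prob, totalHigh)
print(simplifiedHigh)        # outcome labels 0..6
print(simplifiedProb[1, 1])  # grouped probabilities at coordinates 1,1
```

Two things differ from the expected outcome in the question. simplifiedHigh comes back here as one 1D array shared by all coordinates rather than a per-coordinate array. Also, np.bincount always returns one bin for every integer from 0 to the maximum, while np.unique lists only the values that actually occur, so if some outcomes are missing the two results of the per-slice version can have different lengths; indexing counts[IDs] lines them up again.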
answered Dec 20 '25 22:12 by Divakar


