I have the following data array m:
import numpy as np
a = [[1],[0],[1],[0],[0]]
b = [[0],[0],[1],[0],[1]]
c = d = [[1],[0],[1],[0],[0]]
m = np.hstack((a,b,c,d))
m
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [0, 1, 0, 0]])
I have the following vector prior
prior = [0.1,0.2,0.3,0.4]
I now want to create a new vector of length 5, where each row of m is reduced to a sum according to this scheme:
if the entry is 1, add 1/prior
if the entry is 0, add 0.1*1/prior
so for the first row in m we would get
(1/0.1)+(0.1*1/0.2)+(1/0.3)+(1/0.4) = 16.33
the second row is
(0.1*1/0.1)+(0.1*1/0.2)+(0.1*1/0.3)+(0.1*1/0.4) = 2.083
The solution should take m as its input, and numpy may be used (perhaps with .sum(axis=1))?
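The scheme can be spelled out as a plain double loop, which is useful as a reference to check vectorized answers against. This is only a sketch: the helper name `weighted_row_sums` and its `zero_scale` parameter are made up for illustration and do not appear in the question.

```python
import numpy as np

m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0],
              [0, 1, 0, 0]])
prior = [0.1, 0.2, 0.3, 0.4]

def weighted_row_sums(m, prior, zero_scale=0.1):
    """Hypothetical loop-based reference: a 1 adds 1/prior, anything else adds zero_scale/prior."""
    out = []
    for row in m:
        total = 0.0
        for value, p in zip(row, prior):
            total += (1.0 if value == 1 else zero_scale) / p
        out.append(total)
    return out

result = weighted_row_sums(m, prior)
```

The first two entries of `result` correspond to the two hand-computed rows above.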
UPDATE :
I'm also interested in a solution where m can contain more than two distinct integers. For example, I want a third rule for m==2 that adds 0.2*1/prior.
Since you are already using numpy, I would recommend numpy.where and numpy.sum. Note that this works only if you make prior a numpy.array.
p = np.asarray(prior)
np.sum(np.where(m,1./p,0.1/p),axis=1)
# array([ 16.33333333, 2.08333333, 20.83333333, 2.08333333, 6.58333333])
Note: np.where usually expects an array of bools as its condition. However, when you give it an array of integers, the number 0 is interpreted as False and everything else as True.
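A small illustration of that rule, using a made-up array `cond` that is not part of the question. One consequence worth noting: a 2 is treated exactly like a 1, so np.where alone cannot express a third rule.

```python
import numpy as np

cond = np.array([0, 1, 2])  # integers instead of bools (illustrative values)

# 0 selects the second argument; 1 and 2 both select the first
picked = np.where(cond, "nonzero", "zero")

# Converting to bool explicitly gives the same selection
same_as_bool = np.where(cond.astype(bool), "nonzero", "zero")
```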
Update
If you want to add a third rule for the occurrence of 2 in m, I would use np.choose instead of np.where. If you want to have 0.2/p for each occurrence of 2, you can do
p = np.asarray(prior)
p_vec = np.vstack((0.1/p,1./p,0.2/p))
np.choose(m,p_vec).sum(axis=1)
The idea is to first create an array p_vec whose rows are 0.1/p, 1./p and 0.2/p. np.choose then picks, for each element of m, the corresponding entry from the row of p_vec indexed by that element.
This can easily be extended to the integers 3, 4, ...: just add the corresponding rows to p_vec.
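As a sketch of that extension, assume a hypothetical fourth rule in which a 3 contributes 0.3/p. Both the factor 0.3 and the sample array `m3` are made up for illustration.

```python
import numpy as np

prior = [0.1, 0.2, 0.3, 0.4]
p = np.asarray(prior)

# Row k of p_vec holds the contribution for entries equal to k
p_vec = np.vstack((0.1/p, 1./p, 0.2/p, 0.3/p))

# Illustrative input containing the integers 0 to 3
m3 = np.array([[1, 0, 1, 1],
               [3, 2, 0, 1]])

out = np.choose(m3, p_vec).sum(axis=1)
```

With its default mode, np.choose raises an error if the index array contains a value that has no matching row in p_vec, so a missing rule is reported rather than silently ignored.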
Approach #1: Vectorized approach with boolean indexing -
# Calculate the reciprocal of prior as a numpy array
prior_reci = 1/np.asarray(prior)
# Mask of ones (1s) in array, m
mask = m==1
# Use the mask for m==1 and otherwise with proper scales: prior_reci
# and 0.1*prior_reci respectively and sum them up along the rows
out = (mask*prior_reci + ~mask*(0.1*prior_reci)).sum(1)
Sample run -
In [58]: m
Out[58]:
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [0, 1, 0, 0]])
In [59]: prior
Out[59]: [0.1, 0.2, 0.3, 0.4]
In [60]: prior_reci = 1/np.asarray(prior)
...: mask = m==1
...:
In [61]: (mask*prior_reci + ~mask*(0.1*prior_reci)).sum(1)
Out[61]: array([ 16.33333333, 2.08333333, 20.83333333, 2.08333333, 6.58333333])
Approach #2: Using matrix-multiplication with np.dot -
# Calculate the reciprocal of prior as a numpy array
prior_reci = 1/np.asarray(prior)
# For a binary m (only 0s and 1s), summing prior_reci along each row
# over the entries where m==1 is equivalent to np.dot(m,prior_reci).
# Similarly, for the entries where m==0 it is np.dot(1-m,0.1*prior_reci),
# i.e. with the new scaling 0.1*prior_reci.
# Finally, add the two parts together.
out = np.dot(m,prior_reci) + np.dot(1-m,0.1*prior_reci)
Sample run -
In [77]: m
Out[77]:
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0],
       [0, 1, 0, 0]])
In [78]: prior
Out[78]: [0.1, 0.2, 0.3, 0.4]
In [79]: prior_reci = 1/np.asarray(prior)
In [80]: np.dot(m,prior_reci) + np.dot(1-m,0.1*prior_reci)
Out[80]: array([ 16.33333333, 2.08333333, 20.83333333, 2.08333333, 6.58333333])
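Because 1-m serves as the complement of m, this approach assumes that m contains only 0s and 1s. Under that assumption the two dot products can be folded into one. The following is a sketch of that algebra, not part of the approach as originally posted; the names `two_dots` and `one_dot` are chosen here for illustration.

```python
import numpy as np

m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 0],
              [0, 1, 0, 0]])
prior = [0.1, 0.2, 0.3, 0.4]
prior_reci = 1/np.asarray(prior)

# Formulation with two dot products
two_dots = np.dot(m, prior_reci) + np.dot(1 - m, 0.1*prior_reci)

# dot(1-m, 0.1*r) equals 0.1*r.sum() - 0.1*dot(m, r),
# so the total collapses to a single dot product plus a constant
one_dot = 0.9*np.dot(m, prior_reci) + 0.1*prior_reci.sum()
```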
Runtime tests to compare the earlier listed two approaches -
In [102]: # Parameters
...: H = 1000
...: W = 1000
...:
...: # Create inputs
...: m = np.random.randint(0,2,(H,W))
...: prior = np.random.rand(W).tolist()
...:
In [103]: %%timeit
...: prior_reci1 = 1/np.asarray(prior)
...: mask = m==1
...: out1 = (mask*prior_reci1 + ~mask*(0.1*prior_reci1)).sum(1)
...:
100 loops, best of 3: 11.1 ms per loop
In [104]: %%timeit
...: prior_reci2 = 1/np.asarray(prior)
...: out2 = np.dot(m,prior_reci2) + np.dot(1-m,0.1*prior_reci2)
...:
100 loops, best of 3: 6 ms per loop
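These timings depend on the machine and the NumPy version. Before comparing speed, it is worth confirming that both approaches return the same values on an input of the benchmark's size. A minimal check follows; the seeded generator and the small offset added to the priors are choices made here for reproducibility and are not part of the original test.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed, for reproducibility only
H, W = 1000, 1000
m = rng.integers(0, 2, (H, W))          # binary input, as in the benchmark
prior = (rng.random(W) + 0.1).tolist()  # offset (an assumption) keeps priors away from zero

prior_reci = 1/np.asarray(prior)

# Approach #1: boolean mask
mask = m == 1
out1 = (mask*prior_reci + ~mask*(0.1*prior_reci)).sum(1)

# Approach #2: matrix multiplication
out2 = np.dot(m, prior_reci) + np.dot(1 - m, 0.1*prior_reci)

agree = np.allclose(out1, out2)
```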
A generic solution that handles multiple conditional checks can be written in a vectorized manner with np.einsum -
# Define scalars that are to be matched against input 2D array, m
matches = [0,1,2,3,4] # Edit this to accommodate more matching conditions
# Define multiplying factors for the reciprocal version of prior
prior_multfactors = [0.1,1,0.2,0.3,0.4] # Edit this corresponding to matches
# for different multiplying factors
# Thus, for the given matches and prior_multfactors, it means:
# when m==0, then do: 0.1/prior
# when m==1, then do: 1/prior
# when m==2, then do: 0.2/prior
# when m==3, then do: 0.3/prior
# when m==4, then do: 0.4/prior
# Define prior list
prior = [0.1,0.2,0.3,0.4]
# Calculate the reciprocal of prior as a numpy array
prior_reci = 1/np.asarray(prior)
# Compare every element of m against each value in matches, producing
# a 3D boolean mask of shape (len(matches), m.shape[0], m.shape[1])
mask = m==np.asarray(matches)[:,None,None]
# Get the scaling factors for each match across each prior_reci value
scales = np.asarray(prior_multfactors)[:,None]*prior_reci
# Einsum-mation to give sum across rows corresponding to all matches
out = np.einsum('ijk,ik->j',mask,scales)
Sample run -
In [203]: m
Out[203]:
array([[1, 0, 1, 1],
       [0, 0, 0, 0],
       [4, 2, 3, 1],
       [0, 0, 0, 0],
       [0, 4, 2, 0]])
In [204]: matches, prior_multfactors
Out[204]: ([0, 1, 2, 3, 4], [0.1, 1, 0.2, 0.3, 0.4])
In [205]: prior
Out[205]: [0.1, 0.2, 0.3, 0.4]
In [206]: prior_reci = 1/np.asarray(prior)
...: mask = m==np.asarray(matches)[:,None,None]
...: scales = np.asarray(prior_multfactors)[:,None]*prior_reci
...:
In [207]: np.einsum('ijk,ik->j',mask,scales)
Out[207]: array([ 16.33333333, 2.08333333, 8.5 , 2.08333333, 3.91666667])
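When the values in matches are exactly 0, 1, ..., n-1, as they are here, the same result can also be obtained by using m itself as an index into the list of factors. This is an alternative sketch rather than the einsum method above, and it relies on that assumption about matches; the names `factors` and `out_lookup` are chosen here for illustration.

```python
import numpy as np

m = np.array([[1, 0, 1, 1],
              [0, 0, 0, 0],
              [4, 2, 3, 1],
              [0, 0, 0, 0],
              [0, 4, 2, 0]])
prior = [0.1, 0.2, 0.3, 0.4]
prior_multfactors = [0.1, 1, 0.2, 0.3, 0.4]

prior_reci = 1/np.asarray(prior)

# factors[i, j] is the multiplying factor selected by the integer m[i, j]
factors = np.asarray(prior_multfactors)[m]
out_lookup = (factors*prior_reci).sum(axis=1)

# Cross-check against the einsum formulation
matches = [0, 1, 2, 3, 4]
mask = m == np.asarray(matches)[:, None, None]
scales = np.asarray(prior_multfactors)[:, None]*prior_reci
out_einsum = np.einsum('ijk,ik->j', mask, scales)
```

The lookup avoids building the 3D mask, whose size grows with the number of matches, but it cannot handle match values that are not usable as array indices.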