When I use this random generator, numpy.random.multinomial, I keep getting:
ValueError: sum(pvals[:-1]) > 1.0
I am always passing the output of this softmax function:
def softmax(w, t = 1.0):
    e = numpy.exp(numpy.array(w) / t)
    dist = e / numpy.sum(e)
    return dist
except now that I am getting this error, I also added this for the parameter (pvals):
while numpy.sum(pvals) > 1:
    pvals /= (1+1e-5)
but that didn't solve it. What is the right way to make sure I avoid this error?
EDIT: here is the function that includes this code:
def get_MDN_prediction(vec):
    coeffs = vec[::3]
    means = vec[1::3]
    stds = np.log(1+np.exp(vec[2::3]))
    stds = np.maximum(stds, min_std)
    coe = softmax(coeffs)
    while np.sum(coe) > 1-1e-9:
        coe /= (1+1e-5)
    coeff = unhot(np.random.multinomial(1, coe))
    return np.random.normal(means[coeff], stds[coeff])
I also encountered this problem during my language modelling work.
The root of this problem is numpy's implicit data casting: the output of my softmax() is of type float32, but numpy.random.multinomial() IMPLICITLY casts pvals to float64. This cast can make pvals.sum() exceed 1.0 due to numerical rounding.
This issue is recognized and posted here
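To see the effect on your own data, here is a minimal diagnostic sketch. It assumes the probabilities come from a float32 softmax and mirrors the quantity named in the error message, sum(pvals[:-1]), after a cast to float64. The helper name excess_over_one is made up for illustration; it is not part of NumPy.

```python
import numpy as np

def excess_over_one(p):
    """Return sum(p[:-1]) - 1 computed in float64 (hypothetical diagnostic helper)."""
    p64 = np.asarray(p, dtype=np.float64)   # mimic the cast to double precision
    return float(p64[:-1].sum() - 1.0)

# Build float32 probabilities the way a float32 softmax would.
rng = np.random.default_rng(0)
logits = rng.normal(size=1000).astype(np.float32)
e = np.exp(logits - logits.max())
p32 = e / e.sum()                            # stays float32

print("float32 sum:", p32.sum())
print("float64 sum of the same values:", p32.astype(np.float64).sum())
print("excess over 1 seen after the cast:", excess_over_one(p32))
```

A positive excess beyond a tiny tolerance is the situation in which the ValueError would be expected. Whether it shows up depends on the particular values, so the printed numbers are not guaranteed to reproduce the error.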
I know the question is old but since I faced the same problem just now, it seems to me it's still valid. Here's the solution I've found for it:
a = np.asarray(a).astype('float64')
a = a / np.sum(a)
b = np.random.multinomial(1, a, 1)
The important part is the astype('float64') cast, followed by renormalizing in that type. If you omit the cast, the problem you've mentioned will happen from time to time. But if you change the type of the array to float64 before normalizing, it will never happen.
Something that few people noticed: a robust version of the softmax can be easily obtained by subtracting the logsumexp from the values:
import numpy as np
from scipy.special import logsumexp  # scipy.misc.logsumexp is no longer available in current SciPy

def log_softmax(vec):
    return vec - logsumexp(vec)

def softmax(vec):
    return np.exp(log_softmax(vec))
Just check it:
print(softmax(np.array([1.0, 0.0, -1.0, 1.1])))
Simple, isn't it?
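If SciPy is not available, the same idea can be sketched in plain NumPy by subtracting the maximum before exponentiating, which is algebraically the same softmax. The names softmax_naive and softmax_shifted below are illustrative only, and the comparison assumes float64 inputs.

```python
import numpy as np

def softmax_naive(vec):
    # Direct definition: overflows once any entry is larger than roughly 709.
    e = np.exp(np.asarray(vec, dtype=np.float64))
    return e / e.sum()

def softmax_shifted(vec):
    # Shift so the largest exponent is exp(0) = 1; the result is unchanged mathematically.
    vec = np.asarray(vec, dtype=np.float64)
    e = np.exp(vec - vec.max())
    return e / e.sum()

big = np.array([1000.0, 1001.0, 1002.0])
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(big)      # exp overflows to inf, and inf / inf gives nan
shifted = softmax_shifted(big)      # finite probabilities that sum to 1

print(naive)
print(shifted)
```

For moderate inputs the two functions agree, so the shifted version can be swapped in without changing results.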
The softmax implementation I was using is not stable enough for the values I was using it with. As a result, sometimes the output has a sum greater than 1 (e.g. 1.0000024...).
This case should be handled by the while loop. But sometimes the output contains NaNs, in which case the loop is never triggered, and the error persists.
Also, numpy.random.multinomial doesn't raise an error if it sees a NaN.
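The loop is skipped because every ordering comparison with NaN evaluates to False, so a condition such as np.sum(p) > 1 can never be true once a NaN is present. Below is a small sketch of an explicit check to run before sampling; check_probs is a hypothetical helper, not a NumPy function.

```python
import numpy as np

def check_probs(p):
    """Validate and renormalize a probability vector (hypothetical helper)."""
    p = np.asarray(p, dtype=np.float64)
    if not np.all(np.isfinite(p)):
        raise ValueError("probabilities contain NaN or inf")
    if np.any(p < 0):
        raise ValueError("probabilities contain negative values")
    return p / p.sum()

bad = np.array([0.5, np.nan, 0.5])
loop_would_run = bool(np.sum(bad) > 1)   # the sum is NaN, and NaN > 1 is False
print(loop_would_run)
```

Failing loudly here points at the softmax that produced the NaN, rather than at the sampler.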
Here is what I'm using right now, instead:
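def softmax(vec):
    # vec must be a float NumPy array; it is modified in place.
    # The original used min(A(vec)); A is presumably an alias for np.array.
    vec -= np.min(vec)
    if np.max(vec) > 700:
        a = np.argsort(vec)
        aa = np.argsort(a)
        vec = vec[a]
        i = 0
        while np.max(vec) > 700:
            i += 1
            vec -= vec[i]
        vec = vec[aa]
    e = np.exp(vec)
    return e/np.sum(e)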
def sample_multinomial(w):
    """
    Sample multinomial distribution with parameters given by softmax of w
    Returns an int
    """
    p = softmax(w)
    x = np.random.uniform(0,1)
    for i,v in enumerate(np.cumsum(p)):
        if x < v: return i
    return len(p)-1 # shouldn't happen...
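The cumulative-sum lookup above can also be sketched without the Python loop, using np.searchsorted. This version takes the probabilities directly rather than the softmax input, and the name sample_index and the optional rng argument are assumptions made for illustration. Drawing the uniform number from [0, cdf[-1]) instead of [0, 1) means a total that is slightly off from 1 does not matter.

```python
import numpy as np

def sample_index(p, rng=None):
    """Draw one index from the discrete distribution p (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    cdf = np.cumsum(np.asarray(p, dtype=np.float64))
    x = rng.uniform(0.0, cdf[-1])                      # scale by the actual total
    idx = int(np.searchsorted(cdf, x, side="right"))   # count of cdf entries <= x
    return min(idx, len(cdf) - 1)                      # clamp in case x rounds up to cdf[-1]

rng = np.random.default_rng(0)
draws = [sample_index([0.2, 0.8], rng) for _ in range(2000)]
print(sum(draws) / len(draws))   # fraction of draws that picked index 1
```

Because no sum-to-one check is involved, this approach cannot raise the ValueError from the question, though it still gives meaningless results if p contains NaNs.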