I am playing around with Keras and trying to predict a word from its context. For example, from the sentence "I have to say the food was tasty!" I hope to get something like this:
[say the ? was tasty] -> food, meals, spaghetti, drinks
However, my current problem is that the network I am training appears to learn only the overall probability of each word, not the probability a word has in a particular context.
Since the word frequencies are not balanced, I thought I should apply weights to my loss function, which is currently binary cross-entropy.
I simply multiply the error for each word by the complement of that word's probability (1 - word_weights):
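from keras import backend as K

def weighted_binary_crossentropy(y_true, y_pred):
    # word_weights holds the per-word probabilities and is defined elsewhere.
    # Note: K.binary_crossentropy takes (target, output) in Keras 2, while
    # older releases used (output, target); check the order for your version.
    return K.mean(K.binary_crossentropy(y_pred, y_true) * (1 - word_weights), axis=1)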
The model uses this function as its loss:
model.compile(optimizer='adam', loss=weighted_binary_crossentropy)
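For reference, the quantity this loss computes for a single sample can be sketched in plain Python. This is only an illustration under assumptions: the three-word vocabulary, its frequencies, and the predictions are made up, and the helper names binary_crossentropy and weighted_bce are hypothetical, not Keras functions.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Element-wise binary cross-entropy for two equal-length lists."""
    losses = []
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        losses.append(-(t * math.log(p) + (1.0 - t) * math.log(1.0 - p)))
    return losses

def weighted_bce(y_true, y_pred, word_weights):
    """Mean over the vocabulary of BCE scaled by (1 - word frequency)."""
    per_word = binary_crossentropy(y_true, y_pred)
    scaled = [l * (1.0 - w) for l, w in zip(per_word, word_weights)]
    return sum(scaled) / len(scaled)

# Hypothetical 3-word vocabulary: frequencies of "the", "food", "spaghetti".
word_weights = [0.90, 0.09, 0.01]
y_true = [0.0, 1.0, 0.0]   # the target word is "food"
y_pred = [0.7, 0.2, 0.1]   # the model favours the frequent word "the"

unweighted = sum(binary_crossentropy(y_true, y_pred)) / len(y_true)
weighted = weighted_bce(y_true, y_pred, word_weights)
```

In this sketch the error on the frequent word "the" is scaled by 0.1 while the error on "food" is scaled by 0.91, so the weighting shrinks the contribution of frequent words to the loss.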
However, my results are exactly the same, and I am not sure whether my model is broken or whether I am using the loss parameter/function incorrectly.
Is my weighted_binary_crossentropy() function doing what I just described? I ask because, for some reason, this variant behaves similarly:
word_weights), axis=1)
Actually, as the documentation of the fit function explains, you can pass a sample_weight argument, which seems to be exactly what you want to use.
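A minimal sketch of how such per-sample weights might be built, assuming each training sample has exactly one target word and that a weight of 1 minus the word's relative frequency is what you want. The helper make_sample_weights and the example targets are hypothetical; the fit call is shown only as a comment because x_train and y_train are placeholders.

```python
from collections import Counter

def make_sample_weights(target_words):
    """One weight per training sample: 1 - relative frequency of its target word."""
    counts = Counter(target_words)
    total = len(target_words)
    return [1.0 - counts[w] / total for w in target_words]

# Hypothetical target words, one per training sample.
targets = ["the", "the", "the", "food", "the", "spaghetti"]
weights = make_sample_weights(targets)

# Assumed usage with the compiled model (x_train and y_train are placeholders):
# model.fit(x_train, y_train, sample_weight=np.asarray(weights))
```

Keep in mind that sample_weight scales the loss of a whole sample rather than of individual output units, so it fits the case where every sample predicts a single word.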