I'm working on a sentence labeling problem. I've done the embedding and padding myself, and my inputs look like:
X_i = [[0,1,1,0,2,3...], [0,1,1,0,2,3...], ..., [0,0,0,0,0...], [0,0,0,0,0...], ....]
For every word in a sentence I want to predict one of four classes, so my desired output should look like:
Y_i = [[1,0,0,0], [0,0,1,0], [0,1,0,0], ...]
My simple network architecture is:
model = Sequential()
model.add(LSTM(input_dim=emb, output_dim=hidden, return_sequences=True))  # input per sample is (timesteps, emb)
model.add(TimeDistributedDense(output_dim=4))
model.add(Activation('softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X_train, Y_train, batch_size=32, nb_epoch=3, validation_data=(X_test, Y_test), verbose=1, show_accuracy=True)
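For reference, a modern-Keras sketch of the same per-word architecture that also masks the zero padding (a sketch under assumptions: the layer names are the real Keras API, but `maxlen`, `emb`, `hidden` and the random batch below are illustrative, not from my actual data):

```python
import numpy as np
from tensorflow.keras import layers, models

maxlen, emb, hidden = 10, 8, 16   # illustrative sizes

model = models.Sequential([
    layers.Input(shape=(maxlen, emb)),
    layers.Masking(mask_value=0.0),              # all-zero padded timesteps are skipped
    layers.LSTM(hidden, return_sequences=True),  # one output per word
    layers.TimeDistributed(layers.Dense(4, activation='softmax')),  # 4 classes per word
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

x = np.random.randn(2, maxlen, emb).astype('float32')  # fake batch of 2 sentences
preds = model.predict(x, verbose=0)
print(preds.shape)  # (2, 10, 4)
```

With the `Masking` layer, padded timesteps do not contribute to the loss or metrics.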
It shows approximately 95% accuracy while training, but when I try to predict new sentences with the trained model, the results are really bad. It looks like the model just learned some classes for the first words and outputs them every time. I think the problem could be:
The padding I wrote myself (zero vectors at the end of each sentence): can it make learning worse?
Maybe I should train on sentences of different lengths, without padding (if so, can you help me train such a model in Keras?)
A wrong training objective: I tried mean squared error, binary cross-entropy, and others, but nothing changes.
Something with TimeDistributedDense and softmax: I think I've got how they work, but I'm still not 100% sure.
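The padding suspicion can be checked with plain NumPy: if the all-zero timesteps are not masked, they inflate the training accuracy and deflate the loss, because the model gets credit for predicting the padding label (a toy illustration; all numbers are made up):

```python
import numpy as np

# One sentence of 3 real words, padded to length 6; 4 classes.
y_true = np.zeros((6, 4))
y_true[0, 1] = y_true[1, 2] = y_true[2, 0] = 1.0  # real labels
y_true[3:, 0] = 1.0                               # padding labeled as class 0

# A degenerate model that always predicts class 0 with high confidence.
y_pred = np.full((6, 4), 0.02)
y_pred[:, 0] = 0.94

mask = np.array([1., 1., 1., 0., 0., 0.])         # 1 = real word, 0 = padding

hits = (y_pred.argmax(-1) == y_true.argmax(-1))
acc_unmasked = hits.mean()                        # 4/6: looks decent
acc_masked = (hits * mask).sum() / mask.sum()     # 1/3: the real picture

ce = -np.sum(y_true * np.log(y_pred), axis=-1)    # per-timestep cross-entropy
unmasked_loss = ce.mean()
masked_loss = (ce * mask).sum() / mask.sum()
print(acc_unmasked, acc_masked, unmasked_loss < masked_loss)
```

So a model that only ever predicts the padding class can still report high training accuracy, which matches the symptom described above.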
I'd be glad to see any hint or help regarding this problem, thank you!
I personally think that you misunderstand what "sequence labeling" means.
Do you mean that X is a list of sentences, where each element X[i] is a word sequence of arbitrary length, Y[i] is the category of X[i], and the one-hot form of Y[i] is an array like [0, 1, 0, 0]? If so, then it's not a sequence labeling problem, it's a classification problem.
Don't use TimeDistributedDense, and if it is a multi-class classification problem, i.e., len(Y[i]) > 2, then use "categorical_crossentropy" instead of "binary_crossentropy".
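To make the loss advice concrete, here is what the two objectives compute for one softmax output and a one-hot target, sketched in plain NumPy (the numbers are illustrative):

```python
import numpy as np

y_true = np.array([0., 0., 1., 0.])        # one-hot target: class 2 of 4
y_pred = np.array([0.1, 0.1, 0.7, 0.1])    # softmax output

# categorical_crossentropy: one term for the whole 4-way decision
cat_ce = -np.sum(y_true * np.log(y_pred))  # -log(0.7), about 0.357

# binary_crossentropy: averages 4 independent yes/no problems,
# which is the wrong model for mutually exclusive classes
bin_ce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(round(cat_ce, 3), round(bin_ce, 3))
```

With a softmax output and exactly one class per example, categorical_crossentropy is the matching objective; binary_crossentropy in this setup also tends to make the reported accuracy look misleadingly high.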