I'm working on a sentence labeling problem. I've done the embedding and padding myself, and my inputs look like:
X_i = [[0,1,1,0,2,3...], [0,1,1,0,2,3...], ..., [0,0,0,0,0...], [0,0,0,0,0...], ....]
For every word in a sentence I want to predict one of four classes, so my desired output should look like:
Y_i = [[1,0,0,0], [0,0,1,0], [0,1,0,0], ...]
My simple network architecture is:
model = Sequential()
model.add(LSTM(input_dim=emb, output_dim=hidden, return_sequences=True))  # input per sample is (timesteps, emb)
model.add(TimeDistributedDense(output_dim=4))
model.add(Activation('softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(X_train, Y_train, batch_size=32, nb_epoch=3, validation_data=(X_test, Y_test), verbose=1, show_accuracy=True)
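For reference, a modern-Keras sketch of the same per-word architecture that also masks the zero padding (a sketch under assumptions: the layer names are the real Keras API, but `maxlen`, `emb`, `hidden` and the random batch below are illustrative, not from my actual data):

```python
import numpy as np
from tensorflow.keras import layers, models

maxlen, emb, hidden = 10, 8, 16   # illustrative sizes

model = models.Sequential([
    layers.Input(shape=(maxlen, emb)),
    layers.Masking(mask_value=0.0),              # all-zero padded timesteps are skipped
    layers.LSTM(hidden, return_sequences=True),  # one output per word
    layers.TimeDistributed(layers.Dense(4, activation='softmax')),  # 4 classes per word
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

x = np.random.randn(2, maxlen, emb).astype('float32')  # fake batch of 2 sentences
preds = model.predict(x, verbose=0)
print(preds.shape)  # (2, 10, 4)
```

With the `Masking` layer, padded timesteps do not contribute to the loss or metrics.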
It shows approximately 95% accuracy while training, but when I try to predict new sentences with the trained model, the results are really bad. It looks like the model just learned some classes for the first words and outputs them every time. I think the problem could be:
The padding I wrote myself (zero vectors at the end of each sentence): can it make learning worse?
Maybe I should train on sentences of different lengths, without padding (if so, can you help me train such a model in Keras?)
A wrong training objective: I tried mean squared error, binary cross-entropy, and others, but nothing changes.
Something with TimeDistributedDense and softmax: I think I've got how they work, but I'm still not 100% sure.
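The padding suspicion can be checked with plain NumPy: if the all-zero timesteps are not masked, they inflate the training accuracy and deflate the loss, because the model gets credit for predicting the padding label (a toy illustration; all numbers are made up):

```python
import numpy as np

# One sentence of 3 real words, padded to length 6; 4 classes.
y_true = np.zeros((6, 4))
y_true[0, 1] = y_true[1, 2] = y_true[2, 0] = 1.0  # real labels
y_true[3:, 0] = 1.0                               # padding labeled as class 0

# A degenerate model that always predicts class 0 with high confidence.
y_pred = np.full((6, 4), 0.02)
y_pred[:, 0] = 0.94

mask = np.array([1., 1., 1., 0., 0., 0.])         # 1 = real word, 0 = padding

hits = (y_pred.argmax(-1) == y_true.argmax(-1))
acc_unmasked = hits.mean()                        # 4/6: looks decent
acc_masked = (hits * mask).sum() / mask.sum()     # 1/3: the real picture

ce = -np.sum(y_true * np.log(y_pred), axis=-1)    # per-timestep cross-entropy
unmasked_loss = ce.mean()
masked_loss = (ce * mask).sum() / mask.sum()
print(acc_unmasked, acc_masked, unmasked_loss < masked_loss)
```

So a model that only ever predicts the padding class can still report high training accuracy, which matches the symptom described above.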
I'd be glad to see any hint or help regarding this problem, thank you!
I personally think that you misunderstand what "sequence labeling" means.
Do you mean that X is a list of sentences, where each element X[i] is a word sequence of arbitrary length, Y[i] is the category of X[i], and the one-hot form of Y[i] is an array like [0, 1, 0, 0]? If so, then it's not a sequence labeling problem, it's a classification problem.
Don't use TimeDistributedDense, and if it is a multi-class classification problem, i.e., len(Y[i]) > 2, then use "categorical_crossentropy" instead of "binary_crossentropy".
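To make the loss advice concrete, here is what the two objectives compute for one softmax output and a one-hot target, sketched in plain NumPy (the numbers are illustrative):

```python
import numpy as np

y_true = np.array([0., 0., 1., 0.])        # one-hot target: class 2 of 4
y_pred = np.array([0.1, 0.1, 0.7, 0.1])    # softmax output

# categorical_crossentropy: one term for the whole 4-way decision
cat_ce = -np.sum(y_true * np.log(y_pred))  # -log(0.7), about 0.357

# binary_crossentropy: averages 4 independent yes/no problems,
# which is the wrong model for mutually exclusive classes
bin_ce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(round(cat_ce, 3), round(bin_ce, 3))
```

With a softmax output and exactly one class per example, categorical_crossentropy is the matching objective; binary_crossentropy in this setup also tends to make the reported accuracy look misleadingly high.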