
How to perform multi-label learning with LSTM using theano?

I have some text data in which each document has multiple labels, and I want to train an LSTM network on it using Theano. I came across http://deeplearning.net/tutorial/lstm.html, but that tutorial only covers a binary classification task. Any suggestions on which method to proceed with would be great. I just need a feasible starting direction that I can build on.

Thanks, Amit

Amit Gupta asked Mar 17 '15 14:03



2 Answers

1) Change the last layer of the model. That is, the line

pred = tensor.nnet.softmax(tensor.dot(proj, tparams['U']) + tparams['b'])

should be replaced by a layer that scores each label independently, e.g. a sigmoid:

pred = tensor.nnet.sigmoid(tensor.dot(proj, tparams['U']) + tparams['b'])

2) Change the cost as well. That is, the line

cost = -tensor.log(pred[tensor.arange(n_samples), y] + off).mean()

should be replaced by a cost suited to independent labels, e.g. binary cross-entropy summed over the labels:

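# Uses the same module names as the lines above: numpy and theano.tensor (imported as tensor).
one = numpy.float32(1.0)
pred = tensor.clip(pred, 0.0001, 0.9999)  # keep pred away from 0 and 1 so the log stays finite
cost = -tensor.sum(y * tensor.log(pred) + (one - y) * tensor.log(one - pred), axis=1)  # sum over all labels
cost = tensor.mean(cost, axis=0)  # mean over samples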

3) In the function build_model(tparams, options), you should replace:

y = tensor.vector('y', dtype='int64')

by

y = tensor.matrix('y', dtype='int64')  # one row per sample, one 0/1 entry per label, e.g. [1 0 0 1 0]

sklearn.preprocessing.MultiLabelBinarizer() may be handy for building these rows.

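4) Change pred_error() so that it supports multi-label output, e.g. by thresholding each sigmoid output (rather than taking a single argmax) and scoring the result with a multi-label metric such as accuracy or F1 score from scikit-learn.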
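
To make the four steps concrete, here is a minimal NumPy sketch of the same arithmetic outside Theano. It is only an illustration, not the tutorial's code: the function names (sigmoid, multilabel_cross_entropy, multilabel_error), the 0.5 threshold, and the toy logits are all assumptions made for this example. The error measure shown is the fraction of wrong individual label decisions (Hamming loss), which is one possible choice among several.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function: one independent probability per label.
    return 1.0 / (1.0 + np.exp(-z))

def multilabel_cross_entropy(pred, y, eps=1e-4):
    # Clip so that log() never sees exactly 0 or 1.
    pred = np.clip(pred, eps, 1.0 - eps)
    # Binary cross-entropy per label, summed over labels for each sample.
    per_sample = -np.sum(y * np.log(pred) + (1.0 - y) * np.log(1.0 - pred), axis=1)
    # Mean over samples.
    return float(per_sample.mean())

def multilabel_error(pred, y, threshold=0.5):
    # Turn probabilities into hard 0/1 decisions, one per label.
    hard = (pred >= threshold).astype(y.dtype)
    # Fraction of individual label decisions that are wrong (Hamming loss).
    return float(np.mean(hard != y))

# Toy data: 2 samples, 3 labels. Rows of y are multi-hot targets.
logits = np.array([[2.0, -1.0, 0.5],
                   [-3.0, -0.5, 4.0]])
y = np.array([[1, 1, 1],
              [0, 0, 1]])

pred = sigmoid(logits)
cost = multilabel_cross_entropy(pred, y)
err = multilabel_error(pred, y)
print("cost:", cost, "error:", err)
```

In this toy batch only the second label of the first sample is predicted wrongly, so one of the six label decisions is an error. If a stricter measure is preferred, the comparison can be made per row instead, counting a sample as correct only when every label matches.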

Franck Dernoncourt answered Sep 28 '22 16:09



You can change the last layer of the model so that it produces one output per label. The target for each document is then a vector in which each element is 0 or 1, depending on whether that label applies to the document.
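
As an illustration of such 0/1 target vectors, the following plain-Python sketch converts per-document label lists into multi-hot rows. The helper name multi_hot and the example labels are hypothetical, chosen only for this example; scikit-learn's MultiLabelBinarizer provides similar functionality if that library is available.

```python
def multi_hot(label_sets, classes=None):
    # Fix a column order for the labels (sorted, unless one is supplied).
    if classes is None:
        classes = sorted({label for labels in label_sets for label in labels})
    index = {label: i for i, label in enumerate(classes)}

    rows = []
    for labels in label_sets:
        row = [0] * len(classes)
        for label in labels:
            row[index[label]] = 1  # mark every label the document carries
        rows.append(row)
    return classes, rows

# Hypothetical labels for three documents; the last one has no labels at all.
docs_labels = [["sports", "politics"], ["tech"], []]
classes, y = multi_hot(docs_labels)

print(classes)  # ['politics', 'sports', 'tech']
print(y)        # [[1, 1, 0], [0, 0, 1], [0, 0, 0]]
```

Passing an explicit classes list keeps the column order identical between the training and test sets, which matters because each output unit of the network corresponds to one fixed column.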

nouiz answered Sep 28 '22 16:09
