 

Sequence Labeling in TensorFlow

I have managed to train a word2vec model with TensorFlow, and I want to feed the resulting embeddings into an RNN with LSTM cells for sequence labeling.

1) It's not really clear how to use the trained word2vec model with an RNN. (How do I feed in the result?)

2) I can't find much documentation on how to implement an LSTM for sequence labeling. (How do I bring in my labels?)

Could someone point me in the right direction on how to start with this task?

Milan asked Dec 25 '15


People also ask

What is sequence labeling in machine learning?

In machine learning, sequence labeling is a type of pattern recognition task that involves the algorithmic assignment of a categorical label to each member of a sequence of observed values. Part-of-speech tagging is a classic example: every word in a sentence gets a tag such as NOUN or VERB.

What is BiLSTM CRF?

The BiLSTM-CRF is a recurrent neural network obtained from the combination of a bidirectional long short-term memory (LSTM) network and a conditional random field (CRF) (Huang et al., 2015; Lample et al., 2016).
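To make that concrete, here is a minimal BiLSTM-CRF sketch in TF 1.x-style TensorFlow, where the CRF ops lived under tf.contrib.crf. All sizes and placeholder names here are illustrative, not from the original question:

```python
import tensorflow as tf  # TF 1.x-style API; module paths vary by version

num_features, num_tags, num_units = 128, 9, 64  # illustrative sizes

inputs = tf.placeholder(tf.float32, [None, None, num_features])  # [batch, time, feat]
tag_ids = tf.placeholder(tf.int32, [None, None])                 # gold tag per step
seq_len = tf.placeholder(tf.int32, [None])                       # true lengths

# Bidirectional LSTM over the input sequence.
fw_cell = tf.nn.rnn_cell.LSTMCell(num_units)
bw_cell = tf.nn.rnn_cell.LSTMCell(num_units)
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    fw_cell, bw_cell, inputs, sequence_length=seq_len, dtype=tf.float32)

# Per-step tag scores from the concatenated forward/backward outputs.
logits = tf.layers.dense(tf.concat([out_fw, out_bw], axis=-1), num_tags)

# The CRF layer scores whole tag sequences rather than independent steps.
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
    logits, tag_ids, seq_len)
loss = tf.reduce_mean(-log_likelihood)
```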


2 Answers

I suggest you start by reading the RNN tutorial and sequence-to-sequence tutorial. They explain how to build LSTMs in TensorFlow. Once you're comfortable with that, you'll have to find the right embedding Variable and assign it using your pre-trained word2vec model.
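For the word2vec part, here is a minimal sketch of what that assignment could look like in TF 1.x-style TensorFlow. The file name word2vec_embeddings.npy and all sizes are hypothetical; the point is to create the embedding variable with the right shape and overwrite its random initialization with your pre-trained vectors:

```python
import numpy as np
import tensorflow as tf  # TF 1.x-style API

vocab_size, embed_dim = 10000, 128  # must match your word2vec training run

# Hypothetical file holding your trained word2vec vectors,
# a NumPy array of shape [vocab_size, embed_dim].
w2v_matrix = np.load("word2vec_embeddings.npy")

# Embedding variable; trainable=False keeps the pre-trained vectors fixed.
embedding = tf.get_variable("embedding", [vocab_size, embed_dim],
                            trainable=False)
embedding_ph = tf.placeholder(tf.float32, [vocab_size, embed_dim])
assign_embedding = embedding.assign(embedding_ph)

word_ids = tf.placeholder(tf.int32, [None, None])  # [batch, time]
rnn_inputs = tf.nn.embedding_lookup(embedding, word_ids)  # feed these to the RNN

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(assign_embedding, {embedding_ph: w2v_matrix})
```

Set trainable=True instead if you want to fine-tune the embeddings during training.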

Lukasz Kaiser answered Sep 21 '22


I realize this was posted a while ago, but I found this Gist on sequence labeling and this Gist on variable-length sequence labeling really helpful when I was figuring this out. The basic outline (the gist of the Gists):

  1. Use dynamic_rnn to handle unrolling your network for training and prediction. This method has moved around in the API over time, so you may have to hunt down its location for your version. There's a minimal sketch after this list.
  2. Arrange your data into batches of shape [batch_size, sequence_length, num_features], and your labels into batches of shape [batch_size, sequence_length, num_classes]. Note that you want a label for every time step in your sequence.
  3. For variable-length sequences, pass the true length of each sequence in your batch to the sequence_length argument of dynamic_rnn.
  4. Training the RNN is very similar to training any other neural network once you have the network structure defined: feed it training data and target labels and watch it learn!
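Here is a minimal sketch of steps 1-3 in TF 1.x-style TensorFlow (the location of dynamic_rnn and the name of the cross-entropy op vary across versions, and all sizes are illustrative):

```python
import tensorflow as tf  # TF 1.x-style API

num_features, num_classes, num_units = 128, 10, 64  # illustrative sizes

# Step 2: data of shape [batch, time, features], labels of shape
# [batch, time, classes] -- one label per time step.
inputs = tf.placeholder(tf.float32, [None, None, num_features])
labels = tf.placeholder(tf.float32, [None, None, num_classes])
# Step 3: the true length of each sequence in the batch.
seq_len = tf.placeholder(tf.int32, [None])

# Step 1: dynamic_rnn unrolls the LSTM; steps past seq_len produce zeros.
cell = tf.nn.rnn_cell.LSTMCell(num_units)
outputs, _ = tf.nn.dynamic_rnn(cell, inputs, sequence_length=seq_len,
                               dtype=tf.float32)

# Project every time step's output to class logits.
logits = tf.layers.dense(outputs, num_classes)  # [batch, time, classes]

# Step 4: naive loss -- note this averages over padded steps too;
# see the masking caveat below.
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels, logits=logits))
train_op = tf.train.AdamOptimizer().minimize(loss)  # default settings first
```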

And some caveats:

  1. With variable-length sequences, you will need to build masks for calculating your error metrics. It's all in the second Gist above, but don't forget them when you write your own metrics! I ran into this a couple of times, and it made my networks look like they were doing much worse on variable-length sequences than they actually were. (See the masking sketch after these caveats.)
  2. You might want to add a regularization term to your loss function. I had some convergence issues without this.
  3. I recommend starting with tf.train.AdamOptimizer at its default settings. Depending on your data, it may not converge, and you will need to adjust the settings. This article does a good job of explaining what the different knobs do; start reading from the beginning, since some of the knobs are explained before the Adam section.
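For caveat 1, here is a sketch of the masking, continuing from the snippet above (same seq_len, logits, and labels; still TF 1.x-style):

```python
import tensorflow as tf  # TF 1.x-style API

# mask[b, t] is 1.0 for real time steps and 0.0 for padding.
max_time = tf.shape(logits)[1]
mask = tf.sequence_mask(seq_len, max_time, dtype=tf.float32)  # [batch, time]

# Average the per-step loss over real steps only.
step_loss = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels,
                                                       logits=logits)
masked_loss = tf.reduce_sum(step_loss * mask) / tf.reduce_sum(mask)

# Same trick for accuracy: padded steps don't count.
correct = tf.cast(tf.equal(tf.argmax(logits, axis=-1),
                           tf.argmax(labels, axis=-1)), tf.float32)
accuracy = tf.reduce_sum(correct * mask) / tf.reduce_sum(mask)
```

Without the mask, the padded steps drag the metrics down and make variable-length sequences look worse than they really are.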

Hopefully these links are helpful to others in the future!

Engineero answered Sep 21 '22