
TensorFlow RNNs for named entity recognition

I'm trying to work out what's the best model to adapt for an open named entity recognition problem (biology/chemistry, so no dictionary of entities exists but they have to be identified by context).

Currently my best guess is to adapt Syntaxnet so that instead of tagging words as N, V, ADJ, etc., it learns to tag them as BEGINNING, INSIDE, or OUTSIDE (IOB notation).
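To make the IOB scheme concrete, here is a minimal sketch of how entity annotations could be turned into one tag per token. The helper name `spans_to_iob`, the example sentence, and the entity spans are all made up for illustration; a real corpus would supply its own tokenization and annotations.

```python
def spans_to_iob(tokens, entity_spans):
    """Tag each token as B (begins an entity), I (inside one), or O (outside).

    entity_spans is a list of (start, end) token indices, with end exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end in entity_spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


tokens = ["Sodium", "chloride", "dissolves", "in", "water"]
# Hypothetical annotation: tokens 0-1 form one entity, token 4 is another.
tags = spans_to_iob(tokens, [(0, 2), (4, 5)])
print(list(zip(tokens, tags)))
```

The output has exactly one tag per input token, which is why this is a sequence labelling problem over two aligned sequences rather than a translation-style one.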

However, I am not sure which of these approaches is best:

  • Syntaxnet
  • word2vec
  • seq2seq (I think this is not the right one as I need it to learn on two aligned sequences, whereas seq2seq is designed for sequences of differing lengths as in translation)

I would be grateful for a pointer to the right method. Thanks!

Tom asked Feb 18 '17 18:02


1 Answer

Syntaxnet can be used for named entity recognition; see, e.g., Named Entity Recognition with Syntaxnet

word2vec alone isn't very effective for named entity recognition: it only produces word embeddings, which are more useful as input features to a sequence tagger. I don't think seq2seq is commonly used for that task either.

As drpng mentions, you may want to look at tensorflow/tree/master/tensorflow/contrib/crf. Adding an LSTM before the CRF layer would help a bit, which gives something like:

[Image: diagram of an LSTM layer feeding a CRF layer]

LSTM+CRF code in TensorFlow: https://github.com/Franck-Dernoncourt/NeuroNER
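To illustrate what the CRF layer contributes on top of per-token scores, here is a minimal pure-Python sketch of Viterbi decoding, the standard way to pick the best tag sequence from a linear-chain CRF. It is not taken from NeuroNER or from the TensorFlow CRF module. The emission scores stand in for LSTM outputs and the transition scores are set by hand (in a trained model they would be learned); all numbers are made up.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at position t (e.g. produced by an LSTM).
    transitions[i][j]: score of moving from tag i to tag j (the CRF part).
    """
    n_tags = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        step_scores, step_back = [], []
        for j in range(n_tags):
            # Best previous tag for ending in tag j at this position.
            best_prev = max(range(n_tags),
                            key=lambda i: scores[i] + transitions[i][j])
            step_back.append(best_prev)
            step_scores.append(scores[best_prev] + transitions[best_prev][j] + emit[j])
        scores = step_scores
        backpointers.append(step_back)
    # Follow the backpointers from the best final tag.
    best = max(range(n_tags), key=lambda j: scores[j])
    path = [best]
    for step_back in reversed(backpointers):
        best = step_back[best]
        path.append(best)
    path.reverse()
    return path


TAGS = ["B", "I", "O"]
FORBIDDEN = -1e4
# Hand-set transition scores (assumption): only O -> I is penalised,
# because an entity cannot continue without first beginning.
transitions = [
    [0.0, 0.0, 0.0],        # from B to B, I, O
    [0.0, 0.0, 0.0],        # from I to B, I, O
    [0.0, FORBIDDEN, 0.0],  # from O to B, I, O
]
# Made-up per-token scores for a three-token sentence.
emissions = [
    [0.1, 0.0, 1.0],
    [0.2, 1.0, 0.0],
    [0.0, 0.1, 1.0],
]

greedy = [TAGS[max(range(3), key=lambda j: row[j])] for row in emissions]
decoded = [TAGS[j] for j in viterbi_decode(emissions, transitions)]
print("per-token argmax:", greedy)   # ['O', 'I', 'O'], an invalid IOB sequence
print("CRF decoding:    ", decoded)  # ['O', 'B', 'O']
```

Picking the best tag for each token independently yields an I directly after an O, whereas decoding with transition scores repairs it to a valid sequence. This is the kind of consistency a CRF layer adds to an LSTM tagger.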

Franck Dernoncourt answered Oct 20 '22 11:10