
Implementing Bi-directional LSTM-CRF Network

Tags: python, lstm, crf

I need to implement and train a bidirectional LSTM network with a CRF layer at the end, specifically the model presented in this paper:

http://www.aclweb.org/anthology/P15-1109

I would prefer to implement it in Python. Can anyone suggest libraries or sample code showing how this can be done? I looked at PyBrain but couldn't really understand it.

I'm also open to toolkits in other programming languages.

Samik asked Oct 12 '15 10:10



2 Answers

Here is an implementation of a bi-directional LSTM + CRF Network in TensorFlow: https://github.com/Franck-Dernoncourt/NeuroNER (works on Linux/Mac/Windows).

It gives state-of-the-art results on named-entity recognition datasets.

ANN architecture (it also uses character embeddings): [image not available]

As viewed in TensorBoard: [image not available]

You can also visualize the word embeddings: [image not available]
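To give a feel for what the CRF layer on top of the bi-directional LSTM does at prediction time, here is a minimal pure-Python sketch of Viterbi decoding. It is not taken from NeuroNER: the function name `viterbi_decode` and the toy numbers are made up for illustration, and the per-token tag scores that a trained BiLSTM would normally produce are simply hard-coded as `emissions`.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_score, best_path) for one sentence.

    emissions[t][j]   : score of tag j at position t (what the BiLSTM would output)
    transitions[i][j] : score of moving from tag i to tag j (learned by the CRF layer)
    """
    n_tags = len(transitions)
    # Best score of any path ending in each tag at position 0.
    scores = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at position t.
            best_i = max(range(n_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag back to the start.
    last = max(range(n_tags), key=lambda j: scores[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


# Toy example: 3 tokens, 2 tags (0 = O, 1 = ENTITY); all numbers are made up.
emissions = [[2.0, 0.5], [0.4, 0.6], [1.0, 1.2]]
transitions = [[0.5, -1.0], [-0.5, 1.0]]
best_score, best_path = viterbi_decode(emissions, transitions)
print(best_path)  # [0, 0, 0]
```

Picking the best tag per token independently would give `[0, 1, 1]` here. The transition scores penalize the O to ENTITY jump, so the jointly best sequence is `[0, 0, 0]`. That joint decision is what the CRF layer adds over a plain softmax on the LSTM outputs.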

Franck Dernoncourt answered Sep 28 '22 03:09


There's this implementation by Guillaume Lample, from the paper "Neural Architectures for Named Entity Recognition", that you can use as a starting point.
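If it helps to see the training objective before reading a full implementation, below is a rough pure-Python sketch of a CRF sentence-level loss of the kind that paper describes: the score of a tag sequence is the sum of its emission and transition scores, and the loss is the log of the summed exponentiated scores of all sequences minus the score of the gold sequence. This is a simplification written for illustration, not code from that implementation: it leaves out the start and end tags, and the function names and numbers are hypothetical.

```python
import math
from itertools import product


def path_score(emissions, transitions, tags):
    """Score of one tag sequence: emission scores plus transition scores."""
    score = sum(emissions[t][tag] for t, tag in enumerate(tags))
    score += sum(transitions[a][b] for a, b in zip(tags, tags[1:]))
    return score


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emissions, transitions):
    """Forward algorithm: log of the summed exp-scores of all tag sequences."""
    n_tags = len(transitions)
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][j] + logsumexp([alpha[i] + transitions[i][j] for i in range(n_tags)])
            for j in range(n_tags)
        ]
    return logsumexp(alpha)


def crf_nll(emissions, transitions, gold_tags):
    """Negative log-likelihood of the gold tag sequence (the training loss)."""
    return log_partition(emissions, transitions) - path_score(emissions, transitions, gold_tags)


# Toy check with made-up numbers: the forward algorithm should match brute force.
emissions = [[2.0, 0.5], [0.4, 0.6], [1.0, 1.2]]
transitions = [[0.5, -1.0], [-0.5, 1.0]]
brute_force = logsumexp([
    path_score(emissions, transitions, tags)
    for tags in product(range(2), repeat=len(emissions))
])
loss = crf_nll(emissions, transitions, [0, 0, 0])
```

In a real model the `emissions` come from the bi-directional LSTM, and a framework with automatic differentiation backpropagates this loss into both the LSTM weights and the transition matrix. The brute-force sum is only there as a sanity check, since it grows exponentially with sentence length while the forward algorithm does not.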

Jim Geovedi answered Sep 28 '22 04:09