 

keras BLSTM for sequence labeling

I'm relatively new to neural nets, so please excuse my ignorance. I'm trying to adapt the keras BLSTM example here. The example reads in texts and classifies them as 0 or 1. I want a BLSTM that does something very much like POS tagging; extras like lemmatizing or other advanced features are not necessary, I just want a basic model. My data is a list of sentences, and each word is given a category from 1 to 8. I want to train a BLSTM that can use this data to predict the category for each word in an unseen sentence.

e.g. input = ['The', 'dog', 'is', 'red'] gives output = [2, 4, 3, 7]
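
For reference, here is a toy sketch of the data layout I mean (the word ids below are made up purely for illustration; my real integer sequences come from prep_scan):

X = [[12, 3, 55, 4],    # 'The dog is red'
     [7, 65, 34]]       # another, shorter sentence
y = [[2, 4, 3, 7],      # one tag (1-8) per word, aligned by position
     [1, 5, 2]]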

If the keras example is not the best route, I'm open to other suggestions.

I currently have this:

'''Train a Bidirectional LSTM.'''

from __future__ import print_function
import numpy as np
from keras.preprocessing import sequence
from keras.models import Model
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, merge
from prep_nn import prep_scan


np.random.seed(1337)  # for reproducibility
max_features = 20000
batch_size = 16
maxlen = 18

print('Loading data...')
(X_train, y_train), (X_test, y_test) = prep_scan(nb_words=max_features,
                                                 test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

print("Pad sequences (samples x time)")
# type issues here? float/int?
X_train = sequence.pad_sequences(X_train, maxlen=maxlen, value=0.)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen, value=0.)  # pad with zeros to a fixed length

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

# need to pad y too, because there is more than 1 output value per sample (not a single class)
y_train = sequence.pad_sequences(np.array(y_train), maxlen=maxlen, value=0.)
y_test = sequence.pad_sequences(np.array(y_test), maxlen=maxlen, value=0.)

print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

# placeholder tensor for the input sequences
# (use a name that doesn't shadow the imported keras.preprocessing.sequence module)
sequence_input = Input(shape=(maxlen,), dtype='int32')

# this embedding layer will transform the sequences of integers
# into vectors of size 128
embedded = Embedding(max_features, 128, input_length=maxlen)(sequence_input)

# apply forwards LSTM
forwards = LSTM(64)(embedded)
# apply backwards LSTM
backwards = LSTM(64, go_backwards=True)(embedded)

# concatenate the outputs of the 2 LSTMs
merged = merge([forwards, backwards], mode='concat', concat_axis=-1)
after_dp = Dropout(0.5)(merged)
# the number of Dense units has to correspond to the output dimension?
output = Dense(17, activation='sigmoid')(after_dp)

model = Model(input=sequence_input, output=output)

# try using different optimizers and different optimizer configs
model.compile('adam', 'categorical_crossentropy', metrics=['accuracy'])

print('Train...')
model.fit(X_train, y_train,
          batch_size=batch_size,
          nb_epoch=4,
          validation_data=[X_test, y_test])

X_test_new = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 3, 55, 4, 34, 5, 45, 3, 9],
                       [0, 0, 0, 0, 0, 0, 0, 1, 7, 65, 34, 67, 34, 23, 24, 67, 54, 43]])

classes = model.predict(X_test_new, batch_size=16)
print(classes)

My output has the right dimensions, but it is giving me floats between 0 and 1. I think this is because it's still looking for a binary classification. Does anyone know how to fix this?

SOLVED

Just make sure the labels are each binary (one-hot) arrays:

# Keras 1.x imports; prep_model and the hyperparameters used below
# (nb_words, hidden, nb_classes, batch_size, epochs, val_split) are
# defined elsewhere in my project
from keras.models import Model
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, merge, TimeDistributed

(X_train, y_train), (X_test, y_test), maxlen, word_ids, tags_ids = prep_model(
    nb_words=nb_words, test_len=75)

# sample weights: give padded positions (label 0) zero weight in the loss
W = (y_train > 0).astype('float')

print(len(X_train), 'train sequences')
print(int(len(X_train)*val_split), 'validation sequences')
print(len(X_test), 'heldout sequences')

# this is the placeholder tensor for the input sequences
sequence = Input(shape=(maxlen,), dtype='int32')

# this embedding layer will transform the sequences of integers
# into vectors of size hidden (256 in my case)
embedded = Embedding(nb_words, output_dim=hidden,
                     input_length=maxlen, mask_zero=True)(sequence)

# apply forwards LSTM
forwards = LSTM(output_dim=hidden, return_sequences=True)(embedded)
# apply backwards LSTM
backwards = LSTM(output_dim=hidden, return_sequences=True,
                 go_backwards=True)(embedded)

# concatenate the outputs of the 2 LSTMs
merged = merge([forwards, backwards], mode='concat', concat_axis=-1)
after_dp = Dropout(0.15)(merged)

# TimeDistributed for sequence
# change activation to sigmoid?
output = TimeDistributed(
    Dense(output_dim=nb_classes,
          activation='softmax'))(after_dp)

model = Model(input=sequence, output=output)

# try using different optimizers and different optimizer configs
# loss=binary_crossentropy, optimizer=rmsprop
model.compile(loss='categorical_crossentropy',
              metrics=['accuracy'], optimizer='adam',
              sample_weight_mode='temporal')

print('Train...')
model.fit(X_train, y_train,
          batch_size=batch_size,
          nb_epoch=epochs,
          shuffle=True,
          validation_split=val_split,
          sample_weight=W)
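
For clarity, this is roughly how a padded integer tag matrix can be turned into the per-timestep binary (one-hot) arrays that the TimeDistributed softmax expects (a minimal sketch; y_train_int and y_test_int are hypothetical names for the padded integer label matrices, and nb_classes counts the tag categories plus the padding label 0):

import numpy as np

# (nb_samples, maxlen) integer tags -> (nb_samples, maxlen, nb_classes) one-hot vectors
y_train = np.eye(nb_classes)[y_train_int]
y_test = np.eye(nb_classes)[y_test_int]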
ChrisDH asked May 18 '16


1 Answer

Solved. The main issue was reshaping the labels for the classification categories into binary (one-hot) arrays. I also wrapped the output Dense layer in TimeDistributed and set return_sequences=True on both LSTMs.
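
One way to turn the per-timestep softmax output back into integer tags (instead of the raw floats between 0 and 1) is to take the argmax over the class axis, e.g. with the X_test_new batch from the question:

probs = model.predict(X_test_new, batch_size=16)  # shape (nb_samples, maxlen, nb_classes)
tags = probs.argmax(axis=-1)                      # shape (nb_samples, maxlen), one integer tag per position
print(tags)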

ChrisDH answered Nov 16 '22