 

keras BLSTM for sequence labeling

I'm relatively new to neural nets, so please excuse my ignorance. I'm trying to adapt the keras BLSTM example here. The example reads in texts and classifies them as 0 or 1. I want a BLSTM that does something very much like POS tagging; extras like lemmatizing or other advanced features are not necessary, I just want a basic model. My data is a list of sentences, and each word is given a category from 1 to 8. I want to train a BLSTM that can use this data to predict the category for each word in an unseen sentence.

e.g. input = ['The', 'dog', 'is', 'red'] gives output = [2, 4, 3, 7]
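
For reference, here is a toy sketch of the data layout I mean (the word ids below are made up purely for illustration; my real integer sequences come from prep_scan):

X = [[12, 3, 55, 4],    # 'The dog is red'
     [7, 65, 34]]       # another, shorter sentence
y = [[2, 4, 3, 7],      # one tag (1-8) per word, aligned by position
     [1, 5, 2]]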

If the keras example is not the best route, I'm open to other suggestions.

I currently have this:

'''Train a Bidirectional LSTM.'''

from __future__ import print_function
import numpy as np
from keras.preprocessing import sequence
from keras.models import Model
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, merge
from prep_nn import prep_scan


np.random.seed(1337)  # for reproducibility
max_features = 20000
batch_size = 16
maxlen = 18

print('Loading data...')
(X_train, y_train), (X_test, y_test) = prep_scan(nb_words=max_features,
                                                 test_split=0.2)
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

print("Pad sequences (samples x time)")
# type issues here? float/int?
X_train = sequence.pad_sequences(X_train, maxlen=maxlen, value=0.)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen, value=0.)  # pad with zeros to a fixed length

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

# need to pad y too, because there is more than 1 output value per sample (not a single class)
y_train = sequence.pad_sequences(np.array(y_train), maxlen=maxlen, value=0.)
y_test = sequence.pad_sequences(np.array(y_test), maxlen=maxlen, value=0.)

print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

# placeholder tensor for the input sequences
# (use a name that doesn't shadow the imported keras.preprocessing.sequence module)
sequence_input = Input(shape=(maxlen,), dtype='int32')

# this embedding layer will transform the sequences of integers
# into vectors of size 128
embedded = Embedding(max_features, 128, input_length=maxlen)(sequence_input)

# apply forwards LSTM
forwards = LSTM(64)(embedded)
# apply backwards LSTM
backwards = LSTM(64, go_backwards=True)(embedded)

# concatenate the outputs of the 2 LSTMs
merged = merge([forwards, backwards], mode='concat', concat_axis=-1)
after_dp = Dropout(0.5)(merged)
# the number of Dense units has to correspond to the output dimension?
output = Dense(17, activation='sigmoid')(after_dp)

model = Model(input=sequence_input, output=output)

# try using different optimizers and different optimizer configs
model.compile('adam', 'categorical_crossentropy', metrics=['accuracy'])

print('Train...')
model.fit(X_train, y_train,
          batch_size=batch_size,
          nb_epoch=4,
          validation_data=[X_test, y_test])

X_test_new = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 3, 55, 4, 34, 5, 45, 3, 9],
                       [0, 0, 0, 0, 0, 0, 0, 1, 7, 65, 34, 67, 34, 23, 24, 67, 54, 43]])

classes = model.predict(X_test_new, batch_size=16)
print(classes)

My output has the right dimensions, but it is giving me floats between 0 and 1. I think this is because it's still looking for a binary classification. Does anyone know how to fix this?

SOLVED

Just make sure the labels are each binary (one-hot) arrays:

# Keras 1.x imports; prep_model and the hyperparameters used below
# (nb_words, hidden, nb_classes, batch_size, epochs, val_split) are
# defined elsewhere in my project
from keras.models import Model
from keras.layers import Dense, Dropout, Embedding, LSTM, Input, merge, TimeDistributed

(X_train, y_train), (X_test, y_test), maxlen, word_ids, tags_ids = prep_model(
    nb_words=nb_words, test_len=75)

# sample weights: give padded positions (label 0) zero weight in the loss
W = (y_train > 0).astype('float')

print(len(X_train), 'train sequences')
print(int(len(X_train)*val_split), 'validation sequences')
print(len(X_test), 'heldout sequences')

# this is the placeholder tensor for the input sequences
sequence = Input(shape=(maxlen,), dtype='int32')

# this embedding layer will transform the sequences of integers
# into vectors of size hidden (256 in my case)
embedded = Embedding(nb_words, output_dim=hidden,
                     input_length=maxlen, mask_zero=True)(sequence)

# apply forwards LSTM
forwards = LSTM(output_dim=hidden, return_sequences=True)(embedded)
# apply backwards LSTM
backwards = LSTM(output_dim=hidden, return_sequences=True,
                 go_backwards=True)(embedded)

# concatenate the outputs of the 2 LSTMs
merged = merge([forwards, backwards], mode='concat', concat_axis=-1)
after_dp = Dropout(0.15)(merged)

# TimeDistributed for sequence
# change activation to sigmoid?
output = TimeDistributed(
    Dense(output_dim=nb_classes,
          activation='softmax'))(after_dp)

model = Model(input=sequence, output=output)

# try using different optimizers and different optimizer configs
# loss=binary_crossentropy, optimizer=rmsprop
model.compile(loss='categorical_crossentropy',
              metrics=['accuracy'], optimizer='adam',
              sample_weight_mode='temporal')

print('Train...')
model.fit(X_train, y_train,
          batch_size=batch_size,
          nb_epoch=epochs,
          shuffle=True,
          validation_split=val_split,
          sample_weight=W)
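
For clarity, this is roughly how a padded integer tag matrix can be turned into the per-timestep binary (one-hot) arrays that the TimeDistributed softmax expects (a minimal sketch; y_train_int and y_test_int are hypothetical names for the padded integer label matrices, and nb_classes counts the tag categories plus the padding label 0):

import numpy as np

# (nb_samples, maxlen) integer tags -> (nb_samples, maxlen, nb_classes) one-hot vectors
y_train = np.eye(nb_classes)[y_train_int]
y_test = np.eye(nb_classes)[y_test_int]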
ChrisDH asked May 18 '16


1 Answer

Solved. The main issue was reshaping the labels for the classification categories into binary (one-hot) arrays. I also wrapped the output Dense layer in TimeDistributed and set return_sequences=True on both LSTMs.
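
One way to turn the per-timestep softmax output back into integer tags (instead of the raw floats between 0 and 1) is to take the argmax over the class axis, e.g. with the X_test_new batch from the question:

probs = model.predict(X_test_new, batch_size=16)  # shape (nb_samples, maxlen, nb_classes)
tags = probs.argmax(axis=-1)                      # shape (nb_samples, maxlen), one integer tag per position
print(tags)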

ChrisDH answered Nov 16 '22