I'm facing the following issue. I have a large number of documents that I want to encode using a bidirectional LSTM. Each document has a different number of words, and each word can be thought of as a timestep.
When configuring the bidirectional LSTM we are expected to provide the timeseries length.
When I am training the model, this value will be different for each batch.
Should I set timeseries_size to the largest document length I will allow, so that any document longer than that is simply not encoded?
Example config:
Bidirectional(LSTM(128, return_sequences=True), input_shape=(timeseries_size, encoding_size))
Note that because you are using the Bidirectional wrapper, the outputs of the forward and backward passes are concatenated, so with LSTM(128, return_sequences=True) the last dimension of the output is 128 + 128 = 256, i.e. the output shape is (None, None, 256).
The input of an LSTM is always a 3D array of shape (batch_size, timesteps, features). The output of the LSTM can be a 2D or a 3D array depending on the return_sequences argument.
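For concreteness, here is a minimal shape sketch (the timestep count of 50 and vector size of 300 are placeholders, not values from the question):

from tensorflow.keras.layers import Input, LSTM, Bidirectional

encoding_size = 300                                           # placeholder word-vector size
inp = Input(shape=(50, encoding_size))                        # (batch_size, timesteps, features)
seq = Bidirectional(LSTM(128, return_sequences=True))(inp)    # shape (None, 50, 256): one 256-d vector per timestep
vec = Bidirectional(LSTM(128, return_sequences=False))(inp)   # shape (None, 256): a single vector per document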
This is a well-known problem and it concerns both ordinary and bidirectional RNNs. This discussion on GitHub might help you. In essence, here are the most common options:
A simple solution is to set timeseries_size to the maximum length over the training set and pad the shorter sequences with zeros (a minimal Keras sketch is given after the options below). An obvious downside is wasted memory if the training set happens to contain both very long and very short inputs.
Separate input samples into buckets of different lengths, e.g. a bucket for length <= 16, another bucket for length <= 32, etc. Basically this means training several separate LSTMs for different sets of sentences. This approach (known as bucketing) requires more effort, but it is currently considered the most efficient and is actually used in the state-of-the-art translation engine Tensorflow Neural Machine Translation. A rough sketch of the bucketing idea is also given below.