I am trying to adapt the lstm_seq2seq.py example of Keras into a bidirectional LSTM model.
https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py
I tried different approaches.
The first one was to directly apply the Bidirectional wrapper to the LSTM layer:
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = Bidirectional(LSTM(latent_dim, return_state=True))
but I got this error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-76-a80f8554ab09> in <module>()
75 encoder = Bidirectional(LSTM(latent_dim, return_state=True))
76
---> 77 encoder_outputs, state_h, state_c = encoder(encoder_inputs)
78 # We discard `encoder_outputs` and only keep the states.
79 encoder_states = [state_h, state_c]
/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
601
602 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 603 output = self.call(inputs, **kwargs)
604 output_mask = self.compute_mask(inputs, previous_mask)
605
/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/layers/wrappers.py in call(self, inputs, training, mask)
293 y_rev = K.reverse(y_rev, 1)
294 if self.merge_mode == 'concat':
--> 295 output = K.concatenate([y, y_rev])
296 elif self.merge_mode == 'sum':
297 output = y + y_rev
/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in concatenate(tensors, axis)
1757 """
1758 if axis < 0:
-> 1759 rank = ndim(tensors[0])
1760 if rank:
1761 axis %= rank
/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in ndim(x)
597 ```
598 """
--> 599 dims = x.get_shape()._dims
600 if dims is not None:
601 return len(dims)
AttributeError: 'list' object has no attribute 'get_shape'
My second guess was to change the input into integer sequences, like in https://github.com/keras-team/keras/blob/master/examples/imdb_bidirectional_lstm.py:
import numpy as np
from keras.preprocessing import sequence

# Build ragged integer sequences, one per sample
encoder_input_data = np.empty(len(input_texts), dtype=object)
decoder_input_data = np.empty(len(input_texts), dtype=object)
decoder_target_data = np.empty(len(input_texts), dtype=object)
for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
    encoder_input_data[i] = [input_token_index[char] for char in input_text]
    tseq = [target_token_index[char] for char in target_text]
    decoder_input_data[i] = tseq
    # Target is the decoder input shifted by one timestep
    decoder_target_data[i] = tseq[1:]
# Pad every sequence to a fixed length
encoder_input_data = sequence.pad_sequences(encoder_input_data, maxlen=max_encoder_seq_length)
decoder_input_data = sequence.pad_sequences(decoder_input_data, maxlen=max_decoder_seq_length)
decoder_target_data = sequence.pad_sequences(decoder_target_data, maxlen=max_decoder_seq_length)
but I got the same AttributeError: 'list' object has no attribute 'get_shape' traceback as above.
Any help? Thanks.
(The code: https://gist.github.com/anonymous/c0fd6541ab4fc9c2c1e0b86175fb65c7 )
The error you're seeing is because the Bidirectional wrapper does not handle the state tensors properly. I've fixed it in this PR, and the fix is already in the 2.1.3 release. So the lines in the question should work if you upgrade your Keras to the latest version.
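To confirm that you're on a version containing the fix, you can check the installed version (it needs to be 2.1.3 or later):

import keras
print(keras.__version__)  # should print 2.1.3 or later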
Note that the value returned by Bidirectional(LSTM(..., return_state=True)) is a list containing:

- the (merged) outputs,
- (h, c) of the forward layer,
- (h, c) of the backward layer.

So you may need to merge the state tensors before passing them to the decoder (which is usually unidirectional, I suppose). For example, if you choose to concatenate the states:
from keras.layers import Input, LSTM, Bidirectional, Concatenate

encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = Bidirectional(LSTM(latent_dim, return_state=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder(encoder_inputs)

# Concatenate the forward and backward states, so each state has size 2 * latent_dim
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
encoder_states = [state_h, state_c]

# The decoder is unidirectional, so its units must match the doubled state size
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim * 2, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
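For the inference part of the example, the sampling models need the same width adjustment, since the concatenated states are now 2 * latent_dim wide. A minimal sketch, reusing the variables defined above and assuming decoder_dense is the softmax Dense layer from the original example:

from keras.models import Model
from keras.layers import Input

# Encoder model: maps an input sequence to the concatenated states
encoder_model = Model(encoder_inputs, encoder_states)

# Decoder state inputs must match the doubled state size
decoder_state_input_h = Input(shape=(latent_dim * 2,))
decoder_state_input_c = Input(shape=(latent_dim * 2,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)  # decoder_dense: from the original example
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)

Apart from the state shapes, this is the same as the inference section of lstm_seq2seq.py.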