
Keras bidirectional LSTM seq2seq

Tags: python, keras, lstm

I am trying to adapt the lstm_seq2seq.py example of Keras into a bidirectional LSTM model.

https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py

I tried two different approaches:

  • the first one was to apply the Bidirectional wrapper directly to the LSTM layer:

    encoder_inputs = Input(shape=(None, num_encoder_tokens))
    encoder = Bidirectional(LSTM(latent_dim, return_state=True))
    

but I got this error message:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-76-a80f8554ab09> in <module>()
     75 encoder = Bidirectional(LSTM(latent_dim, return_state=True))
     76 
---> 77 encoder_outputs, state_h, state_c = encoder(encoder_inputs)
     78 # We discard `encoder_outputs` and only keep the states.
     79 encoder_states = [state_h, state_c]

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
    601 
    602             # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 603             output = self.call(inputs, **kwargs)
    604             output_mask = self.compute_mask(inputs, previous_mask)
    605 

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/layers/wrappers.py in call(self, inputs, training, mask)
    293             y_rev = K.reverse(y_rev, 1)
    294         if self.merge_mode == 'concat':
--> 295             output = K.concatenate([y, y_rev])
    296         elif self.merge_mode == 'sum':
    297             output = y + y_rev

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in concatenate(tensors, axis)
   1757     """
   1758     if axis < 0:
-> 1759         rank = ndim(tensors[0])
   1760         if rank:
   1761             axis %= rank

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in ndim(x)
    597     ```
    598     """
--> 599     dims = x.get_shape()._dims
    600     if dims is not None:
    601         return len(dims)

AttributeError: 'list' object has no attribute 'get_shape'
  • my second guess was to change the input to padded integer sequences, as in https://github.com/keras-team/keras/blob/master/examples/imdb_bidirectional_lstm.py :

    import numpy as np
    from keras.preprocessing import sequence

    # Store each text as a list of token indices instead of a one-hot array.
    encoder_input_data = np.empty(len(input_texts), dtype=object)
    decoder_input_data = np.empty(len(input_texts), dtype=object)
    decoder_target_data = np.empty(len(input_texts), dtype=object)

    for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
        encoder_input_data[i] = [input_token_index[char] for char in input_text]
        tseq = [target_token_index[char] for char in target_text]
        decoder_input_data[i] = tseq
        # The target is the decoder input shifted by one timestep.
        decoder_target_data[i] = tseq[1:]

    # Pad all sequences to fixed lengths.
    encoder_input_data = sequence.pad_sequences(encoder_input_data, maxlen=max_encoder_seq_length)
    decoder_input_data = sequence.pad_sequences(decoder_input_data, maxlen=max_decoder_seq_length)
    decoder_target_data = sequence.pad_sequences(decoder_target_data, maxlen=max_decoder_seq_length)
    

but I got the same error message:

AttributeError: 'list' object has no attribute 'get_shape'

Any help would be appreciated. Thanks!

(The code: https://gist.github.com/anonymous/c0fd6541ab4fc9c2c1e0b86175fb65c7 )

JJ E. D. asked Dec 21 '17

1 Answer

The error you're seeing is because the Bidirectional wrapper does not handle the state tensors properly. I've fixed it in this PR, and the fix is included in the 2.1.3 release, so the lines in the question should work once you upgrade your Keras to the latest version.

Note that the returned value from Bidirectional(LSTM(..., return_state=True)) is a list containing:

  1. Layer output
  2. States (h, c) of the forward layer
  3. States (h, c) of the backward layer
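
A quick way to verify this structure (a minimal sketch; `latent_dim` and `num_encoder_tokens` as in the example):

from keras import backend as K
from keras.layers import Input, LSTM, Bidirectional

x = Input(shape=(None, num_encoder_tokens))
outs = Bidirectional(LSTM(latent_dim, return_state=True))(x)
print(len(outs))             # 5: output, forward h, forward c, backward h, backward c
print(K.int_shape(outs[0]))  # (None, latent_dim * 2) -- default merge_mode='concat'
print(K.int_shape(outs[1]))  # (None, latent_dim)     -- forward state_h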

So you may need to merge the state tensors before passing them to the decoder (which is usually unidirectional, I suppose). For example, if you choose to concatenate the states,

from keras.layers import Input, LSTM, Bidirectional, Concatenate

encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = Bidirectional(LSTM(latent_dim, return_state=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder(encoder_inputs)

# Concatenate the forward and backward states so the (unidirectional)
# decoder can be initialized with them.
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
encoder_states = [state_h, state_c]

# The decoder's state size must match the concatenated encoder states.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim * 2, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
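
For inference, the sampling models from lstm_seq2seq.py then need one matching change: the decoder's state inputs are now latent_dim * 2 wide. A minimal sketch, continuing the snippet above and assuming the example's decoder_dense softmax layer:

from keras.models import Model

# Inference encoder: maps an input sequence to the concatenated states.
encoder_model = Model(encoder_inputs, encoder_states)

# Inference decoder: note the doubled state size, since the forward and
# backward encoder states were concatenated.
decoder_state_input_h = Input(shape=(latent_dim * 2,))
decoder_state_input_c = Input(shape=(latent_dim * 2,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)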

Yu-Yang answered Oct 09 '22