How to model Convolutional recurrent network ( CRNN ) in Keras

Tags:

keras

lstm

pytorch

recurrent-neural-network

I am trying to port a CRNN model to Keras, but I got stuck connecting the output of the Conv2D layers to the LSTM layer.

The output of the CNN has shape (batch_size, 512, 1, width_dash), where the first dimension depends on the batch size and the last depends on the input width (the model accepts variable-width input).

For example, an input of shape [2, 1, 32, 829] produces an output of shape (2, 512, 1, 208).

In the PyTorch model, this is followed by squeeze(2) and then permute(2, 0, 1), which yields a tensor of shape [208, 2, 512].

I tried to implement this in Keras but could not, because in Keras we cannot alter the batch_size dimension in a keras.models.Sequential model.

Can someone please guide me on how to port this part of the model to Keras?

Current state of ported CNN layer

asked Jan 20 '18 13:01

harish2704

1 Answer

You don't need to permute the batch axis in Keras. In a PyTorch model you need to do it because a PyTorch LSTM expects, by default, an input of shape (seq_len, batch, input_size). In Keras, however, the LSTM layer expects (batch, seq_len, input_size).
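To make the axis bookkeeping concrete, here is a minimal sketch in plain Python that tracks only shape tuples, using the shapes from the question. The helpers squeeze and permute are hypothetical stand-ins written for this illustration, not the PyTorch or Keras functions themselves.

```python
def squeeze(shape, axis):
    """Drop a size-1 axis from a shape tuple (hypothetical helper)."""
    if shape[axis] != 1:
        raise ValueError("can only squeeze an axis of size 1")
    return shape[:axis] + shape[axis + 1:]


def permute(shape, order):
    """Reorder the axes of a shape tuple (hypothetical helper)."""
    return tuple(shape[i] for i in order)


# CNN output from the question: (batch, channels, height, width)
cnn_out = (2, 512, 1, 208)
squeezed = squeeze(cnn_out, 2)                   # (batch, channels, width)

# PyTorch order: (seq_len, batch, input_size)
pytorch_lstm_in = permute(squeezed, (2, 0, 1))

# Keras order: (batch, seq_len, input_size); the batch axis stays in place
keras_lstm_in = permute(squeezed, (0, 2, 1))

print(pytorch_lstm_in)  # (208, 2, 512)
print(keras_lstm_in)    # (2, 208, 512)
```

Note that the Keras Permute layer numbers axes from 1 and leaves out the batch axis, so Permute((2, 1)) corresponds to the full-tensor order (0, 2, 1) used here.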

So after defining the CNN and squeezing out axis 2, you just need to permute the last two axes. As a simple example (in the 'channels_first' Keras image format):

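from keras.models import Sequential
from keras.layers import Conv2D, Reshape, Permute, LSTM

# Assumes image_data_format is 'channels_first': input is (channels, height, width).
model = Sequential()
model.add(Conv2D(512, 3, strides=(32, 4), padding='same', input_shape=(1, 32, None)))
model.add(Reshape((512, -1)))  # drop the height axis of size 1 -> (512, width)
model.add(Permute((2, 1)))     # swap to (width, 512), i.e. (seq_len, input_size)
model.add(LSTM(32))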

You can verify the shapes with model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_4 (Conv2D)            (None, 512, 1, None)      5120
_________________________________________________________________
reshape_3 (Reshape)          (None, 512, None)         0
_________________________________________________________________
permute_4 (Permute)          (None, None, 512)         0
_________________________________________________________________
lstm_3 (LSTM)                (None, 32)                69760
=================================================================
Total params: 74,880
Trainable params: 74,880
Non-trainable params: 0
_________________________________________________________________
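The numbers in this summary can be checked by hand. The sketch below is only an illustration: the function names are hypothetical, and it assumes the standard parameter-count formulas for a biased convolution and a Keras LSTM with a single bias vector per gate, plus the ceil(size / stride) output-size rule for padding='same'.

```python
import math


def conv2d_params(filters, kernel_h, kernel_w, in_channels):
    """Weights plus one bias per filter (hypothetical helper)."""
    return filters * (kernel_h * kernel_w * in_channels + 1)


def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)


def same_padding_out(size, stride):
    """Output length along one axis with padding='same'."""
    return math.ceil(size / stride)


conv = conv2d_params(512, 3, 3, 1)
lstm = lstm_params(32, 512)
print(conv, lstm, conv + lstm)    # 5120 69760 74880

# Input from the question: height 32, width 829, strides (32, 4)
out_h = same_padding_out(32, 32)
out_w = same_padding_out(829, 4)
print(out_h, out_w)               # 1 208
```

With these strides the toy Conv2D layer reproduces the (512, 1, 208) feature map from the question for a width of 829, which is why it serves as a stand-in for the full CNN.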

answered Oct 07 '22 07:10

Yu-Yang
