
How to model a convolutional recurrent network (CRNN) in Keras

I am trying to port a CRNN model to Keras, but I got stuck connecting the output of the Conv2D layer to the LSTM layer.

The output of the CNN layers has shape (batch_size, 512, 1, width_dash), where the first dimension depends on the batch size and the last depends on the width of the input (this model accepts variable-width input).

For example, an input of shape [2, 1, 32, 829] produces an output of shape (2, 512, 1, 208).

In the PyTorch model, this is followed by squeeze(2) and then permute(2, 0, 1), which results in a tensor of shape [208, 2, 512].

I tried to implement this in Keras but could not, because in Keras we cannot alter the batch_size dimension in a keras.models.Sequential model.

Can someone please guide me on how to port this part of the model to Keras?

Current state of ported CNN layer

harish2704 asked Jan 20 '18 13:01



1 Answer

You don't need to permute the batch axis in Keras. In the PyTorch model you need to do it because a PyTorch LSTM expects, by default, an input of shape (seq_len, batch, input_size). In Keras, however, the LSTM layer expects (batch, seq_len, input_size).
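To make the axis bookkeeping concrete, here is a shape-only sketch in plain Python, using the shapes from the question. The helpers `squeeze` and `permute` are hypothetical stand-ins that operate on shape tuples, not framework calls; they only illustrate where each axis ends up.

```python
def squeeze(shape, axis):
    """Drop a size-1 axis from a shape tuple (hypothetical helper)."""
    if shape[axis] != 1:
        raise ValueError("can only squeeze an axis of size 1")
    return shape[:axis] + shape[axis + 1:]


def permute(shape, order):
    """Reorder the axes of a shape tuple (hypothetical helper)."""
    return tuple(shape[i] for i in order)


# CNN output from the question: (batch, channels, height, width)
cnn_out = (2, 512, 1, 208)
squeezed = squeeze(cnn_out, 2)  # (2, 512, 208)

# PyTorch LSTM default layout: (seq_len, batch, input_size)
pytorch_input = permute(squeezed, (2, 0, 1))

# Keras LSTM layout: (batch, seq_len, input_size); the batch axis stays first
keras_input = permute(squeezed, (0, 2, 1))

print(pytorch_input)  # (208, 2, 512)
print(keras_input)    # (2, 208, 512)
```

Note that the Keras Permute layer numbers axes from 1 and leaves out the batch axis, so the (0, 2, 1) order in this sketch corresponds to Permute((2, 1)) in a Keras model.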

So, after defining the CNN and squeezing out axis 2, you just need to permute the last two axes. As a simple example (in the 'channels_first' Keras image format):

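from keras.models import Sequential
from keras.layers import Conv2D, Reshape, Permute, LSTM

model = Sequential()
# Input is (channels, height, width) with variable width
model.add(Conv2D(512, 3, strides=(32, 4), padding='same',
                 data_format='channels_first', input_shape=(1, 32, None)))
model.add(Reshape((512, -1)))  # squeeze out the height axis of size 1
model.add(Permute((2, 1)))     # (512, seq_len) -> (seq_len, 512)
model.add(LSTM(32))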

You can verify the shapes with model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_4 (Conv2D)            (None, 512, 1, None)      5120
_________________________________________________________________
reshape_3 (Reshape)          (None, 512, None)         0
_________________________________________________________________
permute_4 (Permute)          (None, None, 512)         0
_________________________________________________________________
lstm_3 (LSTM)                (None, 32)                69760
=================================================================
Total params: 74,880
Trainable params: 74,880
Non-trainable params: 0
_________________________________________________________________
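The numbers in this summary can be re-derived by hand. The following is a minimal sketch in plain Python, assuming the usual 'same'-padding rule (output length = ceil(input / stride)) and the standard Conv2D and LSTM parameter counts. The helper names are made up for illustration, and the width 829 is borrowed from the question's example.

```python
import math


def same_padding_output(size, stride):
    """Output length of a strided convolution with padding='same'."""
    return math.ceil(size / stride)


def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def lstm_params(input_size, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * ((input_size + units) * units + units)


height = same_padding_output(32, 32)  # height 32, stride 32
width = same_padding_output(829, 4)   # width 829, stride 4
conv = conv2d_params(3, 3, 1, 512)
lstm = lstm_params(512, 32)

print((512, height, width))     # (512, 1, 208)
print(conv, lstm, conv + lstm)  # 5120 69760 74880
```

For a 32 x 829 input, this toy Conv2D layer therefore yields a height of 1 and 208 time steps, and the parameter counts agree with the Param # column above.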
Yu-Yang answered Oct 07 '22 07:10