Keras/TF: Time Distributed CNN+LSTM for visual recognition


I am trying to implement the Model from the article (https://arxiv.org/abs/1411.4389) that basically consists of time-distributed CNNs followed by a sequence of LSTMs using Keras with TF.

However, I am having trouble figuring out whether I should apply the TimeDistributed wrapper only to my Convolutional & Pooling layers, or to the LSTMs as well.

Is there a way to run the CNN layers in parallel (based on the number of frames in the sequence that I want to process and on the number of cores that I have)?

And last, suppose that each entry is composed of "n" frames (in sequence), where n varies per data entry: what is the most suitable input dimension? Would "n" be the batch size? And is there a way to limit the number of CNNs running in parallel to, for example, 4 (so that an output Y is produced after 4 frames are processed)?

P.S.: The inputs are small videos (i.e. a sequence of frames)

P.P.S.: The output dimension is irrelevant to my question, so it is not discussed here.

Thank you

asked Jun 27 '17 by charbelfa


1 Answer

[Edited]
Sorry, a link-only answer was bad, so I'll answer the questions one by one.

should I apply the TimeDistributed wrapper only to my Convolutional & Pooling layers, or to the LSTMs as well?

Use the TimeDistributed wrapper only for the Conv and Pooling layers; it is not needed for the LSTMs, since an LSTM already consumes the whole time dimension itself.
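Conceptually, TimeDistributed folds the time axis into the batch axis, applies the wrapped layer to every frame independently, and then restores the time axis. A minimal NumPy sketch of that idea (shapes and the toy "layer" are illustrative, not Keras internals):

```python
import numpy as np

def time_distributed(layer_fn, x):
    # x: (batch, time, ...) -> apply layer_fn to every temporal slice
    batch, time = x.shape[:2]
    flat = x.reshape((batch * time,) + x.shape[2:])    # merge batch & time
    out = layer_fn(flat)                               # per-frame computation
    return out.reshape((batch, time) + out.shape[1:])  # split time back out

# toy "layer": global average pooling over the spatial dims of each frame
frames = np.random.rand(2, 4, 8, 8, 1)                 # (batch, time, h, w, c)
pooled = time_distributed(lambda f: f.mean(axis=(1, 2)), frames)
# pooled.shape == (2, 4, 1): one pooled value per frame, per video
```

Because the same `layer_fn` (i.e. the same weights) is applied to every frame, the CNN learns frame-level features while the LSTM that follows handles the temporal dependencies.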

Is there a way to run the CNN Layers in parallel?

Not if you use a CPU; it is possible if you utilize GPUs. See: Transparent Multi-GPU Training on TensorFlow with Keras

what is the best suitable input dimension?

Five dimensions: (batch, time, width, height, channels). The batch size is the number of videos per gradient update, not "n"; "n" is the time dimension.
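For concreteness, here is what such a 5D input tensor looks like (the sizes are hypothetical: 8 videos of 4 frames each, 64x64 grayscale):

```python
import numpy as np

batch, time, width, height, channels = 8, 4, 64, 64, 1
x = np.zeros((batch, time, width, height, channels), dtype=np.float32)
# x.shape == (8, 4, 64, 64, 1)
# The model's input_shape covers everything except the batch axis:
# input_shape=(time, width, height, channels) == (4, 64, 64, 1)
```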

Is there a way to limit the number of CNNs running in parallel to, for example, 4?

You can do this during preprocessing by manually aligning the frames to a fixed number, not inside the network. In other words, the "time" dimension should be 4 if you want an output after 4 frames are processed.
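One simple way to do that alignment is to truncate longer sequences and zero-pad shorter ones to the target length. A sketch (the helper name and the zero-padding strategy are my own choices; sampling frames evenly instead of truncating is another common option):

```python
import numpy as np

def align_frames(frames, target=4):
    """Truncate or zero-pad a (n, h, w, c) frame sequence to exactly `target` frames."""
    n = frames.shape[0]
    if n >= target:
        return frames[:target]                 # keep the first `target` frames
    pad = np.zeros((target - n,) + frames.shape[1:], dtype=frames.dtype)
    return np.concatenate([frames, pad], axis=0)  # pad the tail with blank frames

short = np.ones((2, 64, 64, 1))   # a video with only 2 frames
long_ = np.ones((7, 64, 64, 1))   # a video with 7 frames
# both become (4, 64, 64, 1), so they can be stacked into one batch
```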

from keras.models import Sequential
from keras.layers import (Conv2D, Dense, Dropout, Flatten, LSTM,
                          MaxPooling2D, TimeDistributed)

model = Sequential()

model.add(
    TimeDistributed(
        Conv2D(64, (3, 3), activation='relu'), 
        input_shape=(data.num_frames, data.width, data.height, 1)
    )
)
model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(1, 1))))

model.add(TimeDistributed(Conv2D(128, (4,4), activation='relu')))
model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))

model.add(TimeDistributed(Conv2D(256, (4,4), activation='relu')))
model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))

# extract features and dropout 
model.add(TimeDistributed(Flatten()))
model.add(Dropout(0.5))

# input to LSTM
model.add(LSTM(256, return_sequences=False, dropout=0.5))

# classifier with sigmoid activation for multilabel
model.add(Dense(data.num_classes, activation='sigmoid'))

Reference:
PRI-MATRIX FACTORIZATION - BENCHMARK

answered Oct 24 '22 by teru