Tensorflow dynamic RNN (LSTM): how to format input?

I have been given some data of this format and the following details:

person1, day1, feature1, feature2, ..., featureN, label
person1, day2, feature1, feature2, ..., featureN, label
...
person1, dayN, feature1, feature2, ..., featureN, label
person2, day1, feature1, feature2, ..., featureN, label
person2, day2, feature1, feature2, ..., featureN, label
...
person2, dayN, feature1, feature2, ..., featureN, label
...

There is always the same number of features, but any feature may be 0, meaning it is absent.
The number of days available varies per person, e.g. person1 has 20 days of data and person2 has 50.

The goal is to predict each person's label for the following day, i.e. the label for dayN+1, either on a per-person basis or overall (per-person makes more sense to me). I can freely reformat the data (it is not large). Based on the above, and after some reading, I thought a dynamic RNN (LSTM) could work best:

recurrent neural network: because the next day relies on the previous day
lstm: because the model builds up with each day
dynamic: because not all features are present each day

If it does not make sense for the data I have, please stop me here. The question is then:

How to give/format this data for tensorflow/tflearn?

I have looked at this example using tflearn, but I do not understand its input format well enough to 'mirror' it to mine. Similarly, I have found this post on a very similar question, yet it seems the samples the poster has are not related to each other the way mine are. My experience with tensorflow is limited to its get started page.

678

asked Apr 11 '17 09:04

Dimebag

1 Answer

dynamic: because not all features are present each day

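You've got the wrong concept of dynamic here. In TensorFlow, a dynamic RNN means the recurrence is unrolled dynamically during execution rather than being built with a fixed number of steps, so the sequence length can vary between batches. The feature vector at each step must always be the same size (using 0 for a missing feature should work fine).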

Anyway, what you've got here are sequences of varying length (day1 ... day?) of feature vectors (feature1 ... featureN). First, you need an LSTM cell:

cell = tf.contrib.rnn.LSTMCell(size)

so you can then create a dynamically unrolled RNN graph using tf.nn.dynamic_rnn. From the docs:

inputs: The RNN inputs.

If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements.

where max_time refers to the input sequence length. Because we're using dynamic_rnn, the sequence length doesn't need to be defined when the graph is built, so your input placeholder (with N the number of features) could be:

x = tf.placeholder(tf.float32, shape=(batch_size, None, N))

which is then fed into the RNN like this:

outputs, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

This means your input data should have the shape (batch_size, seq_length, N). If the examples in one batch have varying lengths, pad them with zero vectors to the maximum length and pass the true length of each example to dynamic_rnn through its sequence_length parameter.
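As a rough illustration of the padding step, here is a minimal NumPy sketch that groups rows by person, orders them by day, and zero-pads every sequence to the longest one. The toy rows, the tuple layout and the helper name to_padded_batch are all made up for this example; adapt them to however your data is actually stored.

```python
import numpy as np

# Hypothetical toy rows: (person, day, feature vector, label), with N = 2 features.
rows = [
    ("person1", 1, [0.1, 0.0], 0),
    ("person1", 2, [0.3, 0.5], 1),
    ("person1", 3, [0.0, 0.2], 1),
    ("person2", 1, [0.7, 0.1], 0),
    ("person2", 2, [0.0, 0.0], 0),
]


def to_padded_batch(rows):
    """Group rows by person, sort by day, and zero-pad to the longest sequence."""
    by_person = {}
    for person, day, features, _label in rows:
        by_person.setdefault(person, []).append((day, features))

    people = sorted(by_person)
    sequences = []
    for person in people:
        ordered = sorted(by_person[person], key=lambda item: item[0])
        sequences.append(np.asarray([f for _day, f in ordered], dtype=np.float32))

    # True (unpadded) length of each person's sequence.
    lengths = np.asarray([len(seq) for seq in sequences], dtype=np.int32)

    # Shape (batch_size, max_time, N), filled with zeros as padding.
    n_features = sequences[0].shape[1]
    batch = np.zeros((len(sequences), lengths.max(), n_features), dtype=np.float32)
    for i, seq in enumerate(sequences):
        batch[i, : len(seq)] = seq
    return people, batch, lengths


people, batch, lengths = to_padded_batch(rows)
print(batch.shape)  # (2, 3, 2)
print(lengths)      # [3 2]
```

The batch array has the (batch_size, max_time, N) layout that the input placeholder expects, and lengths is the kind of vector that would be passed as sequence_length.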

Obviously I've skipped a lot of details, so to fully understand RNNs you should probably read one of the many excellent RNN tutorials, like this one for example.
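One of the details skipped above matters for predicting the next day's label: with padded sequences, the last time step of a shorter sequence is padding, not that person's last real day. The NumPy sketch below shows the idea of picking the last valid step per sequence. The helper name last_valid_step and the stand-in outputs array are made up for illustration; inside a TensorFlow graph the same selection would need a graph op instead (tf.gather_nd is one candidate, which is an assumption on my part about what fits your setup).

```python
import numpy as np


def last_valid_step(outputs, lengths):
    """Return outputs[i, lengths[i] - 1] for every sequence i in the batch."""
    batch_index = np.arange(outputs.shape[0])
    return outputs[batch_index, lengths - 1]


# Stand-in for RNN outputs: 2 sequences, max_time = 3, 4 hidden units.
outputs = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
lengths = np.array([3, 2])  # true lengths before padding

last = last_valid_step(outputs, lengths)
print(last.shape)  # (2, 4)
```

Each row of last is the output at that sequence's final real day, which is the natural input to a small classifier layer that predicts the label for dayN+1.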

101

answered Sep 18 '22 14:09

Dzjkb

Tags: python, tensorflow, lstm, recurrent-neural-network, tflearn
