 

Tensorflow: List of Tensors for Cost

I am trying to work with LSTMs in TensorFlow. I found a tutorial online where a set of sequences is taken in and the objective function is composed of the last output of the LSTM and the known values. However, I would like my objective function to use information from every output. Specifically, I am trying to have the LSTM learn the set of sequences (i.e. learn all the letters in the words in a sentence):

cell = rnn_cell.BasicLSTMCell(num_units)
inputs = [tf.placeholder(tf.float32,shape=[batch_size,input_size]) for _ in range(seq_len)]
result = [tf.placeholder(tf.float32, shape=[batch_size,input_size]) for _ in range(seq_len)]

W_o = tf.Variable(tf.random_normal([num_units,input_size], stddev=0.01))     
b_o = tf.Variable(tf.random_normal([input_size], stddev=0.01))

outputs, states = rnn.rnn(cell, inputs, dtype=tf.float32)   

losses = []

for i in xrange(len(outputs)):
    final_transformed_val = tf.matmul(outputs[i],W_o) + b_o
    losses.append(tf.nn.softmax(final_transformed_val))

cost = tf.reduce_mean(losses)  # raises: TypeError: List of Tensors when single Tensor expected

Doing this results in the error:

TypeError: List of Tensors when single Tensor expected

How should I fix this issue? Does tf.reduce_mean() take a list of tensor values, or is there some special tensor object that collects them?

asked Jan 02 '16 by arizz

2 Answers

In your code, losses is a Python list, but TensorFlow's reduce_mean() expects a single tensor, not a Python list. Something like

losses = tf.reshape(tf.concat(1, losses), [-1, size])

where size is the number of values you're taking a softmax over, should do what you want. See concat().
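For illustration, here is a minimal, self-contained sketch of that fix (it uses the same 0.x-era concat(concat_dim, values) signature as the snippets in this thread; the constant tensors are made up to stand in for per-step losses):

import tensorflow as tf

# two pretend per-step loss tensors, each of shape [1, 2]
a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0, 4.0]])

losses = [a, b]                                     # a Python list of Tensors
# tf.reduce_mean(losses) would raise the "List of Tensors" TypeError,
# so first merge the list into a single tensor:
merged = tf.reshape(tf.concat(1, losses), [-1, 2])  # shape [2, 2]
cost = tf.reduce_mean(merged)                       # scalar mean over all entries

with tf.Session() as sess:
    print(sess.run(cost))                           # -> 2.5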

But one thing I notice in your code that seems a bit odd is that you have a list of placeholders for your inputs, whereas the code in the TensorFlow tutorial uses an order-3 tensor for its inputs; yours is a list of order-2 tensors. I recommend looking over the code in the tutorial, because it does almost exactly what you're asking about.

One of the main files in that tutorial is here. In particular, line 139 is where they create their cost. Regarding your input, lines 90 and 91 are where the input and target placeholders are set up. The main takeaway from those two lines is that an entire sequence is passed in via a single placeholder rather than as a list of placeholders.

See line 120 in ptb_word_lm.py for where they do their concatenation.
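As a hedged sketch of that single-placeholder pattern (the shapes below are illustrative and are not copied from ptb_word_lm.py):

import tensorflow as tf

batch_size, num_steps, input_size = 2, 10, 2

# one placeholder carrying the whole sequence as an order-3 tensor ...
inputs_3d = tf.placeholder(tf.float32, [batch_size, num_steps, input_size])

# ... instead of a Python list of order-2 placeholders, one per time step
inputs_list = [tf.placeholder(tf.float32, [batch_size, input_size])
               for _ in range(num_steps)]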

answered Sep 28 '22 by Ryan Stout

Working example, check notebook:

import tensorflow as tf
from tensorflow.models.rnn import rnn, rnn_cell
print(tf.__version__) 
#> 0.8.0

batch_size  = 2
output_size = input_size  = 2
seq_len     = 10
num_units   = 2

cell = rnn_cell.BasicLSTMCell(num_units)
inputs = [tf.placeholder(tf.float32, shape=[batch_size,input_size ]) for _ in xrange(seq_len)]
result = [tf.placeholder(tf.float32, shape=[batch_size,output_size]) for _ in xrange(seq_len)]

W_o = tf.Variable(tf.random_normal([num_units,input_size], stddev=0.01))     
b_o = tf.Variable(tf.random_normal([input_size],           stddev=0.01))

outputs, states = rnn.rnn(cell, inputs, dtype=tf.float32)   

losses = []

# per-step loss: squared error between the target and the transformed LSTM output
for i in xrange(seq_len):
    final_transformed_val = tf.matmul(outputs[i],W_o) + b_o
    losses.append(tf.squared_difference(result[i],final_transformed_val))

# merge the list of per-step losses into a single tensor before reducing
losses = tf.reshape(tf.concat(1, losses), [-1, seq_len])
cost = tf.reduce_mean(losses)

To see this in action, you can feed the graph in a hacky way:

import matplotlib.pyplot as plt
import numpy as np

step = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
sess = tf.InteractiveSession()

sess.run(tf.initialize_all_variables())

costs = []

# EXAMPLE
#  Learn cumsum over each sequence in x
# | t        | 0 | 1 | 2 | 3 | 4  | ...|
# |----------|---|---|---|---|----|----|
# | x[:,0,0] | 1 | 1 | 1 | 1 | 1  | ...|
# | x[:,0,1] | 1 | 1 | 1 | 1 | 1  | ...|
# |          |   |   |   |   |    |    |
# | y[:,0,0] | 1 | 2 | 3 | 4 | 5  | ...|
# | y[:,0,1] | 1 | 2 | 3 | 4 | 5  | ...|

n_iterations = 300
for _ in xrange(n_iterations):
    x  = np.random.uniform(0,1,[seq_len,batch_size,input_size])
    y  = np.cumsum(x,axis=0)

    # map each per-step placeholder to the corresponding time slice of the batch
    x_list = {key: value for (key, value) in zip(inputs, x)}
    y_list = {key: value for (key, value) in zip(result, y)}

    err,_ = sess.run([cost, step], feed_dict=dict(x_list.items()+y_list.items()))
    costs.append(err)

plt.plot(costs)
plt.show()

[Plot: training cost per iteration]

As a TensorFlow beginner I have yet to find a unified or best-practice way of handling RNNs, but as mentioned above I'm sure this isn't recommended. I liked your script as a very nice intro, thanks for the snippets. Also, there are things going on right now with regard to the implementation of scan and RNN tuple-friendliness, so be careful.
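For what it's worth, the cumulative-sum target in the example above is the kind of thing scan is meant for; here is a rough sketch (hedged, since the tf.scan API was still moving at the time this was written):

import tensorflow as tf

# elems has shape [seq_len, batch_size, input_size], like x in the example above
elems = tf.placeholder(tf.float32, [10, 2, 2])

# running sum along the time axis: output[t] = elems[0] + ... + elems[t]
running_sum = tf.scan(lambda acc, x_t: acc + x_t, elems)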

answered Sep 28 '22 by ragulpr