Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

TensorFlow: Performing this loss computation

My question and problem is stated below the two blocks of code.


Loss Function

def loss(labels, logits, sequence_lengths, label_lengths, logit_lengths):    
    scores = []
    for i in xrange(runner.batch_size):
        sequence_length = sequence_lengths[i]
        for j in xrange(length):
            label_length = label_lengths[i, j]
            logit_length = logit_lengths[i, j]

             # get top k indices <==> argmax_k(labels[i, j, 0, :], label_length)
            top_labels = np.argpartition(labels[i, j, 0, :], -label_length)[-label_length:]
            top_logits = np.argpartition(logits[i, j, 0, :], -logit_length)[-logit_length:]

            scores.append(edit_distance(top_labels, top_logits))

    return np.mean(scores)
    
# Levenshtein distance
def edit_distance(s, t):
    n = s.size
    m = t.size
    d = np.zeros((n+1, m+1))
    d[:, 0] = np.arrange(n+1)
    d[0, :] = np.arrange(n+1)

    for j in xrange(1, m+1):
        for i in xrange(1, n+1):
            if s[i] == t[j]:
                d[i, j] = d[i-1, j-1]
            else:
                d[i, j] = min(d[i-1, j] + 1,
                              d[i, j-1] + 1,
                              d[i-1, j-1] + 1)

    return d[m, n]

Being used in

I've tried to flatten my code so that everything is happening in one place. Let me know if there are typos/points of confusion.

sequence_lengths_placeholder = tf.placeholder(tf.int64, shape=(batch_size))
labels_placeholder = tf.placeholder(tf.float32, shape=(batch_size, max_feature_length, label_size))
label_lengths_placeholder = tf.placeholder(tf.int64, shape=(batch_size, max_feature_length))
loss_placeholder = tf.placeholder(tf.float32, shape=(1))

logit_W = tf.Variable(tf.zeros([lstm_units, label_size]))
logit_b = tf.Variable(tf.zeros([label_size]))

length_W = tf.Variable(tf.zeros([lstm_units, max_length]))
length_b = tf.Variable(tf.zeros([max_length]))

lstm = rnn_cell.BasicLSTMCell(lstm_units)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * layer_count)

rnn_out, state = rnn.rnn(stacked_lstm, features, dtype=tf.float32, sequence_length=sequence_lengths_placeholder)

logits = tf.concat(1, [tf.reshape(tf.matmul(t, logit_W) + logit_b, [batch_size, 1, 2, label_size]) for t in rnn_out])

logit_lengths = tf.concat(1, [tf.reshape(tf.matmul(t, length_W) + length_b, [batch_size, 1, max_length]) for t in rnn_out])

optimizer = tf.train.AdamOptimizer(learning_rate)
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss_placeholder, global_step=global_step)

...
...
# Inside training loop

np_labels, np_logits, sequence_lengths, label_lengths, logit_lengths = sess.run([labels_placeholder, logits, sequence_lengths_placeholder, label_lengths_placeholder, logit_lengths], feed_dict=feed_dict)
loss = loss(np_labels, np_logits, sequence_lengths, label_lengths, logit_lengths)
_ = sess.run([train_op], feed_dict={loss_placeholder: loss})

My issue

The issue is that this is returning the error:

  File "runner.py", line 63, in <module>
    train_op = optimizer.minimize(loss_placeholder, global_step=global_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 188, in minimize
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 277, in apply_gradients
    (grads_and_vars,))

  ValueError: No gradients provided for any variable: <all my variables>

So I assume that this is TensorFlow complaining that it can't compute the gradients of my loss because the loss is performed by numpy, outside the scope of TF.

So naturally to fix that I would try and implement this in TensorFlow. The issue is, my logit_lengths and label_lengths are both Tensors, so when I try and access a single element, I'm returned a Tensor of shape []. This is an issue when I'm trying to use tf.nn.top_k() which takes an Int for its k parameter.

Another issue with that is my label_lengths is a Placeholder and since my loss value need to be defined before the optimizer.minimize(loss) call, I also get an error that says a value needs to be passed for the placeholder.

I'm just wondering how I could try and implement this loss function. Or if I'm missing something obvious.


Edit: After some further reading I see that usually losses like the one I describe are used in validation and in training a surrogate loss that minimizes in the same place as the true loss is used. Does anyone know what surrogate loss is used for an edit distance based scenario like mine?

like image 998
Aidan Gomez Avatar asked Feb 10 '16 20:02

Aidan Gomez


Video Answer


1 Answers

The first thing I would do is to calculate loss using tensorflow instead of numpy. That will allow tensorflow to compute gradients for you, so you will be able to back-propagate, meaning you can minimize the loss.

There is tf.edit_distance(https://www.tensorflow.org/api_docs/python/tf/edit_distance) function in the core library.

So naturally to fix that I would try and implement this in TensorFlow. The issue is, my logit_lengths and label_lengths are both Tensors, so when I try and access a single element, I'm returned a Tensor of shape []. This is an issue when I'm trying to use tf.nn.top_k() which takes an Int for its k parameter.

Could you provide a little bit more details why it is an issue?

like image 143
Exta Avatar answered Oct 15 '22 10:10

Exta