tensorflow scalar summary tags name exception

Tags:

tensorflow

I am trying to learn how to work with TensorFlow summary writers by following the HowTo MNIST tutorial. That tutorial adds a scalar summary for the loss function. I wrote a loss function in an unusual way, by building up a regularization term, and I get this exception:

W tensorflow/core/common_runtime/executor.cc:1027] 0x1e9ab70 Compute status: Invalid argument: tags and values not the same shape: [] != [1]
     [[Node: ScalarSummary = ScalarSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](ScalarSummary/tags, loss)]]

The loss function and adding the summary look like

loss = tf.add(modelError, regularizationTerm, name='loss')
tf.scalar_summary(loss.op.name, loss)

and if I build up the regularizationTerm like this

regularizationTerm = tf.Variable(tf.zeros([1], dtype=np.float32), name='regterm')
regularizationTerm +=  tf.mul(2.0, regA)
regularizationTerm +=  tf.mul(3.0, regB)

where regA and regB are tensors defined previously, I get the exception, whereas if I build it up like

regularizationTerm = tf.add(tf.mul(2.0, regA), tf.mul(3.0, regB), name='regterm')

then it works. So I guess I am not setting the name correctly: when I do the +=, do I create a new, unnamed tensor? But why can't I add that into the loss and then name the loss? That is the only thing I am trying to summarize.

Is there something like += where I can name the output, or preserve the name of the tensor I am modifying?
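
For illustration, here is a minimal sketch of the naming behaviour I mean (regA here is just a stand-in constant, and tf.identity is one guess at how to re-attach a name afterwards; I have not confirmed it avoids the summary error):

import numpy as np
import tensorflow as tf

regA = tf.constant(1.0)  # stand-in for a real regularization tensor

regularizationTerm = tf.Variable(tf.zeros([1], dtype=np.float32), name='regterm')
print(regularizationTerm.name)   # 'regterm:0' -- the name I chose

regularizationTerm += tf.mul(2.0, regA)
print(regularizationTerm.name)   # an auto-generated name such as 'add:0'

# One guess: wrap the final tensor in tf.identity to give it a fresh name.
regularizationTerm = tf.identity(regularizationTerm, name='regterm_final')
print(regularizationTerm.name)   # 'regterm_final:0'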

In case the issue is related to something else, here is my simple example where I identified the problem:

import numpy as np
import tensorflow as tf

def main():
    x_input = tf.placeholder(tf.float32, shape=(None, 1))
    y_output = tf.placeholder(tf.float32, shape=(None, 1))

    hidden_weights = tf.Variable(tf.truncated_normal([1,10], stddev=0.1), name='weights')
    output_weights = tf.Variable(tf.truncated_normal([10,1], stddev=0.1), name='output')
    inference = tf.matmul(tf.matmul(x_input, hidden_weights), output_weights)
    regA = tf.reduce_sum(tf.pow(hidden_weights, 2))
    regB = tf.reduce_sum(tf.pow(output_weights, 2))
    modelError = tf.reduce_mean(tf.pow(tf.sub(inference,y_output),2), name='model-error')

    fail = True
    if fail:
        # Build the term with +=: each += creates a new, auto-named tensor,
        # and the result keeps the length-1 vector shape of 'regterm'.
        regularizationTerm = tf.Variable(tf.zeros([1], dtype=np.float32), name='regterm')
        regularizationTerm +=  tf.mul(2.0, regA)
        regularizationTerm +=  tf.mul(3.0, regB)
    else:
        # Build the term with a single named tf.add(): the result is a scalar.
        regularizationTerm = tf.add(tf.mul(2.0, regA), tf.mul(3.0, regB), name='regterm')

    loss = tf.add(modelError, regularizationTerm, name='loss')
    tf.scalar_summary(loss.op.name, loss)
    optimizer = tf.train.GradientDescentOptimizer(0.05)
    global_step = tf.Variable(0, name='global_step', trainable=False)
    train_op = optimizer.minimize(loss, global_step=global_step)

    summary_op = tf.merge_all_summaries()

    saver = tf.train.Saver()

    sess = tf.Session()
    init = tf.initialize_all_variables()
    sess.run(init)

    summary_writer = tf.train.SummaryWriter('train_dir', 
                                            graph_def=sess.graph_def)

    feed_dict = {x_input:np.ones((30,1), dtype=np.float32),
                 y_output:np.ones((30,1), dtype=np.float32)}

    for step in xrange(1000):
        _, loss_value = sess.run([train_op, loss], feed_dict=feed_dict)
        if step % 100 == 0:
            print( "step=%d loss=%.2f" % (step, loss_value))
            summary_str = sess.run(summary_op, feed_dict=feed_dict)
            summary_writer.add_summary(summary_str, step)

if __name__ == '__main__':
    main()
asked Jan 31 '16 by MrCartoonology

1 Answer

TL;DR: The problem is the shape of the argument to tf.scalar_summary(), not the names.

I think the problem is a shape issue, stemming from this line:

regularizationTerm = tf.Variable(tf.zeros([1], dtype=np.float32), name='regterm')

This defines a variable whose shape is a vector of length 1. Each subsequent += operator (which is syntactic sugar for tf.add()), and the tf.add() that computes loss, then produces a vector-shaped result, because tf.add() broadcasts its scalar argument to match the vector. Finally, tf.scalar_summary() expects its two arguments to have the same shape; unlike the broadcasting add, it is not permissive about the shapes of its inputs. Here the tags input is a scalar string (the name of the loss op) whereas the values input is a vector of length one (the value of the loss tensor), which is why you get the error that you reported.
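
To make the broadcasting concrete, here is a minimal sketch using the same 0.x-era API as the question (v and s are illustrative names):

import numpy as np
import tensorflow as tf

v = tf.Variable(tf.zeros([1], dtype=np.float32), name='regterm')
s = v + tf.constant(2.0)   # the scalar is broadcast against shape [1]

print(tf.constant(2.0).get_shape())   # () -- a true scalar
print(v.get_shape())                  # (1,) -- a vector of length 1
print(s.get_shape())                  # (1,) -- still a vector after the add

# tf.scalar_summary('loss', s) would fail here: tags has shape [],
# while values has shape [1].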

Fortunately, the solution is simple! Either define the regularizationTerm variable as a scalar, like so:

# Note that `[]` is the scalar shape.
regularizationTerm = tf.Variable(tf.zeros([], dtype=np.float32), name='regterm')

...or pass a vector (of length 1) of strings to tf.scalar_summary():

# Wrap `loss.op.name` in a list to make it a vector.
tf.scalar_summary([loss.op.name], loss)
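
Either way, the two arguments to tf.scalar_summary() end up with matching shapes: with the first fix both tags and values are scalars, and with the second fix both are vectors of length 1.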
answered Sep 21 '22 by mrry