TensorFlow: "No gradients provided for any variable" and partial_run

Problem

Using TensorFlow's partial_run() method doesn't work as I expected. I use it towards the bottom of the code below, and I believe it is what causes the attached error.

The general flow of data is: get a prediction from the model, use that prediction in some non-TensorFlow code to program a software synthesiser, play a MIDI note and extract audio features (MFCCs, RMS, FFT), and finally pass those features to the cost function to check how close the predicted patch came to recreating the desired sound supplied as the current example.

Code (preprocessing omitted)

# Create the tensorflow graph.
dimension_data_example = generate_examples(1,
                                           midi_note,
                                           midi_velocity,
                                           note_length,
                                           render_length,
                                           engine,
                                           generator,
                                           mfcc_normaliser,
                                           rms_normaliser)

features, parameters = dimension_data_example[0]
# https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/recurrent_network.ipynb
# Parameters for the tensorflow graph.
learning_rate = 0.001
training_iters = 256
batch_size = 128
display_step = 10
number_hidden_1 = 128
number_hidden_2 = 128

# Network parameters:
# 14 * 181 - (amount of mfccs + rms value) * sample size
number_input = int(features.shape[0])

# 155 - amount of parameters
number_outputs = len(parameters)

x = tf.placeholder("float", [None, number_input])

# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with RELU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([number_input, number_hidden_1])),
    'h2': tf.Variable(tf.random_normal([number_hidden_1, number_hidden_2])),
    'out': tf.Variable(tf.random_normal([number_hidden_2, number_outputs]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([number_hidden_1])),
    'b2': tf.Variable(tf.random_normal([number_hidden_2])),
    'out': tf.Variable(tf.random_normal([number_outputs]))
}

# Construct model
prediction = multilayer_perceptron(x, weights, biases)

x_original = tf.placeholder("float", [None, number_input])
x_from_y = tf.placeholder("float", [None, number_input])
cost = tf.sqrt(tf.reduce_mean(tf.square(tf.sub(x_original, x_from_y))))
optimiser = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launching the graph
with tf.Session() as sess:

    sess.run(init)
    step = 1

    while step * batch_size < training_iters:

        train_batch = generate_examples(batch_size,
                                        midi_note,
                                        midi_velocity,
                                        note_length,
                                        render_length,
                                        engine,
                                        generator,
                                        mfcc_normaliser,
                                        rms_normaliser)
        split_train = map(list, zip(*train_batch))
        batch_x = split_train[0]

        setup = sess.partial_run_setup([prediction, optimiser],
                                       [x, x_original, x_from_y])

        pred = sess.partial_run(setup, prediction, feed_dict={x: batch_x})

        features_from_prediction = get_features(pred,
                                                midi_note,
                                                midi_velocity,
                                                note_length,
                                                render_length)

        sess.partial_run(setup, optimiser, feed_dict={x_original: batch_x,
                                                      x_from_y: features_from_prediction})

Error

Traceback (most recent call last):
  File "model.py", line 255, in <module>
    optimiser = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 276, in minimize
    ([str(v) for _, v in grads_and_vars], loss))
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ['Tensor("Variable/read:0", shape=(2534, 128), dtype=float32)', 'Tensor("Variable_1/read:0", shape=(128, 128), dtype=float32)', 'Tensor("Variable_2/read:0", shape=(128, 155), dtype=float32)', 'Tensor("Variable_3/read:0", shape=(128,), dtype=float32)', 'Tensor("Variable_4/read:0", shape=(128,), dtype=float32)', 'Tensor("Variable_5/read:0", shape=(155,), dtype=float32)'] and loss Tensor("Sqrt:0", shape=(), dtype=float32).
asked Dec 11 '22 by Leon Fedden
1 Answer

The immediate error you are experiencing:

No gradients provided for any variable, check your graph for ops that do not support gradients, between variables

is raised because there is no gradient path from your cost to your weights. The placeholders and the calculations that happen outside of your graph sit between the weights and the cost, so TensorFlow cannot trace a chain of differentiable ops from the cost back to the weights.
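A quick way to confirm this (a diagnostic snippet I'm adding here, not part of your original code) is to ask TensorFlow for the gradients of the cost with respect to the trainable variables; because cost depends only on the x_original and x_from_y placeholders, every entry comes back as None:

# Every gradient is None because no trainable variable lies on the path to cost.
grads = tf.gradients(cost, tf.trainable_variables())
print(grads)  # [None, None, None, None, None, None]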

In other words, think about the setup.

Weights -> prediction -> get_features -> calculate cost.

Now think about back-propagation: we can calculate the gradient of the cost, but there is no gradient from the cost to get_features, nor from get_features to prediction, because get_features isn't part of the graph:

Weights <- prediction <-/- get_features <-/- calculate cost.

So the weights will never be able to learn. You need to somehow get a path from your cost back to the prediction if you want this setup to work, perhaps by simulating the gradient of get_features in a backwards pass through your graph (see the sketch below). There may be a cleaner way, but I can't think of one right this moment.
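As a rough illustration only, and assuming you can estimate d(cost)/d(prediction) outside of TensorFlow (for example by finite differences through the synthesiser; grad_wrt_prediction and estimate_gradient below are hypothetical names, not part of your code), you could feed that external gradient into the graph and let TensorFlow handle the rest of the chain rule from prediction down to the weights:

# Placeholder for the externally estimated gradient d(cost)/d(prediction),
# shaped like the prediction itself.
grad_wrt_prediction = tf.placeholder("float", [None, number_outputs])

var_list = tf.trainable_variables()
# tf.gradients lets us inject an upstream gradient via grad_ys, so this
# computes d(prediction)/d(weights) weighted by grad_wrt_prediction.
grads = tf.gradients(prediction, var_list, grad_ys=grad_wrt_prediction)
train_step = tf.train.AdamOptimizer(learning_rate=learning_rate).apply_gradients(
    list(zip(grads, var_list)))

# Inside the training loop you would then do something like:
#   pred = sess.run(prediction, feed_dict={x: batch_x})
#   external_grad = estimate_gradient(pred, ...)  # e.g. finite differences
#   sess.run(train_step, feed_dict={x: batch_x,
#                                   grad_wrt_prediction: external_grad})

Note that this runs the forward pass twice per step (once to get the prediction, once to apply the update), which sidesteps partial_run entirely.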

Hope that helps!

answered Dec 13 '22 by suharshs