"ValueError: Trying to share variable $var, but specified dtype float32 and found dtype float64_ref" when trying to use get_variable

I am trying to build a custom variational autoencoder network in which I initialize the decoder weights with the transpose of the weights from the encoder layer. I couldn't find a native way to do this with tf.contrib.layers.fully_connected, so I used tf.assign instead. Here is my code for the layers:

def inference_network(inputs, hidden_units, n_outputs):
    """Layer definition for the encoder layer."""
    net = inputs
    with tf.variable_scope('inference_network', reuse=tf.AUTO_REUSE):
        for layer_idx, hidden_dim in enumerate(hidden_units):
            net = layers.fully_connected(
                net,
                num_outputs=hidden_dim,
                weights_regularizer=layers.l2_regularizer(training_params.weight_decay),
                scope='inf_layer_{}'.format(layer_idx))
            add_layer_summary(net)
        z_mean = layers.fully_connected(net, num_outputs=n_outputs, activation_fn=None)
        z_log_sigma = layers.fully_connected(
            net, num_outputs=n_outputs, activation_fn=None)

    return z_mean, z_log_sigma


def generation_network(inputs, decoder_units, n_x):
    """Define the decoder network."""
    net = inputs  # inputs here is the latent representation.
    with tf.variable_scope("generation_network", reuse=tf.AUTO_REUSE):
        assert(len(decoder_units) >= 2)
        # First layer does not have a regularizer
        net = layers.fully_connected(
            net,
            decoder_units[0],
            scope="gen_layer_0",
        )
        for idx, decoder_unit in enumerate([decoder_units[1], n_x], 1):
            net = layers.fully_connected(
                net,
                decoder_unit,
                scope="gen_layer_{}".format(idx),
                weights_regularizer=layers.l2_regularizer(training_params.weight_decay)
            )
    # Assign the transpose of weights to the respective layers
    tf.assign(tf.get_variable("generation_network/gen_layer_1/weights"),
              tf.transpose(tf.get_variable("inference_network/inf_layer_1/weights")))
    tf.assign(tf.get_variable("generation_network/gen_layer_1/bias"),
              tf.get_variable("generation_network/inf_layer_0/bias"))
    tf.assign(tf.get_variable("generation_network/gen_layer_2/weights"),
              tf.transpose(tf.get_variable("inference_network/inf_layer_0/weights")))
    return net # x_recon

It is wrapped using this tf.slim arg_scope:

def _autoencoder_arg_scope(activation_fn):
    """Create an argument scope for the network based on its parameters."""

    with slim.arg_scope([layers.fully_connected],
                        weights_initializer=layers.xavier_initializer(),
                        biases_initializer=tf.initializers.constant(0.0),
                        activation_fn=activation_fn) as arg_sc:
        return arg_sc

However, I'm getting the error: ValueError: Trying to share variable VarAutoEnc/generation_network/gen_layer_1/weights, but specified dtype float32 and found dtype float64_ref. I have narrowed this down to the get_variable call, but I don't know why it's failing.

If there is a way to initialize a tf.contrib.layers.fully_connected layer from another fully connected layer without a tf.assign operation, that solution is fine with me.

rootavish asked May 29 '18

1 Answer

I can't reproduce your error. Here is a minimal, runnable example that does the same as your code:

import tensorflow as tf

with tf.contrib.slim.arg_scope([tf.contrib.layers.fully_connected],
                               weights_initializer=tf.contrib.layers.xavier_initializer(),
                               biases_initializer=tf.initializers.constant(0.0)):

  i = tf.placeholder(tf.float32, [1, 30])

  with tf.variable_scope("inference_network", reuse=tf.AUTO_REUSE):
    tf.contrib.layers.fully_connected(i, 30, scope="gen_layer_0")

  with tf.variable_scope("generation_network", reuse=tf.AUTO_REUSE):
    tf.contrib.layers.fully_connected(i, 30, scope="gen_layer_0",
      weights_regularizer=tf.contrib.layers.l2_regularizer(0.01))

  with tf.variable_scope("", reuse=tf.AUTO_REUSE):
    tf.assign(tf.get_variable("generation_network/gen_layer_0/weights"),
              tf.transpose(tf.get_variable("inference_network/gen_layer_0/weights")))

The code runs without a ValueError. If running this does raise one, it is probably a bug that has been fixed in a later TensorFlow version (I tested on 1.9). Otherwise, the error comes from a part of your code that you don't show in the question.

By the way, tf.assign returns an op that performs the assignment only once that op is run in a session, so you will want to return the output of all assign calls from the generation_network function. You can bundle all the assign ops into one using tf.group.
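To illustrate the tf.group idea, here is a minimal sketch with toy variables standing in for the encoder/decoder weights (the names enc_w, dec_w, etc. are made up for the example). It uses tf.compat.v1 so it also runs under TensorFlow 2.x; on 1.x you can use tf.get_variable and tf.assign directly:

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Toy stand-ins for the encoder and decoder weight/bias variables.
enc_w = tf1.get_variable("enc_w", initializer=tf.ones([2, 3]))
dec_w = tf1.get_variable("dec_w", initializer=tf.zeros([3, 2]))
enc_b = tf1.get_variable("enc_b", initializer=tf.ones([3]))
dec_b = tf1.get_variable("dec_b", initializer=tf.zeros([3]))

# Each assign returns an op; tf.group bundles them so that a single
# session.run call executes every assignment.
tie_op = tf.group(
    tf1.assign(dec_w, tf.transpose(enc_w)),
    tf1.assign(dec_b, enc_b),
)

with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())
    sess.run(tie_op)  # runs both assignments at once
    tied_w, tied_b = sess.run([dec_w, dec_b])

print(tied_w)  # dec_w is now the transpose of enc_w
```

Returning a grouped op like tie_op from generation_network and running it once after variable initialization is enough to tie the weights.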

BlueSun answered Nov 15 '22