I'm not sure if this is possible in TensorFlow, and I'm concerned I may have to switch over to PyTorch.
Basically, I have this layer:
self.policy_conv1 = tf.layers.conv2d(
    inputs=self.policy_s, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer)
Which I'm trying to copy into another layer every 100 iterations of training or so:
self.eval_conv1 = tf.layers.conv2d(
    inputs=self.s, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer)
tf.assign doesn't seem to be the right tool, and the following doesn't seem to work:
self.policy_conv1 = tf.stop_gradient(tf.identity(self.eval_conv1))
Essentially, I want to copy the eval conv layer's weights into the policy conv layer without tying the two together every time the graph evaluates one or the other (which is what happens with the identity snippet above). If someone can point me to the needed code, I would appreciate it.
import numpy as np
import tensorflow as tf
# I'm using placeholders, but it'll work for other inputs as well
ph1 = tf.placeholder(tf.float32, [None, 32, 32, 3])
ph2 = tf.placeholder(tf.float32, [None, 32, 32, 3])
# Two structurally identical conv layers, each with its own independently initialized weights
l1 = tf.layers.conv2d(
    inputs=ph1, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer, name="layer_1")
l2 = tf.layers.conv2d(
    inputs=ph2, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer, name="layer_2")
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Fetch each layer's kernel variable by name
w1 = tf.get_default_graph().get_tensor_by_name("layer_1/kernel:0")
w2 = tf.get_default_graph().get_tensor_by_name("layer_2/kernel:0")
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # non-zero
sess.run(tf.assign(w2, w1))  # copy layer_1's kernel values into layer_2's kernel
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # 0
w1 = w1 * 2 + 1  # rebinds the Python name w1 to a new tensor; layer_2's weights are unaffected
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # non-zero
Similarly, layer_1/bias:0 should work for getting the bias terms.
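For completeness, here is a small sketch (continuing the session from the snippet above, and assuming the same layer_1/layer_2 names) that copies the biases in the same way:
# Fetch the bias variables by name and copy them just like the kernels
b1 = tf.get_default_graph().get_tensor_by_name("layer_1/bias:0")
b2 = tf.get_default_graph().get_tensor_by_name("layer_2/bias:0")
sess.run(tf.assign(b2, b1))
print(np.sum(sess.run(b1) - sess.run(b2)))  # 0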
UPDATE:
I found an easier way:
update_weights = [tf.assign(new, old) for (new, old) in
                  zip(tf.trainable_variables('new_scope'),
                      tf.trainable_variables('old_scope'))]
Running sess.run(update_weights) copies the weights from one network to the other. Just remember to build the two networks under separate variable scopes so that tf.trainable_variables can filter them by scope prefix.
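Here is a minimal, self-contained sketch of that pattern (the scope names old_scope/new_scope and the build_net helper are just illustrative placeholders):
import tensorflow as tf

def build_net(x):
    # Same architecture for both networks so the variable lists line up pairwise
    return tf.layers.conv2d(x, filters=16, kernel_size=(8, 8), strides=(4, 4),
                            padding='valid', activation=tf.nn.relu)

x = tf.placeholder(tf.float32, [None, 32, 32, 3])
with tf.variable_scope('old_scope'):
    old_out = build_net(x)
with tf.variable_scope('new_scope'):
    new_out = build_net(x)

# One assign op per (new, old) variable pair; variables come back in creation
# order, so kernels pair with kernels and biases with biases
update_weights = [tf.assign(new, old) for (new, old) in
                  zip(tf.trainable_variables('new_scope'),
                      tf.trainable_variables('old_scope'))]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(update_weights)  # e.g. run this every 100 training iterations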