Tensorflow: How to copy conv layer weights to another variable for use in reinforcement learning?

I'm not sure if this is possible in Tensorflow and I'm concerned I may have to switch over to PyTorch.

Basically, I have this layer:

self.policy_conv1 = tf.layers.conv2d(
    inputs=self.policy_s, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer)

Which I'm trying to copy into another layer every 100 iterations of training or so:

self.eval_conv1 = tf.layers.conv2d(
    inputs=self.s, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer)

tf.assign doesn't seem to be the right tool, and the following doesn't seem to work:

self.policy_conv1 = tf.stop_gradient(tf.identity(self.eval_conv1))

Essentially, I am looking to copy the eval conv layer's weights into the policy conv layer, without the two staying tied together every time the graph runs one network or the other (which is what happens with the identity snippet above). If someone can point me to the needed code, I would appreciate it.

asked Mar 05 '23 by andrew
1 Answer

import numpy as np
import tensorflow as tf

# I'm using placeholders, but it'll work for other inputs as well
ph1 = tf.placeholder(tf.float32, [None, 32, 32, 3])
ph2 = tf.placeholder(tf.float32, [None, 32, 32, 3])

l1 = tf.layers.conv2d(
    inputs=ph1, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer, name="layer_1")
l2 = tf.layers.conv2d(
    inputs=ph2, filters=16, kernel_size=(8, 8), strides=(4, 4),
    padding='valid', activation=tf.nn.relu,
    kernel_initializer=tf.glorot_uniform_initializer,
    bias_initializer=tf.glorot_uniform_initializer, name="layer_2")

sess = tf.Session()
sess.run(tf.global_variables_initializer())

w1 = tf.get_default_graph().get_tensor_by_name("layer_1/kernel:0")
w2 = tf.get_default_graph().get_tensor_by_name("layer_2/kernel:0")

w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # non-zero

sess.run(tf.assign(w2, w1))
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # 0

w1 = w1 * 2 + 1 # builds a new tensor; the underlying variable is untouched
w1_r = sess.run(w1)
w2_r = sess.run(w2)
print(np.sum(w1_r - w2_r)) # non-zero

layer_1/bias:0 should work for getting the bias terms.

UPDATE:

I found an easier way:

update_weights = [tf.assign(new, old) for (new, old) in
   zip(tf.trainable_variables('new_scope'), tf.trainable_variables('old_scope'))]

Doing a sess.run on update_weights should copy the weights from one network to the other. Just remember to build them under separate name scopes.
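Putting the two ideas together, here is a minimal, self-contained sketch of the scope-based copy. It assumes TF 1.x-style graph mode (reached through tf.compat.v1 on newer installs); the scope names eval_net/policy_net and the build_net helper are illustrative, not from the original post.

```python
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()  # TF1-style graph mode on TF2 installs

def build_net(x, scope):
    # Illustrative helper: builds one conv layer under its own variable scope,
    # so each network's variables can be collected by scope name later.
    with tf.compat.v1.variable_scope(scope):
        return tf.compat.v1.layers.conv2d(
            x, filters=16, kernel_size=(8, 8), strides=(4, 4),
            padding='valid', activation=tf.nn.relu)

x = tf.compat.v1.placeholder(tf.float32, [None, 32, 32, 3])
eval_out = build_net(x, 'eval_net')
policy_out = build_net(x, 'policy_net')

# Pair the variables by position; both scopes were built identically,
# so kernel matches kernel and bias matches bias.
update_weights = [
    tf.compat.v1.assign(p, e)
    for p, e in zip(tf.compat.v1.trainable_variables('policy_net'),
                    tf.compat.v1.trainable_variables('eval_net'))]

sess = tf.compat.v1.Session()
sess.run(tf.compat.v1.global_variables_initializer())
sess.run(update_weights)  # run this every ~100 training iterations
```

After the sess.run(update_weights) call, the policy network holds a snapshot of the eval network's weights; further training of the eval network does not touch the policy network until the op is run again.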

answered May 09 '23 by squadrick