Feeding parameters into placeholders in tensorflow

I'm trying to get into tensorflow, setting up a network and then feeding data to it. For some reason I end up with the error message ValueError: setting an array element with a sequence. I made a minimal example of what I'm trying to do:

import tensorflow as tf
K = 10

lchild = tf.placeholder(tf.float32, shape=(K))
rchild = tf.placeholder(tf.float32, shape=(K))
parent = tf.nn.tanh(tf.add(lchild, rchild))

input = [ tf.Variable(tf.random_normal([K])),
          tf.Variable(tf.random_normal([K])) ]

with tf.Session() as sess :
    print(sess.run([parent], feed_dict={ lchild: input[0], rchild: input[1] }))  # raises the ValueError here

Basically, I'm setting up a network with placeholders and a sequence of input embeddings that I want to learn, and then I try to run the network, feeding the input embeddings into it. From what I can tell by searching for the error message, there might be something wrong with my feed_dict, but I can't see any obvious mismatch in, e.g., dimensionality.

So, what did I miss, or how did I get this completely backwards?

EDIT: I've edited the above to clarify that the input represents embeddings that need to be learned. I guess the question can be asked more sharply as: Is it possible to use placeholders for parameters?

Asked by masaers

1 Answer

The values you pass in feed_dict should be numpy arrays, not TensorFlow objects such as tf.Variable.

So, instead of tf.Variable(tf.random_normal([K])), simply write np.random.randn(K) (after import numpy as np) and everything should work as expected.
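
For reference, here is a minimal corrected version of the question's example with numpy inputs (a sketch assuming TensorFlow 1.x graph mode):

import numpy as np
import tensorflow as tf

K = 10

lchild = tf.placeholder(tf.float32, shape=(K,))
rchild = tf.placeholder(tf.float32, shape=(K,))
parent = tf.nn.tanh(tf.add(lchild, rchild))

# Plain numpy arrays, not tf.Variable objects
inputs = [np.random.randn(K), np.random.randn(K)]

with tf.Session() as sess:
    print(sess.run(parent, feed_dict={lchild: inputs[0], rchild: inputs[1]}))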

EDIT (The question was clarified after my answer):

It is possible to use placeholders as parameters, but in a slightly different way. For example:

lchild = tf.placeholder(tf.float32, shape=(K))
rchild = tf.placeholder(tf.float32, shape=(K))
parent = tf.nn.tanh(tf.add(lchild, rchild))
loss = <some loss that depends on the parent tensor or lchild/rchild>
# Compute gradients with respect to the input variables
grads = tf.gradients(loss, [lchild, rchild])

inputs = [np.random.randn(K), np.random.randn(K)]
for i in range(<number of iterations>):
    np_grads = sess.run(grads, feed_dict={lchild: inputs[0], rchild: inputs[1]})
    inputs[0] -= 0.1 * np_grads[0]
    inputs[1] -= 0.1 * np_grads[1]
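
Filled in with a concrete toy loss (the squared norm of parent, purely an assumed stand-in, not anything prescribed by the answer), a self-contained runnable sketch of this manual-update loop might look like:

import numpy as np
import tensorflow as tf

K = 10
lchild = tf.placeholder(tf.float32, shape=(K,))
rchild = tf.placeholder(tf.float32, shape=(K,))
parent = tf.nn.tanh(tf.add(lchild, rchild))
loss = tf.reduce_sum(tf.square(parent))  # assumed toy loss

# Gradients of the loss with respect to the fed-in parameters
grads = tf.gradients(loss, [lchild, rchild])

inputs = [np.random.randn(K), np.random.randn(K)]
with tf.Session() as sess:
    for i in range(100):
        np_grads = sess.run(grads, feed_dict={lchild: inputs[0], rchild: inputs[1]})
        inputs[0] -= 0.1 * np_grads[0]  # manual gradient-descent step on the numpy side
        inputs[1] -= 0.1 * np_grads[1]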

This is not, however, the best or easiest way to do it. The main problem is that at every iteration you need to copy numpy arrays in and out of the session (which is potentially running on a different device, such as a GPU).

Placeholders are generally used to feed data external to the model (like texts or images). The way to solve this using TensorFlow's own utilities would be something like:

lchild = tf.Variable(tf.random_normal([K]))
rchild = tf.Variable(tf.random_normal([K]))
parent = tf.nn.tanh(tf.add(lchild, rchild))
loss = <some loss that depends on the parent tensor or lchild/rchild>
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

sess.run(tf.global_variables_initializer())  # variables must be initialized first
for i in range(<number of iterations>):
    sess.run(train_op)

# Retrieve the weights back to numpy:
np_lchild = sess.run(lchild)
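
With the same assumed toy loss as above, a complete runnable sketch of this Variable-based version could be:

import tensorflow as tf

K = 10
lchild = tf.Variable(tf.random_normal([K]))
rchild = tf.Variable(tf.random_normal([K]))
parent = tf.nn.tanh(tf.add(lchild, rchild))
loss = tf.reduce_sum(tf.square(parent))  # assumed toy loss

train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize the variables first
    for i in range(100):
        sess.run(train_op)
    np_lchild = sess.run(lchild)  # retrieve the learned weights as a numpy array
    print(np_lchild)
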
Answered by Rafał Józefowicz