
Initializing tensorflow Variable with an array larger than 2GB

I am trying to initialize a tensorflow Variable with pre-trained word2vec embeddings.

I have the following code:

    import tensorflow as tf
    from gensim import models

    model = models.Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
    X = model.syn0

    embeddings = tf.Variable(tf.random_uniform(X.shape, minval=-0.1, maxval=0.1), trainable=False)

    sess = tf.Session()
    sess.run(tf.initialize_all_variables())

    sess.run(embeddings.assign(X))

And I am receiving the following error:

ValueError: Cannot create an Operation with a NodeDef larger than 2GB. 

The array (X) I am trying to assign is of shape (3000000, 300) and its size is 3.6GB.

I am getting the same error if I try tf.convert_to_tensor(X) as well.

I know that it fails because the array is larger than 2GB. However, I do not know how to assign an array larger than 2GB to a tensorflow Variable.
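The size mismatch can be checked with a few lines of arithmetic. This is only a sketch: it assumes the embeddings are stored as 4-byte float32 values (which is consistent with the 3.6GB figure above) and takes the limit to be 2**31 bytes. Passing the NumPy array directly to assign() or convert_to_tensor() embeds it in the graph as a constant, which is why the whole array counts against that limit.

```python
# Shape of the embedding matrix reported in the question.
rows, cols = 3000000, 300

# Assumption: each entry is a 4-byte float32.
bytes_per_value = 4
array_bytes = rows * cols * bytes_per_value

# Assumption: the serialized-graph limit is 2**31 bytes (2GB).
limit_bytes = 2 ** 31

print(array_bytes)                # 3600000000
print(array_bytes > limit_bytes)  # True
```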

Filip asked Feb 14 '16 16:02


2 Answers

It seems like the only option is to use a placeholder. The cleanest way I can find is to initialize to a placeholder directly:

    X_init = tf.placeholder(tf.float32, shape=(3000000, 300))
    X = tf.Variable(X_init)
    # The rest of the setup...
    sess.run(tf.initialize_all_variables(), feed_dict={X_init: model.syn0})
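For readers on a newer TensorFlow release, the same placeholder-initialization idea might look roughly like the sketch below. It assumes the tensorflow.compat.v1 module is available and uses tf.global_variables_initializer(), the replacement for the deprecated tf.initialize_all_variables(). The helper name load_embeddings and the small 4x3 stand-in array are hypothetical; in practice model.syn0 would be fed instead.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # graph mode, as in the snippet above


def load_embeddings(values):
    """Initialize a non-trainable variable from `values` via a placeholder.

    Hypothetical helper: the array is supplied through feed_dict at
    initialization time, so it is never stored in the graph definition.
    """
    graph = tf.Graph()
    with graph.as_default():
        init_ph = tf.placeholder(tf.float32, shape=values.shape)
        embeddings = tf.Variable(init_ph, trainable=False)
        init_op = tf.global_variables_initializer()
        with tf.Session(graph=graph) as sess:
            sess.run(init_op, feed_dict={init_ph: values})
            # Read the variable back to confirm it holds the fed values.
            return sess.run(embeddings)


# Small stand-in for the (3000000, 300) word2vec matrix.
small = np.arange(12, dtype=np.float32).reshape(4, 3)
restored = load_embeddings(small)
```

Because the values travel through feed_dict rather than through a constant node, the graph itself stays small regardless of how large the array is.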
Joshua Little answered Oct 19 '22 00:10


The easiest solution is to feed the array through feed_dict into a placeholder node, and then use tf.assign to copy that placeholder into the variable.

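    # validate_shape=False lets the variable take on the shape of the fed value;
    # otherwise assigning a (3000000, 300) tensor to a shape-(1,) variable is rejected.
    X = tf.Variable([0.0], validate_shape=False)
    place = tf.placeholder(tf.float32, shape=(3000000, 300))
    set_x = tf.assign(X, place, validate_shape=False)
    # set up your session here....
    sess.run(set_x, feed_dict={place: model.syn0})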

As Joshua Little noted in a separate answer, you can also use it in the initializer:

    X = tf.Variable(place)    # place as defined above
    ...
    init = tf.initialize_all_variables()
    ... create sess ...
    sess.run(init, feed_dict={place: model.syn0})
dga answered Oct 19 '22 00:10