Embedding vectors not being updated when using Tensorflow on window classification

Question

I am trying to implement a window-based classifier with TensorFlow.

The word embedding matrix is called word_vecs and is initialized randomly (I also tried Xavier initialization).

The ind variable holds the indices of the word vectors in that matrix, one row per window.

The first layer is the concatenation of config['window_size'] (5) word vectors.

word_vecs = tf.Variable(tf.random_uniform([len(words), config['embed_size']], -1.0, 1.0),dtype=tf.float32)
ind = tf.placeholder(tf.int32,  [None, config['window_size']])
x = tf.concat(1,tf.unpack(tf.nn.embedding_lookup(word_vecs, ind),axis=1))
W0 = tf.Variable(tf.random_uniform([config['window_size']*config['embed_size'], config['hidden_layer']]))
b0 = tf.Variable(tf.zeros([config['hidden_layer']]))
W1 = tf.Variable(tf.random_uniform([config['hidden_layer'], out_layer]))
b1 = tf.Variable(tf.zeros([out_layer]))
y0 = tf.nn.tanh(tf.matmul(x, W0) + b0)
y1 = tf.nn.softmax(tf.matmul(y0, W1) + b1)
y_ = tf.placeholder(tf.float32, [None, out_layer])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y1), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(0.5).minimize(cross_entropy)

And this is how I run the graph:

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(config['iterations'] ):
    r = random.randint(0,len(sentences)-1)
    inds=generate_windows([w for w,t in sentences[r]])
    #inds now contains an array of n rows on window_size columns
    ys=[one_hot(tags.index(t),len(tags)) for w,t in sentences[r]]
    #ys now contains an array of n rows on output_size columns
    sess.run(train_step, feed_dict={ind: inds, y_: ys})

The dimensions work out, and the code runs.

However, the accuracy is near zero, and I suspect that the word vectors aren't being updated properly.

How can I make TensorFlow propagate updates back to the word vectors from the concatenated window form?

Kashyap · Accepted Answer

Your embeddings are created with tf.Variable, which is trainable by default, so they will be updated. The problem is more likely in the way you calculate the loss. Look at the following lines:

y1 = tf.nn.softmax(tf.matmul(y0, W1) + b1)
y_ = tf.placeholder(tf.float32, [None, out_layer])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y1), reduction_indices=[1]))

Here you calculate the softmax function, which converts the scores into probabilities:

$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$

If the denominator here becomes too large or too small, the computation can overflow or underflow. To avoid this numerical instability, a small epsilon is usually added, as below.

$softmax_with_epsilon$

You can see that even after adding the epsilon, the value of the softmax function remains the same. If you don't handle this yourself, the gradients may not update properly because they vanish or explode.
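
A minimal pure-Python sketch of the instability described above. It assumes a common remedy, subtracting the largest score before exponentiating, which is not necessarily the exact epsilon form this answer had in mind. The function names are illustrative only.

```python
import math


def naive_softmax(scores):
    # Exponentiates the raw scores; math.exp overflows for large inputs.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(scores):
    # Shifting every score by the same constant cancels in the ratio,
    # so the result is unchanged while the exponents stay <= 0.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


small = [1.0, 2.0, 3.0]
large = [1000.0, 1001.0, 1002.0]  # same differences, much larger magnitude

print(stable_softmax(small))
print(stable_softmax(large))  # matches the line above up to rounding

try:
    naive_softmax(large)
except OverflowError as err:
    print("naive softmax failed:", err)
```

Both inputs have the same pairwise differences, so the stable version returns the same probabilities for each, while the naive version cannot evaluate the larger one at all.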

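Replace those three lines with TensorFlow's built-in tf.nn.sparse_softmax_cross_entropy_with_logits.

Note that this function calculates the softmax internally, so pass it the raw scores (logits) rather than the output of tf.nn.softmax. It is advisable to use it instead of calculating the loss manually. The sparse variant expects integer class indices of shape [batch] as labels, not one-hot rows, so feed tags.index(t) directly. If you would rather keep the one-hot ys, use tf.nn.softmax_cross_entropy_with_logits with the float y_ placeholder instead. You can use it as follows:

y1 = tf.matmul(y0, W1) + b1  # raw logits; no tf.nn.softmax here
y_ = tf.placeholder(tf.int32, [None])  # integer class indices, not one-hot rows
cross_entropy = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y1, labels=y_))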
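
To make the difference between the two label formats concrete, here is a small pure-Python sketch of cross-entropy computed directly from logits. It illustrates the math only and is not TensorFlow's implementation. The helper names are hypothetical.

```python
import math


def log_softmax(logits):
    # log-sum-exp with a max shift, so large logits do not overflow.
    shift = max(logits)
    log_sum = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return [z - log_sum for z in logits]


def cross_entropy_sparse(logits, label):
    # "Sparse" labels: a single integer class index per example.
    return -log_softmax(logits)[label]


def cross_entropy_one_hot(logits, one_hot):
    # Dense labels: a one-hot (or soft) distribution per example.
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))


logits = [2.0, -1.0, 0.5]
print(cross_entropy_sparse(logits, 0))
print(cross_entropy_one_hot(logits, [1.0, 0.0, 0.0]))  # same value

# Extreme logits still give a finite loss, whereas taking the log of an
# explicit softmax probability that has underflowed to 0 would not.
print(cross_entropy_sparse([1000.0, 0.0, 0.0], 1))
```

The two label formats give the same loss for the same example, so the choice between them only affects what you feed into y_.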

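On the original worry that concatenation might block updates: the sketch below shows, in plain Python, how the gradient with respect to one concatenated window splits back into per-word chunks that are added to the matching rows of the embedding matrix. It is a hand-written illustration of the idea, not what TensorFlow runs internally. The function name and the toy numbers are made up.

```python
def embedding_gradient(window, grad_x, vocab_size, embed_size):
    # window: word indices for one window, in concatenation order.
    # grad_x: gradient of the loss w.r.t. the concatenated input vector,
    #         of length len(window) * embed_size.
    grad_table = [[0.0] * embed_size for _ in range(vocab_size)]
    for pos, word in enumerate(window):
        # Slice out the chunk that came from this window position.
        chunk = grad_x[pos * embed_size:(pos + 1) * embed_size]
        for k, g in enumerate(chunk):
            # A word that appears twice accumulates both contributions.
            grad_table[word][k] += g
    return grad_table


window = [3, 1, 3]  # word 3 appears at two positions
grad_x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
grads = embedding_gradient(window, grad_x, vocab_size=5, embed_size=2)
for row_index, row in enumerate(grads):
    print(row_index, row)
```

Rows for words that do not occur in the window get a zero gradient, so only the embeddings actually looked up are changed by that step.
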
Tags: python, tensorflow, deep-learning

Uri Goren
