I'm using an embedding_lookup operation to generate dense vector representations for each token in my document, which are fed to a convolutional neural network (the network architecture is similar to the one in a WildML article).
Everything works correctly, but when I pad my document by inserting a padding value in it, the embedding lookup generates a vector for this token too. I think this could alter the results of the classification task. What I want to achieve is something similar to what Torch's LookupTableMaskZero does.
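For reference, here is a minimal sketch of the situation (the sizes and the padding id 0 are just placeholders, not my actual setup):

import tensorflow as tf

emb = tf.random.uniform([10, 4])          # embedding matrix; row 0 belongs to the padding id
doc = tf.constant([[7, 2, 0, 0]])         # a document padded with 0s
vecs = tf.nn.embedding_lookup(emb, doc)   # the last two rows are non-zero vectors for the padding id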
1) Is what I want to do correct?
2) Is something like this already implemented?
3) If not, how can I mask the padding value so that no vector is generated for it?
Thank you in advance,
Alessandro
@Alessandro Suglia I think this feature would be useful, but unfortunately TensorFlow does not support it right now. One workaround that gets the same result, although it is slower, is to do the lookup twice, like below:
lookup_result = tf.nn.embedding_lookup(emb, index)
# mask matrix: a zero row for the padding id (row 0), a row of ones for every other id
masked_emb = tf.concat(0, [tf.zeros([1, 1]),
                           tf.ones([emb.get_shape()[0] - 1, 1])])
mask_lookup_result = tf.nn.embedding_lookup(masked_emb, index)
# broadcasted multiply zeroes out the embeddings at padding positions
lookup_result = tf.mul(lookup_result, mask_lookup_result)
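A possibly simpler variant of the same idea, sketched with a newer API (tf.multiply instead of tf.mul; the padding id 0, sizes and names are assumptions), builds the mask directly from the indices instead of doing a second lookup:

import tensorflow as tf

vocab_size, emb_dim = 10, 4                      # hypothetical sizes
emb = tf.random.uniform([vocab_size, emb_dim])   # embedding matrix; row 0 reserved for padding
index = tf.constant([[5, 3, 0, 0]])              # a padded document, 0 = padding id

lookup_result = tf.nn.embedding_lookup(emb, index)            # shape (1, 4, emb_dim)
mask = tf.cast(tf.not_equal(index, 0), lookup_result.dtype)   # 1.0 for real tokens, 0.0 for padding
lookup_result = tf.multiply(lookup_result, tf.expand_dims(mask, -1))  # zero out padding vectors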