
How to correctly use mask_zero=True for Keras Embedding with pre-trained weights?

I am confused about how to format my own pre-trained weights for Keras Embedding layer if I'm also setting mask_zero=True. Here's a concrete toy example.

Suppose I have a vocabulary of 4 words [1,2,3,4] and am using vector weights defined by:

weight[1]=[0.1,0.2]
weight[2]=[0.3,0.4]
weight[3]=[0.5,0.6]
weight[4]=[0.7,0.8]

I want to embed sentences of length up to 5 words, so I have to zero pad them before feeding them into the Embedding layer. I want to mask out the zeros so further layers don't use them.
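To make the setup concrete, the padding step might look like this (just a sketch with made-up sentences; I'm assuming pad_sequences from keras.preprocessing.sequence is available in my Keras version):

from keras.preprocessing.sequence import pad_sequences

# hypothetical tokenized sentences over the vocabulary [1, 2, 3, 4]
sentences = [[1, 2, 3], [4, 2], [3, 1, 4, 2, 1]]

# zero-pad at the end to the maximum length of 5
padded = pad_sequences(sentences, maxlen=5, padding='post')
# padded ->
# [[1 2 3 0 0]
#  [4 2 0 0 0]
#  [3 1 4 2 1]]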

The Keras docs for Embedding say that the value 0 can't be part of my vocabulary:

mask_zero: Whether or not the input value 0 is a special "padding" value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).

So what I'm confused about is how to construct the weight array for the Embedding layer, since "index 0 cannot be used in the vocabulary." If I build the weight array as

[[0.1,0.2],
 [0.3,0.4],
 [0.5,0.6],
 [0.7,0.8]]

then normally, word 1 would point to index 1, which in this case holds the weights for word 2. Or is it that when you specify mask_zero=True, Keras internally makes it so that word 1 points to index 0? Alternatively, do you just prepend a vector of zeros in index zero, as follows?

[[0.0,0.0],
 [0.1,0.2],
 [0.3,0.4],
 [0.5,0.6],
 [0.7,0.8]]

But this second option seems to put the zero into the vocabulary, which the docs say is not allowed. In short, I'm confused. Can anyone shed light on this?

AstroBen asked Jul 17 '18



1 Answer

Your second approach is correct. You will want to construct your embedding layer in the following way:

import numpy as np
from keras.layers import Embedding

embedding = Embedding(
    output_dim=embedding_size,
    input_dim=vocabulary_size + 1,   # +1 for the reserved padding index 0
    input_length=input_length,
    mask_zero=True,
    # prepend a zero row so that word i keeps index i and index 0 is padding
    weights=[np.vstack((np.zeros((1, embedding_size)),
                        embedding_matrix))],
    name='embedding'
)(input_layer)

where embedding_matrix is the first matrix you provided (the four word vectors, without a zero row); the np.vstack call prepends the zero row for you, producing your second matrix.
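Concretely, with the toy vectors above, that np.vstack call builds the 5x2 matrix with the zero row in front (a quick check, assuming only NumPy):

import numpy as np

embedding_matrix = np.array([[0.1, 0.2],
                             [0.3, 0.4],
                             [0.5, 0.6],
                             [0.7, 0.8]])

print(np.vstack((np.zeros((1, 2)), embedding_matrix)))
# [[0.  0. ]
#  [0.1 0.2]
#  [0.3 0.4]
#  [0.5 0.6]
#  [0.7 0.8]]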

You can see why by looking at the implementation of Keras' Embedding layer. Note how mask_zero is used only to mask the inputs:

def compute_mask(self, inputs, mask=None):
    if not self.mask_zero:
        return None
    output_mask = K.not_equal(inputs, 0)
    return output_mask

thus the full embedding matrix is still indexed by the raw inputs, meaning all word indexes are effectively shifted up by one and row 0 must be reserved for padding.
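For completeness, here is a self-contained sketch of the toy example from the question (a sketch only: the imports assume standalone Keras 2.x, and names like input_layer and embedding_matrix mirror the snippet above):

import numpy as np
from keras.layers import Input, Embedding
from keras.models import Model

embedding_size = 2
vocabulary_size = 4   # words 1..4
input_length = 5      # sentences zero-padded to length 5

# the first matrix from the question: one row per word, words 1..4 in order
embedding_matrix = np.array([[0.1, 0.2],
                             [0.3, 0.4],
                             [0.5, 0.6],
                             [0.7, 0.8]])

input_layer = Input(shape=(input_length,), dtype='int32')
embedded = Embedding(
    output_dim=embedding_size,
    input_dim=vocabulary_size + 1,   # +1 for the padding index 0
    input_length=input_length,
    mask_zero=True,
    weights=[np.vstack((np.zeros((1, embedding_size)),
                        embedding_matrix))],
    name='embedding'
)(input_layer)

model = Model(inputs=input_layer, outputs=embedded)

# word 3 still maps to [0.5, 0.6]; the padded positions map to the zero row
print(model.predict(np.array([[3, 1, 0, 0, 0]])))

The padded positions come out as the zero row here, but more importantly they are masked, so a downstream layer that supports masking (an LSTM, for example) will skip them entirely.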

modesitt answered Oct 21 '22