How to correctly use mask_zero=True for Keras Embedding with pre-trained weights?

I am confused about how to format my own pre-trained weights for Keras Embedding layer if I'm also setting mask_zero=True. Here's a concrete toy example.

Suppose I have a vocabulary of 4 words [1,2,3,4] and am using vector weights defined by:

weight[1]=[0.1,0.2]
weight[2]=[0.3,0.4]
weight[3]=[0.5,0.6]
weight[4]=[0.7,0.8]

I want to embed sentences of length up to 5 words, so I have to zero pad them before feeding them into the Embedding layer. I want to mask out the zeros so further layers don't use them.

The Keras docs for Embedding say that the 0 value can't be in my vocabulary:

mask_zero: Whether or not the input value 0 is a special "padding" value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal size of vocabulary + 1).

So what I'm confused about is how to construct the weight array for the Embedding layer, since "index 0 cannot be used in the vocabulary." If I build the weight array as

[[0.1,0.2],
 [0.3,0.4],
 [0.5,0.6],
 [0.7,0.8]]

then normally, word 1 would point to index 1, which in this case holds the weights for word 2. Or is it that when you specify mask_zero=True, Keras internally makes it so that word 1 points to index 0? Alternatively, do you just prepend a vector of zeros in index zero, as follows?

[[0.0,0.0],
 [0.1,0.2],
 [0.3,0.4],
 [0.5,0.6],
 [0.7,0.8]]

This second option seems to me to put the zero into the vocabulary. In other words, I'm very confused. Can anyone shed light on this?

463

asked Jul 17 '18 13:07

AstroBen

1 Answer

Your second approach is correct. You will want to construct your embedding layer in the following way:

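import numpy as np
from keras.layers import Embedding

# Row 0 is a placeholder for the padding index 0;
# rows 1..vocabulary_size hold the pre-trained word vectors.
embedding = Embedding(
   output_dim=embedding_size,
   input_dim=vocabulary_size + 1,
   input_length=input_length,
   mask_zero=True,
   weights=[np.vstack((np.zeros((1, embedding_size)),
                       embedding_matrix))],
   name='embedding'
)(input_layer)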

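where embedding_matrix is the first matrix you provided (the 4x2 array of word vectors). Stacking a row of zeros on top of it produces your second matrix, which has vocabulary_size + 1 rows and so matches input_dim.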
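As a minimal sketch using the toy vectors from the question, this is how the padded weight array could be built with NumPy before handing it to the layer. The name padded_weights is only illustrative and does not come from Keras.

```python
import numpy as np

# Toy pre-trained vectors from the question: row i holds the vector for word i + 1.
embedding_matrix = np.array([
    [0.1, 0.2],  # word 1
    [0.3, 0.4],  # word 2
    [0.5, 0.6],  # word 3
    [0.7, 0.8],  # word 4
])
vocabulary_size, embedding_size = embedding_matrix.shape

# Prepend a zero row for the padding index 0, so that word i ends up in row i.
# padded_weights is an illustrative name, not a Keras identifier.
padded_weights = np.vstack((np.zeros((1, embedding_size)), embedding_matrix))

print(padded_weights.shape)  # (vocabulary_size + 1, embedding_size)
```

The result has five rows, which is why input_dim is vocabulary_size + 1 rather than vocabulary_size.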

You can see this by looking at the implementation of Keras' embedding layer. Notably, mask_zero is only used to literally mask the inputs:

def compute_mask(self, inputs, mask=None):
    if not self.mask_zero:
        return None
    output_mask = K.not_equal(inputs, 0)
    return output_mask

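Thus the lookup into the weight matrix itself is unchanged: Keras does not remap word 1 to row 0, and a 0 in the input still selects row 0. The weight matrix therefore needs a placeholder row at index 0, and the vector for word i must sit in row i, meaning every row of your original 4-row matrix is shifted by one. The mask only tells later layers to skip the padded positions.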
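To make that concrete, here is a rough NumPy imitation of what the layer computes for padded input. It is not Keras code: plain array indexing stands in for the embedding lookup, and pad_sentence is a hypothetical helper written just for this sketch.

```python
import numpy as np

# Weight array with the zero row at index 0 (the question's second matrix).
padded_weights = np.array([
    [0.0, 0.0],  # index 0: padding placeholder
    [0.1, 0.2],  # word 1
    [0.3, 0.4],  # word 2
    [0.5, 0.6],  # word 3
    [0.7, 0.8],  # word 4
])


def pad_sentence(word_ids, max_len=5):
    """Hypothetical helper: right-pad a list of word ids with zeros."""
    return word_ids + [0] * (max_len - len(word_ids))


batch = np.array([
    pad_sentence([1, 4, 2]),        # three words, two padding positions
    pad_sentence([3, 3, 1, 2, 4]),  # full-length sentence
])

# Stand-in for the embedding lookup: each id selects the row with that index.
embedded = padded_weights[batch]  # shape (2, 5, 2)

# Same comparison as compute_mask: True for real words, False for padding.
mask = batch != 0

print(embedded[0])
print(mask[0])
```

In the first sentence the padded positions still receive a vector (row 0 of the weights), and it is the mask, not the lookup, that marks them as positions to skip.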

115

answered Oct 21 '22 11:10

modesitt

Tags: python, tensorflow, keras, word-embedding