In the Keras docs for Embedding (https://keras.io/layers/embeddings/), the explanation given for mask_zero is:
mask_zero: Whether or not the input value 0 is a special "padding" value that should be masked out. This is useful when using recurrent layers which may take variable length input. If this is True then all subsequent layers in the model need to support masking or an exception will be raised. If mask_zero is set to True, as a consequence, index 0 cannot be used in the vocabulary (input_dim should equal |vocabulary| + 2).
Why does input_dim need to be 2 + number of words in vocabulary? Assuming 0 is masked and can't be used, shouldn't it just be 1 + number of words? What is the other extra entry for?
input_dim: Integer. Size of the vocabulary, i.e. maximum integer index + 1.
output_dim: Integer. Dimension of the dense embedding.
The Embedding layer enables us to convert each word into a fixed-length vector of a defined size. The resulting vector is dense, with real values instead of just 0s and 1s. The fixed length of the word vectors helps us represent words in a better way, with reduced dimensionality.
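For instance, a minimal sketch (assuming TensorFlow 2.x with its bundled Keras; the vocabulary size, dimensions, and indices here are made up):

import numpy as np
import tensorflow as tf

# 1000-word vocabulary, each word mapped to a dense 64-dimensional vector.
embedding = tf.keras.layers.Embedding(input_dim=1000, output_dim=64)
batch = np.array([[4, 25, 7, 0]])  # a batch of one padded sequence of length 4
vectors = embedding(batch)         # dense output of shape (1, 4, 64)
print(vectors.shape)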
Actually, setting mask_zero=True for the Embedding layer does not result in it returning a zero vector. Rather, the behavior of the Embedding layer does not change, and it still returns the embedding vector at index zero.
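You can verify this with a small check (again assuming TensorFlow 2.x; the layer sizes are arbitrary): the output row for a padding position is the learned row for index 0, not zeros, while the mask marks that position as padded.

import numpy as np
import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
batch = np.array([[1, 2, 0, 0]])

out = embedding(batch)                # index 0 still maps to a learned row
print(out.numpy()[0, 2])              # row for a padding position: not forced to zeros
print(embedding.compute_mask(batch))  # [[ True  True False False]]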
A Dense layer will treat these like actual weights with which to perform matrix multiplication. An Embedding layer will simply treat these weights as a list of vectors, each vector representing one word; the 0th word in the vocabulary is w[0], the 1st is w[1], etc.
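To illustrate the lookup view, here is a sketch (TensorFlow 2.x assumed; sizes are arbitrary) showing that the layer's output for an index equals direct row indexing into its weight matrix:

import numpy as np
import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=5, output_dim=3)
idx = np.array([2])
looked_up = embedding(idx).numpy()      # calling the layer builds it, then looks up row 2
w = embedding.get_weights()[0]          # the full weight matrix, shape (5, 3)
print(np.allclose(looked_up[0], w[2]))  # True: the lookup is plain row indexing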
I believe the docs are a bit misleading there. In the normal case you are mapping your n input data indices [0, 1, 2, ..., n-1] to vectors, so your input_dim should be as many elements as you have:

input_dim = len(vocabulary_indices)

An equivalent (but slightly confusing) way to say this, and the way the docs do, is "1 + maximum integer index occurring in the input data":

input_dim = max(vocabulary_indices) + 1

If you enable masking, the value 0 is treated differently, so you shift your n vocabulary indices up by one, to [1, 2, ..., n]; together with the reserved padding index 0, the input now spans [0, 1, 2, ..., n-1, n], thus you need

input_dim = len(vocabulary_indices) + 1

or alternatively

input_dim = max(vocabulary_indices) + 2

The docs become especially confusing here, as they say "(input_dim should equal |vocabulary| + 2)", where I would interpret |x| as the cardinality of a set (equivalent to len(x)), but the authors seem to mean "2 + maximum integer index occurring in the input data."
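Putting it together, a sketch of the sizing rule (the three-word vocabulary is made up):

import tensorflow as tf

vocabulary = ["the", "cat", "sat"]  # n = 3 words

# Without masking: words occupy indices 0..2, so input_dim = len(vocabulary) = 3.
# With mask_zero=True: 0 is reserved for padding and words occupy indices 1..3,
# so input_dim = len(vocabulary) + 1 = 4.
embedding = tf.keras.layers.Embedding(
    input_dim=len(vocabulary) + 1,  # +1 for the reserved padding index 0
    output_dim=8,
    mask_zero=True,
)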