The documentation for the Embedding layer is here:
https://keras.io/layers/embeddings/
and the documentation for the Masking layer is here:
https://keras.io/layers/recurrent/
I can't find a difference there. Should one of the layers be preferred in certain situations?
I feel like Masking() is more a masking of time steps, while Embedding(mask_zero=True) is more of a data filter. From the Masking documentation:
If all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers
Here mask_value can be arbitrary. Thus, you can decide to skip time steps in which there is no input, or which meet some other condition you can think of, based on your data.
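To make the quoted rule concrete, here is a minimal pure-Python sketch of how I understand the Masking rule. It does not call Keras; masking_mask and the sample data are hypothetical names made up for illustration. A time step is masked only when every feature at that step equals mask_value.

```python
def masking_mask(sequence, mask_value=0.0):
    """Return one boolean per time step: True = keep, False = masked.

    A step is kept as soon as any of its features differs from mask_value,
    so it is masked only when all features equal mask_value.
    """
    return [any(v != mask_value for v in features) for features in sequence]


# One sample with 4 time steps and 2 features per step (made-up data).
sample = [
    [0.5, 0.0],  # only partly zero, so the step is kept
    [0.0, 0.0],  # all features equal mask_value, so the step is masked
    [1.2, 3.4],  # kept
    [0.0, 0.0],  # masked
]

print(masking_mask(sample))  # [True, False, True, False]

# mask_value is arbitrary: here -1.0 marks the steps to skip instead of 0.0.
print(masking_mask([[-1.0, -1.0], [0.0, 0.0]], mask_value=-1.0))  # [False, True]
```

Note that the second step of the last call is all zeros but is still kept, because the chosen mask_value is -1.0.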
For Embedding, you overlay a mask on your input, skipping calculations wherever the input is 0. This way, you can, in a single time step, propagate the full data, part of the data, or no data through the network. This is not a masking of time step #3 or something like that; it is a masking of input data #i. Also, only a missing input (input = 0) can be masked; there is no configurable mask_value.
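For comparison, a minimal pure-Python sketch of the mask that, as far as I understand it, Embedding(mask_zero=True) derives from its input. It does not call Keras; embedding_mask and the batch are hypothetical. The input here is assumed to be a batch of integer index sequences, and the mask is simply "index != 0" for each entry.

```python
def embedding_mask(index_batch):
    """Return a mask shaped like the batch: True where the index is not 0."""
    return [[idx != 0 for idx in sequence] for sequence in index_batch]


# Two sequences of word indices, right-padded with 0 to length 5 (made-up data).
batch = [
    [12, 7, 3, 0, 0],
    [5, 0, 0, 0, 0],
]

for row in embedding_mask(batch):
    print(row)
# [True, True, True, False, False]
# [True, False, False, False, False]
```

Because 0 is reserved for "no input", it cannot also stand for a real vocabulary entry; the Keras documentation notes that input_dim should then be the vocabulary size + 1.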
Thus, there are certainly cases I can think of where the two are completely equivalent (for instance, when one input being 0 at a time step means all inputs at that step are 0), but they operate at different resolutions.
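A small sketch of one such equivalent case, again in plain Python rather than Keras. The two helper functions, the index sequence, and the feature vectors are all hypothetical. It assumes zero-padded data where a padded step is 0 in every feature, so both rules mask exactly the same steps.

```python
def masking_mask(sequence, mask_value=0.0):
    """Masking-style rule: a step is masked when all features equal mask_value."""
    return [any(v != mask_value for v in features) for features in sequence]


def embedding_mask(indices):
    """Embedding(mask_zero=True)-style rule: an entry is masked when its index is 0."""
    return [idx != 0 for idx in indices]


# The same padded sequence, once as integer indices and once as feature vectors.
indices = [4, 2, 0, 0]
vectors = {4: [0.3, -1.0], 2: [0.7, 0.1], 0: [0.0, 0.0]}  # made-up features
features = [vectors[idx] for idx in indices]

print(embedding_mask(indices))  # [True, True, False, False]
print(masking_mask(features))   # [True, True, False, False]
```

The masks stop agreeing as soon as padding is marked with something other than 0, which only the mask_value argument of Masking can express.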