How does mask_zero in Keras Embedding layer work?

Tags:

python

machine-learning

keras

word-embedding

I thought mask_zero=True would output zeros when the input value is 0, so that the following layers could skip the computation for those positions.

How does mask_zero work?

Example:

import numpy as np
from keras.layers import Input, Embedding
from keras.models import Model

data_in = np.array([
  [1, 2, 0, 0]
])
print(data_in.shape)  # (1, 4)

# model
x = Input(shape=(4,))
e = Embedding(5, 5, mask_zero=True)(x)

m = Model(inputs=x, outputs=e)
p = m.predict(data_in)
print(p.shape)
print(p)

The actual output is: (the numbers are random)

(1, 4, 5)
[[[ 0.02499047  0.04617121  0.01586803  0.0338897   0.009652  ]
  [ 0.04782704 -0.04035913 -0.0341589   0.03020919 -0.01157228]
  [ 0.00451764 -0.01433611  0.02606953  0.00328832  0.02650392]
  [ 0.00451764 -0.01433611  0.02606953  0.00328832  0.02650392]]]

However, I expected the output to be:

[[[ 0.02499047  0.04617121  0.01586803  0.0338897   0.009652  ]
  [ 0.04782704 -0.04035913 -0.0341589   0.03020919 -0.01157228]
  [ 0 0 0 0 0]
  [ 0 0 0 0 0]]]

asked Nov 25 '17 11:11

crazytomcat

2 Answers

Setting mask_zero=True on the Embedding layer does not make it return a zero vector. The Embedding layer's own output is unchanged: for input index 0 it still returns the embedding vector stored at row 0 of its weight matrix. You can confirm this by comparing the output with the Embedding layer's weights (in your example, m.layers[1].get_weights(), since m.layers[0] is the input layer). What mask_zero=True changes is the behavior of the following layers, such as RNN layers.

If you inspect the source code of the Embedding layer, you will see a method called compute_mask:

def compute_mask(self, inputs, mask=None):
    if not self.mask_zero:
        return None
    output_mask = K.not_equal(inputs, 0)
    return output_mask
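
To make the separation between the lookup and the mask concrete, here is a framework-free sketch in plain Python. The weight values in `embedding_table` and the helper names `embed` and `compute_mask` are made up for illustration; this is not the Keras implementation, only the same idea: index 0 is looked up like any other index, and the mask is a separate boolean sequence.

```python
# Framework-free sketch: an embedding lookup plus the separate mask.
# `embedding_table` is a hypothetical 5x2 weight matrix, not real Keras weights.
embedding_table = [
    [0.5, -0.1],   # row 0: an ordinary vector, returned for padded positions
    [0.2, 0.3],
    [-0.4, 0.7],
    [0.9, 0.0],
    [0.1, 0.1],
]

def embed(indices):
    """Return one row of the table per input index (nothing is zeroed)."""
    return [embedding_table[i] for i in indices]

def compute_mask(indices):
    """Same rule as K.not_equal(inputs, 0): True for real tokens."""
    return [i != 0 for i in indices]

sequence = [1, 2, 0, 0]
vectors = embed(sequence)
mask = compute_mask(sequence)

print(vectors[2] == embedding_table[0])  # True: the padded step gets row 0
print(mask)                              # [True, True, False, False]
```

The mask travels alongside the vectors; it is up to the downstream layers to act on it.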

This output mask is passed, as the mask argument, to the following layers that support masking. This is implemented in the __call__ method of the base layer class, Layer:

# Handle mask propagation.
previous_mask = _collect_previous_mask(inputs)
user_kwargs = copy.copy(kwargs)
if not is_all_none(previous_mask):
    # The previous layer generated a mask.
    if has_arg(self.call, 'mask'):
        if 'mask' not in kwargs:
            # If mask is explicitly passed to __call__,
            # we should override the default mask.
            kwargs['mask'] = previous_mask

This makes the following layers ignore the masked input steps (i.e. those steps are not considered in their computations). Here is a minimal example:

import numpy as np
from keras.layers import Input, Embedding, LSTM
from keras.models import Model

data_in = np.array([
  [1, 0, 2, 0]
])

x = Input(shape=(4,))
e = Embedding(5, 5, mask_zero=True)(x)
rnn = LSTM(3, return_sequences=True)(e)

m = Model(inputs=x, outputs=rnn)
m.predict(data_in)

array([[[-0.00084503, -0.00413611,  0.00049972],
        [-0.00084503, -0.00413611,  0.00049972],
        [-0.00144554, -0.00115775, -0.00293898],
        [-0.00144554, -0.00115775, -0.00293898]]], dtype=float32)

As you can see, the outputs of the LSTM layer for the second and fourth timesteps are the same as the outputs of the first and third timesteps, respectively. This means that those timesteps have been masked.
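
The repeated rows can be reproduced with a toy recurrent layer in plain Python. This is only a sketch of the general idea (on a masked step, keep the previous state and emit the previous output), not the actual LSTM code in Keras; the function `masked_rnn` and the weights `w` and `u` are hypothetical.

```python
import math

def masked_rnn(inputs, mask, w=0.8, u=0.5):
    """Toy one-unit recurrent layer that skips masked timesteps.

    On real steps: state = tanh(w * x + u * previous_state).
    On masked steps the state is left untouched, so the emitted
    output is a copy of the previous step's output.
    """
    state = 0.0
    outputs = []
    for x, keep in zip(inputs, mask):
        if keep:
            state = math.tanh(w * x + u * state)
        outputs.append(state)
    return outputs

token_ids = [1, 0, 2, 0]
mask = [token != 0 for token in token_ids]   # same rule as compute_mask
inputs = [1.0, 0.0, 2.0, 0.0]                # scalar stand-ins for embedding vectors

outputs = masked_rnn(inputs, mask)
print(outputs)
print(outputs[1] == outputs[0], outputs[3] == outputs[2])  # True True
```

Without the mask, the second step would be computed from its own input and would differ from the first, which is why identical consecutive rows are a useful sign that masking is active.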

Update: The mask is also taken into account when computing the loss, since the loss functions are internally wrapped to support masking using weighted_masked_objective:

def weighted_masked_objective(fn):
    """Adds support for masking and sample-weighting to an objective function.
    It transforms an objective function `fn(y_true, y_pred)`
    into a sample-weighted, cost-masked objective function
    `fn(y_true, y_pred, weights, mask)`.
    # Arguments
        fn: The objective function to wrap,
            with signature `fn(y_true, y_pred)`.
    # Returns
        A function with signature `fn(y_true, y_pred, weights, mask)`.
    """

This wrapping is applied when compiling the model:

weighted_losses = [weighted_masked_objective(fn) for fn in loss_functions]

You can verify this using the following example:

import numpy as np
from keras.layers import Input, Embedding, Dense
from keras.models import Model

data_in = np.array([[1, 2, 0, 0]])
data_out = np.arange(12).reshape(1,4,3)

x = Input(shape=(4,))
e = Embedding(5, 5, mask_zero=True)(x)
d = Dense(3)(e)

m = Model(inputs=x, outputs=d)
m.compile(loss='mse', optimizer='adam')
preds = m.predict(data_in)
loss = m.evaluate(data_in, data_out, verbose=0)
print(preds)
print('Computed Loss:', loss)

[[[ 0.009682    0.02505393 -0.00632722]
  [ 0.01756451  0.05928303  0.0153951 ]
  [-0.00146054 -0.02064196 -0.04356086]
  [-0.00146054 -0.02064196 -0.04356086]]]
Computed Loss: 9.041069030761719

# verify that only the first two outputs 
# have been considered in the computation of loss
print(np.square(preds[0,0:2] - data_out[0,0:2]).mean())

9.041070036475277
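
For reference, the arithmetic behind that number can be sketched without Keras. The function below follows my reading of the legacy masking logic (multiply the per-timestep loss by the mask, divide by the mean of the mask, then average); treat that as an assumption rather than a copy of the library code. The name `masked_mse` is hypothetical, and the predictions are the ones printed above.

```python
def masked_mse(y_true, y_pred, mask):
    """Per-timestep MSE, masked and rescaled by the mask mean, then averaged."""
    # Per-timestep loss: mean squared error over the feature axis.
    per_step = [
        sum((t - p) ** 2 for t, p in zip(t_row, p_row)) / len(t_row)
        for t_row, p_row in zip(y_true, y_pred)
    ]
    weights = [1.0 if keep else 0.0 for keep in mask]
    masked = [loss * w for loss, w in zip(per_step, weights)]
    # Rescale so that masked steps do not dilute the average.
    mean_mask = sum(weights) / len(weights)
    rescaled = [loss / mean_mask for loss in masked]
    return sum(rescaled) / len(rescaled)

y_pred = [
    [0.009682, 0.02505393, -0.00632722],
    [0.01756451, 0.05928303, 0.0153951],
    [-0.00146054, -0.02064196, -0.04356086],
    [-0.00146054, -0.02064196, -0.04356086],
]
y_true = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
mask = [True, True, False, False]

loss = masked_mse(y_true, y_pred, mask)
print(loss)  # approximately 9.04107, matching the loss reported above
```

With two of four timesteps masked, the rescaling doubles the two surviving per-step losses before the mean over four steps, which is the same as averaging over the two unmasked steps only.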

answered Oct 03 '22 02:10

today

The process of informing the model that some part of the data is actually padding and should be ignored is called masking.

There are three ways to introduce input masks in Keras models:

1. Add a keras.layers.Masking layer.
2. Configure a keras.layers.Embedding layer with mask_zero=True.
3. Pass a mask argument manually when calling layers that support this argument (e.g. RNN layers).

The code below introduces input masks using keras.layers.Embedding:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

raw_inputs = [[83, 91, 1, 645, 1253, 927],[73, 8, 3215, 55, 927],[711, 632, 71]]
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(raw_inputs,
                                                              padding='post')

print(padded_inputs)

embedding = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)
masked_output = embedding(padded_inputs)

print(masked_output._keras_mask)

Output of the above code is shown below:

[[  83   91    1  645 1253  927]
 [  73    8 3215   55  927    0]
 [ 711  632   71    0    0    0]]

tf.Tensor(
[[ True  True  True  True  True  True]
 [ True  True  True  True  True False]
 [ True  True  True False False False]], shape=(3, 6), dtype=bool)
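
The same padded array and mask can be reproduced in plain Python, which may help show that nothing more than "pad with 0, then test for 0" is involved. The helper `pad_post` is a hypothetical stand-in for pad_sequences with padding='post', not the library function itself.

```python
def pad_post(sequences, value=0):
    """Right-pad every sequence with `value` up to the longest length."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [value] * (max_len - len(seq)) for seq in sequences]

raw_inputs = [[83, 91, 1, 645, 1253, 927], [73, 8, 3215, 55, 927], [711, 632, 71]]

padded = pad_post(raw_inputs)
# The mask is derived only from the token ids: True wherever the id is not 0.
mask = [[token != 0 for token in row] for row in padded]
# Summing a boolean row counts the real (unmasked) tokens in that sequence.
lengths = [sum(row) for row in mask]

print(padded)
print(lengths)  # [6, 5, 3]
```

Because the mask depends only on the id being 0, index 0 has to be reserved for padding and cannot be used for a real vocabulary entry; the Keras documentation notes that input_dim should therefore be the vocabulary size plus one.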

For more information, refer to the TensorFlow guide on masking and padding with Keras.

answered Oct 03 '22 00:10

Tensorflow Support
