After testing the scripts [babi_rnn.py] and [babi_memnn.py], the question of [how to determine which Merge mode (add / average / multiply / dot / concat) to use?] came up many times in my mind.
For example, in LSTM modeling, it seems easy to understand using [concat] to merge the time-sequence layer outputs of, say, two branches.
However, it is not as easy for me to understand why [add] is used to merge the two branches in [babi_rnn.py]. In [babi_memnn.py], the [add], [dot] and [concat] merge modes are all used.
So, are there any suggestions for choosing which merge function to use in different usage scenarios?
These Merge functions fall into 3 categories.
add, avg are linear combinations. They are used for simply combining several distinct components, because gradients flow nicely through addition and subtraction. A common use case is adding (+) several criteria together to obtain a loss function for a neural network that trains on multiple tasks jointly.
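As a rough illustration of the linear merges above, here is a minimal sketch in plain Python (not Keras, so it runs without dependencies). The helper names merge_add, merge_avg and joint_loss are made up for this example and are not part of any library.

```python
def merge_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def merge_avg(a, b):
    """Element-wise average of two equal-length vectors."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def joint_loss(losses, weights=None):
    """Weighted sum of per-task losses (all weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(losses)
    return sum(w * l for w, l in zip(weights, losses))


# Two hypothetical branch outputs of the same shape.
branch_a = [1.0, 2.0, 3.0]
branch_b = [3.0, 0.0, -1.0]

added = merge_add(branch_a, branch_b)     # [4.0, 2.0, 2.0]
averaged = merge_avg(branch_a, branch_b)  # [2.0, 1.0, 1.0]

# Two hypothetical task losses combined into one training objective.
total = joint_loss([0.5, 1.5])            # 2.0
```

Both merges require the branches to have the same shape, and the output keeps that shape.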
Another example is L2 regularization, where a penalty on the squared weights is added to the task loss.
L2 regularization discourages large weights: the bigger the weights, the higher the loss.
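The L2 penalty described above can be sketched as follows. This is only an illustration in plain Python: the function names and the default coefficient lam=0.01 are assumptions for the example, not values taken from the scripts in the question.

```python
def l2_penalty(weights, lam):
    """lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(task_loss, weights, lam=0.01):
    """Task loss plus the L2 penalty: the two terms are merged by addition."""
    return task_loss + l2_penalty(weights, lam)


# Same task loss, different weight magnitudes.
small = regularized_loss(1.0, [0.1, -0.2, 0.1])
large = regularized_loss(1.0, [1.0, -2.0, 1.0])
# large > small: bigger weights give a higher total loss.
```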
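multiply is closely related to dot: a dot product is an element-wise multiplication followed by a sum along an axis. In Keras, you can specify the axes along which dot is taken. The dot product is used for determining how similar two vectors are to each other. Note: the dot product is in fact a shrink operation, since it reduces two vectors to a single number. By the Cauchy-Schwarz inequality, its magnitude is at most the product of the two input norms, so when one input has unit length it is smaller than or equal to the norm of the other. Geometrically, it is the length of the projection of one vector onto the other, scaled by the length of the other.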
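To make the similarity and projection reading of the dot product concrete, here is a small plain-Python sketch. The helper names are chosen for this example only.

```python
import math


def dot(a, b):
    """Element-wise multiply, then sum: two vectors become one number."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    """Dot product of the unit-normalized vectors, always in [-1, 1]."""
    return dot(a, b) / (norm(a) * norm(b))


def scalar_projection(a, b):
    """Signed length of the projection of a onto the direction of b."""
    return dot(a, b) / norm(b)


a = [3.0, 4.0]
b = [1.0, 0.0]

similarity = cosine_similarity(a, b)
projection = scalar_projection(a, b)

# Cauchy-Schwarz: |a . b| never exceeds the product of the norms.
bounded = abs(dot(a, b)) <= norm(a) * norm(b)
```

With unit-length inputs the dot product equals the cosine similarity, which is the sense in which it measures how closely two vectors point in the same direction.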
concat does not discard any input. The concatenated vector can then be fed into a hidden layer, which learns a weight for each element. Concatenation by itself does not model any interaction between the inputs; that is left to the layers that follow. One common practice is concatenating the hidden states and outputs of stacked RNNs and feeding the result into a Dense layer, so that several RNNs can handle different tasks and be combined much like inputs to a feedforward network.
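The concatenate-then-Dense pattern can be sketched in plain Python as below. The vectors, weights and bias are made-up numbers, and dense here is a hand-written stand-in for a fully connected layer without activation, not the Keras Dense layer itself.

```python
def concat(a, b):
    """Join two vectors end to end; every input element is preserved."""
    return list(a) + list(b)


def dense(x, weights, bias):
    """Linear layer: one weight row and one bias value per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# Hypothetical outputs of two branches; their lengths may differ.
h1 = [1.0, 2.0]
h2 = [3.0]

merged = concat(h1, h2)  # [1.0, 2.0, 3.0]

# The layer after the merge decides how the two branches interact.
W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, -1.0]]
bias = [0.0, 0.5]
out = dense(merged, W, bias)  # [4.0, -0.5]
```

Unlike add or dot, concat does not require the branches to have the same length; the output length is the sum of the input lengths.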
To sum up, each Merge operation has a different use case. The Luong attention paper, for example, proposes three scoring functions (dot, general, and concat). Depending on your model, you can pick the one that works best for you.
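For reference, the three Luong-style scoring functions can be sketched roughly as follows, as I understand them from the paper. This is plain Python with made-up weights; W_a and v_a stand for learned parameters, and the toy values below are assumptions chosen only so the example runs.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [dot(row, x) for row in W]


def score_dot(h_t, h_s):
    """dot: h_t . h_s"""
    return dot(h_t, h_s)


def score_general(h_t, h_s, W_a):
    """general: h_t . (W_a h_s)"""
    return dot(h_t, matvec(W_a, h_s))


def score_concat(h_t, h_s, W_a, v_a):
    """concat: v_a . tanh(W_a [h_t; h_s])"""
    hidden = [math.tanh(z) for z in matvec(W_a, h_t + h_s)]
    return dot(v_a, hidden)


h_t = [1.0, 0.0]   # hypothetical decoder (target) state
h_s = [0.5, 2.0]   # hypothetical encoder (source) state

identity = [[1.0, 0.0], [0.0, 1.0]]
s_dot = score_dot(h_t, h_s)
s_general = score_general(h_t, h_s, identity)  # equals s_dot when W_a is the identity

W_cat = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
s_concat = score_concat(h_t, h_s, W_cat, [1.0, 1.0])
```

The dot score has no parameters, general adds a learned bilinear map, and concat passes the joined states through a small learned layer, which mirrors the dot and concat merge modes discussed above.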