Train only some word embeddings (Keras)

In my model I use GloVe pre-trained embeddings. I wish to keep them non-trainable in order to reduce the number of model parameters and avoid overfitting. However, I have a special symbol whose embedding I do want to train.

With the provided Embedding layer, the 'trainable' parameter can only set the trainability of all embeddings at once:

from keras.layers import Embedding

embedding_layer = Embedding(voc_size,
                            emb_dim,
                            weights=[embedding_matrix],
                            input_length=MAX_LEN,
                            trainable=False)
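
For context, embedding_matrix is the usual GloVe lookup table. A minimal sketch of how it might be built, assuming a glove.6B.100d.txt file and a word_index dict from a fitted Keras Tokenizer (both are assumptions, not shown in the question):

import numpy as np

emb_dim = 100

# Parse the GloVe text file into a word -> vector dict (file path is an assumption)
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

# word_index is assumed to come from a fitted Keras Tokenizer
voc_size = len(word_index) + 1
embedding_matrix = np.zeros((voc_size, emb_dim))
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector  # words without a GloVe vector keep a zero row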

Is there a Keras-level solution to training only a subset of embeddings?

Please note:

  1. There is not enough data to generate new embeddings for all words.
  2. The answers I have found so far only cover native TensorFlow, not Keras.
asked Feb 27 '18 by miclat


2 Answers

I found a nice workaround, inspired by Keith's idea of using two embedding layers.

Main idea:

Assign the special tokens (and the OOV token) the highest IDs. Generate a 'sentence' containing only the special tokens, zero-padded elsewhere. Then apply the non-trainable embeddings to the 'normal' sentence and the trainable embeddings to the special tokens, and finally add the two outputs.

Works fine for me.

    import numpy as np
    from keras.layers import Embedding, Input, Lambda, Activation, Add

    # Normal embs - '+2' for empty token and OOV token
    embedding_matrix = np.zeros((vocab_len + 2, emb_dim))
    # Special embs
    special_embedding_matrix = np.zeros((special_tokens_len + 2, emb_dim))

    # Here we may apply pre-trained embeddings to embedding_matrix

    embedding_layer = Embedding(vocab_len + 2,
                                emb_dim,
                                mask_zero=True,
                                weights=[embedding_matrix],
                                input_length=MAX_SENT_LEN,
                                trainable=False)

    special_embedding_layer = Embedding(special_tokens_len + 2,
                                        emb_dim,
                                        mask_zero=True,
                                        weights=[special_embedding_matrix],
                                        input_length=MAX_SENT_LEN,
                                        trainable=True)

    valid_words = vocab_len - special_tokens_len

    sentence_input = Input(shape=(MAX_SENT_LEN,), dtype='int32')

    # Shift IDs so only special tokens stay positive, e.g.: [0,0,1,0,3,0,0];
    # the relu clips normal-word IDs (now negative) to 0
    special_tokens_input = Lambda(lambda x: x - valid_words)(sentence_input)
    special_tokens_input = Activation('relu')(special_tokens_input)

    # Apply both 'normal' embeddings and special-token embeddings
    embedded_sequences = embedding_layer(sentence_input)
    embedded_special = special_embedding_layer(special_tokens_input)

    # Sum the two embedding outputs element-wise
    embedded_sequences = Add()([embedded_sequences, embedded_special])
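
A quick way to check that only the special embeddings stay trainable is to wrap the graph in a throwaway Model and inspect the parameter counts; a minimal sketch, assuming the variables defined above:

    from keras.models import Model

    check_model = Model(inputs=sentence_input, outputs=embedded_sequences)
    check_model.summary()
    # Trainable params should be (special_tokens_len + 2) * emb_dim;
    # every weight loaded into embedding_matrix stays frozen.
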
answered Oct 12 '22 by miclat

I haven't found a nice solution like a mask for the Embedding layer, but here's what I've been meaning to try:

  • Two embedding layers - one trainable and one not
  • The non-trainable one has all the GloVe embeddings for in-vocab words and zero vectors for the others
  • The trainable one only maps the OOV words and special symbols
  • The outputs of these two layers are added (I was thinking of this like ResNet)
  • The Conv/LSTM/etc. below the embedding is unchanged

That would get you a solution with only a small number of free parameters allocated to those embeddings; a sketch of the idea follows.
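
A minimal sketch of that idea (not Keith's exact code): it assumes the in-vocab words take the low IDs and the OOV words / special symbols take the IDs from trainable_start upwards; the sizes and the gating Lambda are illustrative assumptions.

    import numpy as np
    from keras import backend as K
    from keras.layers import Input, Embedding, Lambda, Add
    from keras.models import Model

    vocab_size = 20000        # total number of IDs (assumed)
    trainable_start = 19900   # IDs >= trainable_start are OOV words / special symbols (assumed)
    num_special = vocab_size - trainable_start
    emb_dim = 100
    MAX_LEN = 50

    # Non-trainable table: GloVe rows for in-vocab words, zero rows for the rest
    glove_matrix = np.zeros((vocab_size, emb_dim))  # fill in GloVe vectors here
    fixed_emb = Embedding(vocab_size, emb_dim, weights=[glove_matrix],
                          input_length=MAX_LEN, trainable=False)

    # Trainable table only for the OOV/special IDs ('+1' for the unused 0 slot)
    train_emb = Embedding(num_special + 1, emb_dim,
                          input_length=MAX_LEN, trainable=True)

    ids = Input(shape=(MAX_LEN,), dtype='int32')

    # Remap IDs so special tokens become 1..num_special and normal words become 0
    special_ids = Lambda(lambda x: K.relu(x - (trainable_start - 1)))(ids)
    # 0/1 indicator so normal-word positions get nothing from the trainable table
    is_special = Lambda(lambda x: K.expand_dims(
        K.cast(K.greater_equal(x, trainable_start), 'float32'), -1))(ids)

    gated_special = Lambda(lambda t: t[0] * t[1])([train_emb(special_ids), is_special])
    embedded = Add()([fixed_emb(ids), gated_special])

    model = Model(ids, embedded)  # put the Conv/LSTM/etc. on top of `embedded` as usual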

answered Oct 12 '22 by Keith