I'm using Keras to build an LSTM and tuning it by doing gradient descent with an external cost function. So the weights are updated with:
weights := weights - alpha * gradient(cost)
I know that I can get the weights with model.get_weights(), but how can I perform the gradient descent step and update all the weights accordingly? I tried to use an initializer, but I still didn't figure it out. I only found some related code for TensorFlow, but I don't know how to convert it to Keras.
Any help, hint or advice will be appreciated!
Use the get_weights() function to get the weights and biases of the layers before training the model; these are the weights and biases the layers were initialized with. Model weights are all the parameters (trainable and non-trainable) of the model, which in turn are all the parameters used in its layers. For a convolution layer that means the filter weights as well as the biases.
The set_weights() method of Keras accepts a list of NumPy arrays whose shapes match the output of get_weights() on the same layer (or model). If you call get_weights() again after you set the weights, it returns the list of NumPy arrays you passed to set_weights().
Initializers define the way the initial random weights of Keras layers are set. The keyword argument used for passing an initializer to a layer depends on the layer.
Every layer has parameter weights that you can set with layer.set_weights(weights), where weights is a list of NumPy arrays with the same shapes as the output of layer.get_weights(). You can set a convolutional layer's filters the same way through set_weights().
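For illustration, here is a minimal per-layer sketch; the layer shapes below are assumptions for the example, not from the question:
import numpy as np
from keras.layers import Dense
from keras.models import Sequential

# A single Dense layer mapping 10 inputs to 5 units (hypothetical shapes)
model = Sequential([Dense(5, activation='relu', input_shape=(10,))])

kernel, bias = model.layers[0].get_weights()   # shapes: (10, 5) and (5,)
model.layers[0].set_weights([np.zeros_like(kernel), np.ones_like(bias)])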
model.set_weights()
is what you are looking for:
import numpy as np
from keras.layers import Dense
from keras.models import Sequential

model = Sequential()
model.add(Dense(10, activation='relu', input_shape=(10,)))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')

# Note: np.array(model.get_weights()) would try to build a ragged array
# (the weight tensors have different shapes), so keep it as a list.
a = model.get_weights()                    # save weights: a list of np.arrays, one per weight tensor
model.set_weights([w + 1 for w in a])      # add 1 to all weights in the neural network
b = model.get_weights()                    # save the weights a second time
print([w2 - w1 for w1, w2 in zip(a, b)])   # print changes in weights
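Putting this together with the question: once you have the gradient of your external cost with respect to each weight tensor, the manual descent step is a minimal sketch like the following, assuming external_gradients is a hypothetical list of NumPy arrays with the same shapes as model.get_weights():
alpha = 0.01  # hypothetical learning rate
# `external_gradients`: assumed to be a list of NumPy arrays, one per
# weight tensor, with the same shapes as model.get_weights()
new_weights = [w - alpha * g
               for w, g in zip(model.get_weights(), external_gradients)]
model.set_weights(new_weights)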
Have a look at the respective pages of the Keras documentation for get_weights() and set_weights().
You need some TensorFlow to compute the symbolic gradient. Here is a toy example using Keras and then digging in a little bit to manually perform the step-wise descent in TensorFlow (this is TensorFlow 1.x style, using sessions).
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras import backend as k
from keras import losses
import numpy as np
import tensorflow as tf
from sklearn.metrics import mean_squared_error
from math import sqrt
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
inputs = np.random.random((1, 8))
outputs = model.predict(inputs)
targets = np.random.random((1, 8))
rmse = sqrt(mean_squared_error(targets, outputs))
print("===BEFORE WALKING DOWN GRADIENT===")
print("outputs:\n", outputs)
print("targets:\n", targets)
print("RMSE:", rmse)
def descend(steps=40, learning_rate=100.0, learning_decay=0.95):
    for s in range(steps):
        # If your target changes, you need to update the loss
        loss = losses.mean_squared_error(targets, model.output)

        # ===== Symbolic gradient =====
        # TensorFlow Tensor object
        gradients = k.gradients(loss, model.trainable_weights)

        # ===== Numerical gradient =====
        # NumPy ndarray object
        evaluated_gradients = sess.run(gradients, feed_dict={model.input: inputs})

        # For every trainable weight tensor in the network
        for i in range(len(model.trainable_weights)):
            layer = model.trainable_weights[i]  # select the weight tensor
            # and modify it explicitly in TensorFlow
            sess.run(tf.assign_sub(layer, learning_rate * evaluated_gradients[i]))

        # Decrease the learning rate
        learning_rate *= learning_decay

        outputs = model.predict(inputs)
        rmse = sqrt(mean_squared_error(targets, outputs))
        print("RMSE:", rmse)
if __name__ == "__main__":
    # Begin TensorFlow session (TF 1.x)
    sess = tf.InteractiveSession()
    # initialize_all_variables() is deprecated; use global_variables_initializer()
    sess.run(tf.global_variables_initializer())
    descend(steps=5)

    final_outputs = model.predict(inputs)
    final_rmse = sqrt(mean_squared_error(targets, final_outputs))
    print("===AFTER STEPPING DOWN GRADIENT===")
    print("outputs:\n", final_outputs)
    print("targets:\n", targets)
RESULTS:
===BEFORE WALKING DOWN GRADIENT===
outputs:
[[0.49995303 0.5000101 0.50001436 0.50001544 0.49998832 0.49991882
0.49994195 0.4999649 ]]
targets:
[[0.60111501 0.70807258 0.02058449 0.96990985 0.83244264 0.21233911
0.18182497 0.18340451]]
RMSE: 0.33518919408969455
RMSE: 0.05748867468895
RMSE: 0.03369414290610595
RMSE: 0.021872132066183464
RMSE: 0.015070048653579693
RMSE: 0.01164369828903875
===AFTER STEPPING DOWN GRADIENT===
outputs:
[[0.601743 0.707857 0.04268148 0.9536494 0.8448022 0.20864952
0.17241994 0.17464897]]
targets:
[[0.60111501 0.70807258 0.02058449 0.96990985 0.83244264 0.21233911
0.18182497 0.18340451]]