I've already asked this question on the Keras issue tracker, but since I've gotten no answers there I decided to try my luck here.
I'm running the MNIST MLP example with a custom optimizer which, for the time being, is just a carbon copy of SGD from optimizers.py, i.e.
from __future__ import print_function
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Optimizer
from keras import backend as K
from keras.legacy import interfaces  # the copied class lives outside the keras package, so import via keras.legacy
import numpy as np
class testsgd(Optimizer):
..... [everything same as sgd] .....
myopt = testsgd()
....[define model]....
model.compile(loss='categorical_crossentropy',
              optimizer=myopt,
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
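For context, the copied class looks roughly like the trimmed sketch below. It is based on the Keras 2.x SGD, so attribute names, decorators, and the exact get_updates signature may differ slightly between versions; the velocity update inside the loop is roughly where line 168 of optimizers.py falls.
class testsgd(Optimizer):
    def __init__(self, lr=0.01, momentum=0.9, **kwargs):
        super(testsgd, self).__init__(**kwargs)
        with K.name_scope(self.__class__.__name__):
            self.lr = K.variable(lr, name='lr')
            self.momentum = K.variable(momentum, name='momentum')

    def get_updates(self, loss, params):
        grads = self.get_gradients(loss, params)
        self.updates = []
        moments = [K.zeros(K.int_shape(p)) for p in params]
        for p, g, m in zip(params, grads, moments):
            v = self.momentum * m - self.lr * g  # velocity; roughly line 168 in optimizers.py
            # ...the gradient/velocity dot product discussed below would go here...
            self.updates.append(K.update(m, v))
            self.updates.append(K.update(p, p + v))
        return self.updates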
Now, in my custom optimizer I need to compute the dot product of the gradient with the velocity, i.e. after line 168 in optimizers.py, I need something similar to
angle = K.dot(g,v)
or
angle = K.dot(K.transpose(g),v)
or
angle = K.dot(g, K.transpose(v))
Unfortunately, none of the above works; I just get the error
ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [512], [512].
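For what it's worth, the rank error comes from K.dot mapping to MatMul, which expects rank-2 inputs, while g and v here are rank-1 (shape (512,), judging by the error message). A minimal sketch of two workarounds, assuming those shapes:
angle = K.dot(K.reshape(g, (1, -1)), K.reshape(v, (-1, 1)))  # row vector times column vector -> shape (1, 1)
angle = K.sum(g * v)  # element-wise product followed by a sum -> rank-0 (scalar) tensor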
I understand that g and v are tensors which perhaps might need to be flattened to NumPy arrays so as to use NumPy for the dot product.
The closest that I came was by inspecting line 75 in optimizers.py, which calculates the norm of the gradient, i.e.
norm = K.sqrt(sum([K.sum(K.square(g)) for g in grads]))
However, even then, the statement
print(norm)
still returns a tensor!
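That is expected with a symbolic backend: norm is just a node in the computation graph, and it only takes a concrete value when the graph is run on actual data. One way to see the value during training is K.print_tensor, sketched below; the tensor it returns has to replace norm in the subsequent computation, otherwise the print op is never executed.
norm = K.sqrt(sum([K.sum(K.square(g)) for g in grads]))
norm = K.print_tensor(norm, message='gradient norm = ')  # prints each time the graph evaluates this node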
Similarly, I have also tried
angle = K.sum(g * v, axis=-1, keepdims=True)
as suggested here, but the result is still a tensor which I cannot check for correctness:
Tensor("Sum_2:0", shape=(1,), dtype=float32)
When I try
print (K.get_value(angle))
I just get
InvalidArgumentError (see above for traceback): Shape [-1,784] has negative dimensions [[Node: dense_4_input = Placeholder[dtype=DT_FLOAT, shape=[?,784], _device="/job:localhost/replica:0/task:0/gpu:0"]]]
Many thanks in advance for any help.
Use K.get_value(x) to get the value of a tensor as a NumPy array (a scalar for a rank-0 tensor). See tf.keras.backend.get_value.
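A minimal sketch of how that behaves (assuming the TensorFlow backend): get_value evaluates the tensor in the current session, so it works for tensors built from variables, but not for ones that depend on an unfed placeholder such as dense_4_input in the traceback above.
import numpy as np
from keras import backend as K

w = K.variable(np.array([1., 2., 3.]))
v = K.variable(np.array([4., 5., 6.]))
angle = K.sum(w * v)        # symbolic scalar tensor
print(K.get_value(angle))   # evaluates the graph and prints 32.0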