I am trying to train a CNN model in TensorFlow 2.0. It's a multiclass classification task. I am simplifying the code to make it more readable:
# Loss function
loss = tf.keras.metrics.CategoricalCrossentropy()
# Optimizer
optimizer = tf.optimizers.Adam(learning_rate=0.0005)
# Training:
for epoch in range(1000):
    # fetch mini batch of data
    X_batch, y_batch = fetch_batch( [...] )
    with tf.GradientTape() as tape:
        current_loss = loss(y_batch, CNN(X_batch))  # take current loss
    # get the gradient of the loss function
    gradients = tape.gradient(current_loss, CNN.trainable_variables)
    # update weights
    optimizer.apply_gradients(zip(gradients, CNN.trainable_variables))
    [ ... ]
At this point, I get an error:
ValueError: No gradients provided for any variable ...
I know where the problem is: something goes wrong when I call tape.gradient(). If I inspect the gradients object, this is what I get:
print(gradients)
[None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None]
I don't understand why gradients comes back like this. I have literally copy-pasted this training code from other (non-CNN) models in TF 2.0, and it has always worked. All the other elements of my model seem to behave as they should.
--
PS: this question is different from this one, which is based on TF 1.x.
I think you want tf.keras.losses.CategoricalCrossentropy as your loss, not the metrics version. These are genuinely different classes, not aliases: the metric accumulates its result in internal state variables, so its output is not differentiable with respect to your model's weights, and tape.gradient returns None for every variable.
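Here is a minimal sketch of the corrected loop. A tiny Dense model and random data stand in for your CNN and fetch_batch, which aren't shown in the question; only the loss class changes.

```python
import tensorflow as tf

# Stand-ins for the question's CNN and fetch_batch (assumptions, not your code)
model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])
model.build((None, 4))

# losses, not metrics: returns a differentiable tensor
loss_fn = tf.keras.losses.CategoricalCrossentropy()
optimizer = tf.optimizers.Adam(learning_rate=0.0005)

X_batch = tf.random.normal((8, 4))
y_batch = tf.one_hot(tf.random.uniform((8,), maxval=3, dtype=tf.int32), 3)

with tf.GradientTape() as tape:
    current_loss = loss_fn(y_batch, model(X_batch))
gradients = tape.gradient(current_loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))

print(all(g is not None for g in gradients))  # → True
```

With the losses version, the gradient list contains real tensors instead of None, and apply_gradients updates the weights as expected.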