I have a deep neural network where the weights between layers are stored in a list.
layers[j].weights
I want to include the ridge penalty in my cost function. I then need to use something like
tf.nn.l2_loss(layers[j].weights**2 for j in range(self.n_layers))
i.e. the sum of the squares of all the weights.
In particular the weights are defined as:
>>> avs.layers
[<neural_network.Layer object at 0x10a4b2a90>, <neural_network.Layer object at 0x10ac85080>, <neural_network.Layer object at 0x10b0f3278>, <neural_network.Layer object at 0x10b0eacf8>, <neural_network.Layer object at 0x10b145588>, <neural_network.Layer object at 0x10b165048>, <neural_network.Layer object at 0x10b155ba8>]
>>>
>>> avs.layers[0].weights
<tensorflow.python.ops.variables.Variable object at 0x10b026748>
>>>
How can I do that in TensorFlow?
The standard way to sum a list of tensors is to use the tf.add_n() operation, which takes a list of tensors (each having the same size and shape) and produces a single tensor containing their sum.
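For example, a minimal sketch of tf.add_n() on same-shaped tensors (illustrative values only):

import tensorflow as tf

a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])
c = tf.constant([5.0, 6.0])

# All inputs share the same shape, so add_n sums them elementwise.
total = tf.add_n([a, b, c])  # [9.0, 12.0]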
For the particular problem that you have, I am assuming that each layers[j].weights could have a different size. Therefore you will need to reduce each element down to a scalar before summing, e.g. using the tf.nn.l2_loss() function itself:
# Collect each layer's weight variable, reduce each to a scalar with l2_loss, then sum the scalars.
weights = [layers[j].weights for j in range(self.n_layers)]
losses = [tf.nn.l2_loss(w) for w in weights]
total_loss = tf.add_n(losses)
(Note however that when the values to be added are large, you may find it more efficient to compute a sequence of tf.add() operations, since TensorFlow keeps the values of each of the add_n arguments in memory until all of them have been computed. A chain of add ops allows some of the computation to happen earlier.)
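If you want to try that, one illustrative sketch (reusing the weights list from the snippet above) is to fold the scalars together with a chain of tf.add() via functools.reduce:

import tensorflow as tf
from functools import reduce

# Each tf.add can run as soon as its two inputs are ready, so the intermediate
# l2_loss scalars do not all have to be kept in memory until the final sum.
losses = [tf.nn.l2_loss(w) for w in weights]
total_loss = reduce(tf.add, losses)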
The tf.nn.l2_loss() function returns a tensor with 0 dimensions.
But it's nice to not need to manually apply that to each weight tensor, so storing the weight tensors in a list is one way to solve the problem (as @mrry noted).
But rather than needing to write that out every time, you could use the following function:
def l2_loss_sum(list_o_tensors):
    # Reduce each tensor to a scalar with l2_loss (sum of squares / 2), then sum the scalars.
    return tf.add_n([tf.nn.l2_loss(t) for t in list_o_tensors])
In your case this would look like:
total_loss = l2_loss_sum([layers[j].weights for j in range(self.n_layers)])
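And to actually include the ridge penalty in your cost function (as asked), you would scale this term by a regularization coefficient and add it to your data loss; data_loss and beta below are placeholder names for whatever your network already defines, not part of any API:

beta = 0.01  # hypothetical regularization strength
l2_penalty = l2_loss_sum([layers[j].weights for j in range(self.n_layers)])
cost = data_loss + beta * l2_penalty  # data_loss stands in for e.g. your cross-entropy term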
Also, tf.nn.l2_loss() implicitly squares the values as well as multiplying all the squared values by 1/2, so were you to use something like tf.nn.l2_loss(layers[j].weights**2 for j in range(self.n_layers)) you would actually be raising the weights to the 4th power. As a result the derivative of this loss term would be strange: it wouldn't cancel the 1/2 to 1 (but would implicitly double your β), and the weights would be cubed.
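A quick sanity check of that point (illustrative values; in graph mode you would evaluate these tensors in a session to see the numbers):

import tensorflow as tf

w = tf.constant([1.0, 2.0, 3.0])
tf.nn.l2_loss(w)       # sum(w**2) / 2      = (1 + 4 + 9) / 2   = 7.0
tf.nn.l2_loss(w ** 2)  # sum((w**2)**2) / 2 = (1 + 16 + 81) / 2 = 49.0, i.e. weights to the 4th power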