For example, y = Ax, where A is a diagonal matrix with its trainable weights (w1, w2, w3) on the diagonal.
A = [w1 ... ...
... w2 ...
... ... w3]
How can I create such a trainable A in TensorFlow or Keras?
If I try A = tf.Variable(np.eye(3)), the total number of trainable weights would be 3*3=9, not 3, but I only want to update the 3 weights (w1, w2, w3).
A trick may be to use A = tf.Variable([1, 1, 1]) * np.eye(3), so that the 3 trainable weights are mapped onto the diagonal of A.
My questions are:
Would that trick work for my purpose? Would the gradient be correctly calculated?
What if the structure of A is more complicated? E.g. if I want to create:
where w1, w2, ..., w6 are the weights to be updated.
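The first question can be checked numerically. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution; the input vector x, the float weights and the sum-of-outputs loss are illustrative choices, not part of the question. It builds A with the multiplication trick and asks tf.GradientTape for the gradient with respect to the three weights only.

```python
import numpy as np
import tensorflow as tf

# Three trainable weights. Float values are used because gradients
# are not defined for integer variables.
w = tf.Variable([1.0, 2.0, 3.0])
x = tf.constant([[10.0], [20.0], [30.0]])  # column vector, shape (3, 1)

with tf.GradientTape() as tape:
    # Broadcasting w against the identity leaves w on the diagonal
    # and zeros everywhere else.
    A = w * tf.eye(3)
    y = tf.matmul(A, x)          # y[i] = w[i] * x[i]
    loss = tf.reduce_sum(y)      # illustrative scalar loss

# Gradient with respect to the 3 weights, not a 3x3 matrix.
grad = tape.gradient(loss, w)
print(A.numpy())
print(grad.numpy())
```

Because the loss here is the sum of w[i] * x[i], the gradient with respect to each weight is the matching entry of x, and the off-diagonal zeros of A never become trainable.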
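You have two different tools to address this problem:
1. Create only the variables you need and arrange them into the desired shape.
2. Create more variables than you need, then discard or overwrite the extra entries.
The two approaches are not mutually exclusive, and you could use a mix of successive steps of type #1 and #2.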
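For instance, for your first example (diagonal matrix), we can use approach #1:
import tensorflow as tf
n = 3  # size of the matrix in the question
w = tf.Variable(tf.zeros(n))  # the n trainable weights
A = tf.linalg.diag(w)  # diagonal matrix with the elements of w (tf.diag in TF 1.x)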
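For the Keras side of the question, approach #1 can be wrapped in a custom layer. This is only a sketch under assumptions: DiagonalLayer is a hypothetical name, the "ones" initializer is arbitrary, and inputs are treated as a batch of row vectors, so the layer computes x A (which equals (A x^T)^T here because a diagonal A is symmetric).

```python
import numpy as np
import tensorflow as tf


class DiagonalLayer(tf.keras.layers.Layer):
    """Hypothetical layer computing y = x A with A = diag(w)."""

    def build(self, input_shape):
        n = int(input_shape[-1])
        # Only n values are stored and trained, not n * n.
        self.w = self.add_weight(
            shape=(n,), initializer="ones", trainable=True, name="w"
        )

    def call(self, inputs):
        A = tf.linalg.diag(self.w)   # rebuilt from w on every call
        return tf.matmul(inputs, A)


layer = DiagonalLayer()
x = tf.constant([[1.0, 2.0, 3.0]])   # batch containing one row vector
y = layer(x)                         # the first call builds the layer
num_weights = layer.count_params()
print(num_weights)
```

Since the matrix is rebuilt from w inside call, Keras only ever sees the n-element weight, and the zeros off the diagonal are never updated.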
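For your second, more complex example, we could use approach #2:
W = tf.Variable(tf.zeros((n, n)))  # full n x n variable, kept under its own name so it is not lost
A = tf.linalg.band_part(W, 1, 1)  # keep only the central band of width 3 (tf.matrix_band_part in TF 1.x)
A = tf.linalg.set_diag(A, tf.ones(n))  # set diagonal to 1 (tf.matrix_set_diag in TF 1.x)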
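To see which entries of the oversized variable actually get trained with approach #2, one can inspect the gradient. The sketch below assumes TensorFlow 2.x; n = 4, the name full and the sum-of-entries loss are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

n = 4
full = tf.Variable(tf.ones((n, n)))  # n * n stored values

with tf.GradientTape() as tape:
    band = tf.linalg.band_part(full, 1, 1)    # zero everything outside the tridiagonal band
    A = tf.linalg.set_diag(band, tf.ones(n))  # overwrite the main diagonal with constant ones
    loss = tf.reduce_sum(A)                   # illustrative scalar loss

grad = tape.gradient(loss, full)

# Entries just above and just below the main diagonal.
expected_mask = np.eye(n, k=1) + np.eye(n, k=-1)
print(grad.numpy())
```

Only the entries directly above and below the main diagonal receive a nonzero gradient. The variable still stores n * n values, but the masked and overwritten ones never influence the loss, so with n = 4 there are 2 * (n - 1) = 6 weights that are effectively trained.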