I have a tensor of shape [x, y] and I want to subtract the mean and divide by the standard deviation row-wise (i.e. I want to do it for each row). What is the most efficient way to do this in TensorFlow?
Of course I can loop through rows as follows:
new_tensor = [i - tf.reduce_mean(i) for i in old_tensor]
...to subtract the mean and then do something similar to find the standard deviation and divide by it, but is this the best way to do it in TensorFlow?
The TensorFlow tf.sub() and tf.div() operators (renamed tf.subtract() and tf.divide() in TensorFlow 1.0) support broadcasting, so you don't need to iterate through every row. Let's consider the mean, and leave standard deviation as an exercise:
old_tensor = ...                                          # shape = (x, y)
# Note: keep_dims is spelled keepdims in newer TensorFlow releases.
mean = tf.reduce_mean(old_tensor, 1, keep_dims=True)      # shape = (x, 1)
stdev = ...                                               # shape = (x,)
stdev = tf.expand_dims(stdev, 1)                          # shape = (x, 1)
new_tensor = old_tensor - mean                            # shape = (x, y)
new_tensor = new_tensor / stdev                           # shape = (x, y)
The subtraction and division operators implicitly broadcast a tensor of shape (x, 1) along the column dimension to match the shape of the other argument, (x, y). For more details about how broadcasting works, see the NumPy documentation on the topic (TensorFlow implements NumPy broadcasting semantics).
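Since the broadcasting rules are the same as NumPy's, the shapes involved can be checked with a small NumPy sketch. This is only an illustration, not TensorFlow code: standardize_rows is a hypothetical helper name, the input values are made up, and the standard deviation is assumed to be the population one (ddof=0).

```python
import numpy as np

def standardize_rows(a):
    """Subtract each row's mean and divide by each row's standard deviation."""
    a = np.asarray(a, dtype=np.float64)
    mean = a.mean(axis=1, keepdims=True)   # shape (x, 1)
    stdev = a.std(axis=1, keepdims=True)   # shape (x, 1), population std (ddof=0)
    return (a - mean) / stdev              # (x, 1) broadcasts against (x, y)

# Made-up example input with x = 2 rows and y = 3 columns.
old = np.array([[1.0, 2.0, 3.0],
                [10.0, 20.0, 60.0]])
new = standardize_rows(old)
print(new.shape)         # (2, 3)
print(new.mean(axis=1))  # approximately 0 for every row
print(new.std(axis=1))   # approximately 1 for every row
```

Keeping the reduced axis (keepdims=True) is what gives the (x, 1) shape that broadcasts across columns. A row whose values are all equal has a standard deviation of zero, so you may want to add a small epsilon to stdev before dividing.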