Reading the TensorFlow MNIST tutorial, I stumbled over the line

x_image = tf.reshape(x, [-1,28,28,1])

The 28, 28 comes from width and height, and the 1 comes from the number of channels. But why -1?

I guess this is related to mini-batch training, but I wondered why -1 and not 1 (which seems to give the same result in NumPy).

(Probably related: why does NumPy's reshape give the same results for -1, -2, and 1?)
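To make the question concrete, here is a small NumPy sketch (the array contents and batch size of 50 are my own example): -1 and 1 coincide only when there is exactly one image, which is probably why they looked interchangeable.

```python
import numpy as np

# One flattened 28x28 image: -1 and 1 happen to give the same shape.
one = np.zeros((1, 784))
print(one.reshape(-1, 28, 28, 1).shape)  # (1, 28, 28, 1)
print(one.reshape(1, 28, 28, 1).shape)   # (1, 28, 28, 1)

# A mini-batch of 50 images: only -1 still works, because a fixed 1
# would require exactly 784 elements while the batch has 50 * 784.
batch = np.zeros((50, 784))
print(batch.reshape(-1, 28, 28, 1).shape)  # (50, 28, 28, 1)
try:
    batch.reshape(1, 28, 28, 1)
except ValueError as e:
    print("reshape(1, ...) fails:", e)
```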
-1 means that the length in that dimension is inferred. The inference is based on the constraint that the total number of elements in an ndarray or Tensor must remain the same after reshaping. In the tutorial, each image is a row vector of 784 elements, and there are many such rows (say n of them, so there are 784n elements in total). So, when you write

x_image = tf.reshape(x, [-1, 28, 28, 1])

TensorFlow can infer that -1 stands for n.
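The inference rule above can be sketched in NumPy, whose reshape follows the same convention (n = 32 here is an arbitrary example batch size, not anything from the tutorial):

```python
import numpy as np

n = 32                   # arbitrary example batch size
x = np.zeros((n, 784))   # n flattened 28x28 images, 784n elements total

# The -1 dimension is inferred as total_elements / (28 * 28 * 1).
x_image = x.reshape(-1, 28, 28, 1)
print(x_image.shape)     # (32, 28, 28, 1)

# The inferred length equals x.size // (28 * 28 * 1), i.e. n.
assert x_image.shape[0] == x.size // (28 * 28 * 1)
```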
In the MNIST tutorial that you are reading, the desired shape for the input layer is [batch_size, 28, 28, 1]:

x_image = tf.reshape(x, [-1,28,28,1])

Here the -1 tells TensorFlow to compute that dimension dynamically from the number of input values in x, holding the size of all other dimensions constant. This lets the same reshape handle any batch size, so batch_size can be treated as a tunable hyperparameter rather than being hard-coded into the graph.
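A quick NumPy sketch of that point (tf.reshape follows the same -1 convention; the batch sizes 1, 50, and 128 are arbitrary examples): the identical reshape call adapts to whatever batch size arrives.

```python
import numpy as np

# The same [-1, 28, 28, 1] reshape works for any batch size,
# so the batch size never has to be hard-coded.
for batch_size in (1, 50, 128):
    x = np.zeros((batch_size, 784))
    x_image = x.reshape(-1, 28, 28, 1)
    assert x_image.shape == (batch_size, 28, 28, 1)
print("same reshape handled every batch size")
```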