I'm learning convolutional networks and Python all at once.
I have a problem with the following code:
import tensorflow as tf
print(tf.__version__)
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
training_images = training_images.reshape(60000, 28, 28, 1)
I don't understand what reshape(60000, 28, 28, 1) means.
What are the 60000, 28, 28, and 1?
I understand I will get 60000 arrays of 28 columns by 28 rows... but what is the 1 for?
Think about it: how would you store 60k images of 28 by 28 pixels if they were RGB?
For each pixel you would need 3 scalars (one per channel), so the array would be 60000x28x28x3.
And how many channels do you need when the image is greyscale? Just one, so it is 60000x28x28x1.
Of course, with a single channel this could be simplified further to 60000x28x28, but I would say the former approach is better: it explicitly states how many channels the image has, and some ML frameworks require that information to operate correctly.
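To make the shapes concrete, here is a minimal sketch (not part of the original question) that loads Fashion-MNIST and prints the array shape before and after adding the channel axis; np.expand_dims is shown as an equivalent alternative to hard-coding the image count in reshape:

import numpy as np
import tensorflow as tf

mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

print(training_images.shape)   # (60000, 28, 28) -- no channel axis yet

# Greyscale: one scalar per pixel, so a single trailing channel
grey = training_images.reshape(60000, 28, 28, 1)
print(grey.shape)              # (60000, 28, 28, 1)

# Same result without hard-coding the number of images
grey_alt = np.expand_dims(training_images, axis=-1)
print(grey_alt.shape)          # (60000, 28, 28, 1)

# If these images were RGB, each pixel would need 3 scalars,
# and the array would instead have shape (60000, 28, 28, 3).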
The reshape here puts every image of the dataset into one tensor. The Fashion-MNIST dataset contains 60,000 training images, each 28x28 pixels, and if I recall correctly the 1 is a singleton channel dimension (the images are greyscale, so there is only one channel) added to match the input shape your neural network expects.
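As a sketch of why the network needs that extra axis (the model below is assumed, not taken from the original post): Conv2D layers consume batches shaped (batch, height, width, channels), so declaring input_shape=(28, 28, 1) only works if the images have been reshaped to include the trailing channel dimension.

import tensorflow as tf

model = tf.keras.Sequential([
    # Expects each image to be 28x28 with 1 channel
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

# Training then takes the reshaped (60000, 28, 28, 1) array directly, e.g.:
# model.fit(training_images / 255.0, training_labels, epochs=5)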