I've looked at a few tutorials to crack into Keras for deep learning using Convolutional Neural Networks. In these tutorials (and in Keras' official documentation), the MNIST dataset is loaded like so:
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
However, no explanation is offered as to why we get two tuples of data. My question is: what are x_train and y_train, and how do they differ from their x_test and y_test counterparts?
x_train: the training part of the first sequence (x)
x_test: the test part of the first sequence (x)
y_train: the training part of the second sequence (y)
y_test: the test part of the second sequence (y)
After executing these Python instructions, we can verify that x_train.shape is (60000, 28, 28) and x_test.shape is (10000, 28, 28): the first dimension indexes the image and the remaining two index the pixels in each 28x28 image, with intensities stored as integers from 0 to 255. Tutorials commonly flatten each image into a 784-element vector, giving shapes (60000, 784) and (10000, 784), and scale the intensities to values between 0 and 1.
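A minimal sketch of that flatten-and-scale step. To keep it self-contained, the raw arrays are simulated here with random uint8 data of the same shape that mnist.load_data() returns; in practice you would use the real x_train from the tuple above.

```python
import numpy as np

# Simulated stand-in for the raw MNIST training images:
# mnist.load_data() returns x_train with shape (60000, 28, 28), dtype uint8.
x_train = np.random.randint(0, 256, size=(60000, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-long vector and scale intensities to [0, 1].
x_train = x_train.reshape(60000, 784).astype("float32") / 255.0

print(x_train.shape)  # (60000, 784)
```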
x_test is the test data set, and y_test is the set of labels for all the data in x_test.
y_test holds the actual values; y_pred holds the values your model predicted. This means you can evaluate the performance of your model by comparing y_test against y_pred.
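The comparison above can be sketched with a toy example. The label values here are made up for illustration; accuracy is just the fraction of predictions that match the true labels.

```python
import numpy as np

# Hypothetical labels: y_test holds the true values, y_pred what the model predicted.
y_test = np.array([7, 2, 1, 0, 4])
y_pred = np.array([7, 2, 1, 0, 9])  # the model got the last one wrong

# Element-wise comparison gives booleans; their mean is the accuracy.
accuracy = np.mean(y_test == y_pred)
print(accuracy)  # 0.8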
The training set is the subset of the data set used to train a model. x_train is the training data set, and y_train is the set of labels for all the data in x_train.

The test set is the subset of the data set that you use to test your model after the model has gone through initial vetting by the validation set. x_test is the test data set, and y_test is the set of labels for all the data in x_test.

The validation set is a subset of the data set (separate from the training set) that you use to adjust hyperparameters.
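One way to carve such a validation set out of the training data is a simple slice; the sizes below are assumptions for illustration (Keras can also do this for you via the validation_split argument of model.fit). Zero arrays stand in for the real data to keep the sketch self-contained.

```python
import numpy as np

# Stand-ins for the (flattened) MNIST training arrays.
x_train = np.zeros((60000, 784), dtype="float32")
y_train = np.zeros(60000, dtype="int64")

# Hold out 10% of the training data for hyperparameter tuning.
val_size = 6000
x_val, y_val = x_train[:val_size], y_train[:val_size]
x_train, y_train = x_train[val_size:], y_train[val_size:]

print(x_train.shape, x_val.shape)  # (54000, 784) (6000, 784)
```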
I've made a Deep Learning with Keras playlist on YouTube. It contains the basics for getting started with Keras, and a couple of the videos demo how to organize images into train/valid/test sets, as well as how to get Keras to create a validation set for you. Seeing this implementation may help you get a firmer grasp on how these different data sets are used in practice.