In my custom dataset, each kind of image is in its own folder, which torchvision.datasets.ImageFolder can handle, but how do I split the dataset into train and test sets?
Split the dataset
We can use train_test_split to make the first split on the original dataset, which separates out the test set. To get the validation set, we can then apply the same function to the resulting train set. The test set size is the fraction of the original data we want to use as the test set.
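The two-step split described above can be sketched on plain indices. This is a minimal standard-library sketch, not the original author's function: split_indices and train_val_test_split are hypothetical helper names, and split_indices only stands in for the role that train_test_split plays in the text.

```python
import random

def split_indices(n, test_size, seed=0):
    """Shuffle range(n) and hold out a test_size fraction of it (hypothetical helper)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_held_out = int(test_size * n)
    # Return (kept indices, held-out indices).
    return indices[n_held_out:], indices[:n_held_out]

def train_val_test_split(n, test_size=0.1, val_size=0.1, seed=0):
    # First split: hold out the test set from the full dataset.
    train_val, test = split_indices(n, test_size, seed)
    # Second split: hold out the validation set from what remains.
    # Note that val_size is a fraction of the remaining data, not of the full dataset.
    train_pos, val_pos = split_indices(len(train_val), val_size, seed + 1)
    train = [train_val[i] for i in train_pos]
    val = [train_val[i] for i in val_pos]
    return train, val, test

train_idx, val_idx, test_idx = train_val_test_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The three index lists are disjoint and together cover the whole dataset, so each one can be used to select its own subset of the examples.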
Separating data into training and testing sets is an important part of evaluating data mining models. Typically, when you separate a data set into a training set and testing set, most of the data is used for training, and a smaller portion of the data is used for testing.
You can use torch.utils.data.Subset to split your ImageFolder dataset into train and test based on indices of the examples.
For example:
import torch
import torchvision

orig_set = torchvision.datasets.ImageFolder(...)  # your dataset
n = len(orig_set)  # total number of examples
n_test = int(0.1 * n)  # take ~10% for test
indices = torch.randperm(n).tolist()  # shuffle first: ImageFolder orders examples by class
test_set = torch.utils.data.Subset(orig_set, indices[:n_test])  # take a random ~10%
train_set = torch.utils.data.Subset(orig_set, indices[n_test:])  # take the rest
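An index-based split like the one above does not by itself guarantee that every class shows up in both subsets in proportion. If that matters, one option is to pick the indices per class. The following is a minimal standard-library sketch under that assumption: stratified_split is a hypothetical helper, and the toy label list stands in for the targets attribute that ImageFolder exposes (one class index per example).

```python
import random
from collections import defaultdict

def stratified_split(targets, test_ratio=0.1, seed=0):
    """Return (train_indices, test_indices), holding out ~test_ratio of each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(targets):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)  # random choice within the class
        n_test = int(test_ratio * len(idxs))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# Toy labels standing in for orig_set.targets: 50 examples of class 0, then 30 of class 1.
toy_targets = [0] * 50 + [1] * 30
train_idx, test_idx = stratified_split(toy_targets, test_ratio=0.2)
```

The resulting lists can be passed to torch.utils.data.Subset in place of the index ranges, for example torch.utils.data.Subset(orig_set, train_idx).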