
How to split data into train and test sets using torchvision.datasets.ImageFolder?

In my custom dataset, each kind of image is in its own folder, which torchvision.datasets.ImageFolder can handle. But how do I split the dataset into train and test sets?

toy_programmer asked Jul 29 '19 02:07



1 Answer

You can use torch.utils.data.Subset to split your ImageFolder dataset into train and test sets based on the indices of the examples.
For example:

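import torch
import torchvision

orig_set = torchvision.datasets.ImageFolder(...)  # your dataset
n = len(orig_set)  # total number of examples
n_test = int(0.1 * n)  # take ~10% for test
# Note: ImageFolder lists samples one class folder at a time, so these
# contiguous ranges are not class-balanced unless the indices are shuffled first.
test_set = torch.utils.data.Subset(orig_set, range(n_test))  # take first 10%
train_set = torch.utils.data.Subset(orig_set, range(n_test, n))  # take the rest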
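
Since ImageFolder orders its samples class by class, a contiguous slice tends to contain only the first few classes. A minimal sketch of one way around that is a shuffled, per-class (stratified) split of the indices, which is a different technique from the contiguous split above. The names stratified_split and toy_targets are hypothetical, introduced here for illustration; the commented usage at the end assumes the labels are available as orig_set.targets, which ImageFolder provides.

```python
import random
from collections import defaultdict


def stratified_split(targets, test_fraction=0.1, seed=0):
    """Return (train_indices, test_indices), taking ~test_fraction of each class for test.

    `targets` is a sequence of class labels, one per example.
    This is a hypothetical helper, not part of torch or torchvision.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(targets):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)  # randomize order within the class
        n_test = int(test_fraction * len(idxs))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx


# Toy labels standing in for orig_set.targets: 20 examples of class 0, 10 of class 1.
toy_targets = [0] * 20 + [1] * 10
train_idx, test_idx = stratified_split(toy_targets, test_fraction=0.1)
print(len(train_idx), len(test_idx))

# With a real ImageFolder (assumes torch and torchvision are installed):
# train_idx, test_idx = stratified_split(orig_set.targets, test_fraction=0.1)
# train_set = torch.utils.data.Subset(orig_set, train_idx)
# test_set = torch.utils.data.Subset(orig_set, test_idx)
```

If class balance does not matter, shuffling all indices once before slicing them would be a simpler alternative to the per-class grouping.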
Shai answered Oct 07 '22 19:10