
filter class/subfolder with pytorch ImageFolder

Tags:

python

pytorch

Here's my folder structure

image-folders/
   ├── class_0/
   |   ├── 001.jpg
   |   └── 002.jpg
   ├── class_1/
   |   ├── 001.jpg
   |   └── 002.jpg
   └── class_2/
       ├── 001.jpg
       └── 002.jpg

Using ImageFolder from torchvision, I can create a dataset like this:
dataset = ImageFolder("image-folders",...)

But this reads every subfolder and creates 3 target classes. I don't want to include the class_2 folder; I want my dataset to contain only class_0 and class_1. Is there any way to achieve this other than deleting or moving the class_2 folder?

Vinson Ciawandy asked Apr 07 '21 04:04




1 Answer

You can do this by taking a torch.utils.data.Subset of the full ImageFolder dataset:

from torchvision.datasets import ImageFolder
from torch.utils.data import Subset

# construct the full dataset
dataset = ImageFolder("image-folders",...)
# select the indices of every sample that is not in class_2
idx = [i for i, (_, label) in enumerate(dataset.imgs) if label != dataset.class_to_idx['class_2']]
# build the appropriate subset
subset = Subset(dataset, idx)
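
One thing to keep in mind with the Subset approach: the labels returned by the subset are still the original class indices. That is fine when class_0 and class_1 are kept (labels stay 0 and 1), but if the excluded class were in the middle, the remaining labels would have a gap. Below is a minimal sketch of the index selection plus a label remap, written in plain Python so it runs without torchvision. The helper name `select_classes` and the hard-coded `samples` list are hypothetical stand-ins for `dataset.imgs` and `dataset.class_to_idx`, not part of torchvision.

```python
from typing import Dict, List, Sequence, Tuple


def select_classes(
    samples: Sequence[Tuple[str, int]],
    class_to_idx: Dict[str, int],
    keep: Sequence[str],
) -> Tuple[List[int], Dict[int, int]]:
    """Return the sample indices belonging to `keep` and an old->new label map."""
    # Original label indices of the classes we keep, in ascending order.
    kept_labels = sorted(class_to_idx[name] for name in keep)
    # Map each kept original label to a contiguous new label 0..K-1.
    remap = {old: new for new, old in enumerate(kept_labels)}
    # Positions of the samples whose label is one of the kept labels.
    indices = [i for i, (_, label) in enumerate(samples) if label in remap]
    return indices, remap


# Hypothetical stand-in for dataset.imgs / dataset.class_to_idx,
# mirroring the folder layout from the question.
samples = [
    ("class_0/001.jpg", 0), ("class_0/002.jpg", 0),
    ("class_1/001.jpg", 1), ("class_1/002.jpg", 1),
    ("class_2/001.jpg", 2), ("class_2/002.jpg", 2),
]
class_to_idx = {"class_0": 0, "class_1": 1, "class_2": 2}

# Keep the first and last class to show the label gap being closed.
indices, remap = select_classes(samples, class_to_idx, keep=["class_0", "class_2"])
print(indices)  # [0, 1, 4, 5]
print(remap)    # {0: 0, 2: 1}
```

With a real ImageFolder, the same `indices` list could be passed to `Subset(dataset, indices)`, and the remap applied to the targets (for example through a `target_transform`) only when the kept labels are not already contiguous.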
Shai answered Sep 20 '22 05:09
