PyTorch: Testing with torchvision.datasets.ImageFolder and DataLoader

Tags: python, pytorch

I'm a newbie trying to make this PyTorch CNN work with the Cats & Dogs dataset from Kaggle. As there are no targets for the test images, I manually classified some of the test images and put the class in the filename so that I can test (maybe I should have just used some of the train images).

I used the torchvision.datasets.ImageFolder class to load the train and test images. The training seems to work.

But what do I need to do to make the test routine work? I don't know how to connect my test_data_loader with the test loop at the bottom via test_x and test_y.

The code is based on this MNIST example CNN. There, something like the following is used right after the loaders are created, but I failed to rewrite it for my dataset:

test_x = Variable(torch.unsqueeze(test_data.test_data, dim=1), volatile=True).type(torch.FloatTensor)[:2000]/255.   # shape from (2000, 28, 28) to (2000, 1, 28, 28), value in range(0,1)
test_y = test_data.test_labels[:2000]

The code:

import os
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.utils.data as data
import torchvision
from torchvision import transforms

EPOCHS = 2
BATCH_SIZE = 10
LEARNING_RATE = 0.003
TRAIN_DATA_PATH = "./train_cl/"
TEST_DATA_PATH = "./test_named_cl/"
TRANSFORM_IMG = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225] )
    ])

train_data = torchvision.datasets.ImageFolder(root=TRAIN_DATA_PATH, transform=TRANSFORM_IMG)
train_data_loader = data.DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True,  num_workers=4)
test_data = torchvision.datasets.ImageFolder(root=TEST_DATA_PATH, transform=TRANSFORM_IMG)
test_data_loader  = data.DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=True, num_workers=4) 

class CNN(nn.Module):
    # omitted...

if __name__ == '__main__':

    print("Number of train samples: ", len(train_data))
    print("Number of test samples: ", len(test_data))
    print("Detected Classes are: ", train_data.class_to_idx) # classes are detected by folder structure

    model = CNN()    
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    loss_func = nn.CrossEntropyLoss()    

    # Training and Testing
    for epoch in range(EPOCHS):        
        for step, (x, y) in enumerate(train_data_loader):
            b_x = Variable(x)   # batch x (image)
            b_y = Variable(y)   # batch y (target)
            output = model(b_x)[0]          
            loss = loss_func(output, b_y)   
            optimizer.zero_grad()           
            loss.backward()                 
            optimizer.step()

            # Test -> this is where I have no clue
            if step % 50 == 0:
                test_x = Variable(test_data_loader)
                test_output, last_layer = model(test_x)
                pred_y = torch.max(test_output, 1)[1].data.squeeze()
                accuracy = sum(pred_y == test_y) / float(test_y.size(0))
                print('Epoch: ', epoch, '| train loss: %.4f' % loss.data[0], '| test accuracy: %.2f' % accuracy)
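One way the test step could be wired up is to iterate over test_data_loader in the same way as train_data_loader, instead of wrapping the loader itself in Variable. The following is a minimal sketch, not a definitive implementation. It assumes PyTorch 0.4 or newer (where torch.no_grad() replaces volatile=True) and that the model returns either logits or a (logits, last_layer) tuple as in the code above. The helper name evaluate and the toy data at the bottom are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader):
    """Return the fraction of correctly classified samples in `loader`.

    `evaluate` is a hypothetical helper name, not part of PyTorch.
    """
    model.eval()                      # inference mode for dropout / batch norm
    correct, total = 0, 0
    with torch.no_grad():             # no gradients needed while testing
        for x, y in loader:           # same (images, labels) batches as in training
            output = model(x)
            # The CNN in the question returns (logits, last_layer);
            # a plain model returns the logits directly.
            logits = output[0] if isinstance(output, tuple) else output
            pred_y = logits.argmax(dim=1)
            correct += (pred_y == y).sum().item()
            total += y.size(0)
    model.train()                     # switch back before training continues
    return correct / max(total, 1)    # guard against an empty loader


if __name__ == "__main__":
    # Stand-in data and model, only to show the call; not the Cats & Dogs setup.
    torch.manual_seed(0)
    fake_images = torch.randn(20, 3, 8, 8)
    fake_labels = torch.randint(0, 2, (20,))
    fake_loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=10)
    toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
    print("test accuracy: %.2f" % evaluate(toy_model, fake_loader))
```

Inside the training loop, the block under `if step % 50 == 0:` would then reduce to `accuracy = evaluate(model, test_data_loader)`. On PyTorch 0.4 and later, `loss.item()` is the replacement for `loss.data[0]` in the print statement.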

504

asked Mar 02 '18 16:03

kett

1 Answer

Looking at the Kaggle data and your code, there seem to be problems in how both the train and the test set are loaded. The default PyTorch ImageFolder expects one subfolder per label and derives the classes from the subfolder names. In your case, since all the training data is in the same folder, PyTorch loads it as a single class. A network can then drive the loss down by always predicting that one class, which is why learning only seems to be working.

You can correct this by using a folder structure like:

- train/dog
- train/cat
- test/dog
- test/cat

Then pass the train folder to the train ImageFolder and the test folder to the test ImageFolder. The training code seems fine, so just change the folder structure and you should be good. Take a look at the official documentation of ImageFolder, which has a similar example.
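The folder layout described above could be produced with a small script like the one below. This is only a sketch under stated assumptions: it assumes the label is the part of the filename before the first dot (as in the Kaggle training files, e.g. cat.0.jpg and dog.0.jpg), and the function name sort_into_class_folders and the example paths are hypothetical. It copies files rather than moving them, so the original folder stays untouched.

```python
import shutil
from pathlib import Path


def sort_into_class_folders(src_dir, dst_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Copy images from a flat folder into dst_dir/<label>/, one folder per label.

    Assumption: the label is the part of the filename before the first dot,
    e.g. "cat.0.jpg" -> "cat". Adjust the parsing if your names differ.
    Returns a dict mapping each label to the number of files copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    counts = {}
    for path in sorted(src.iterdir()):
        # Skip subfolders and anything that is not an image file.
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        label = path.name.split(".")[0].lower()
        target = dst / label
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own flat and target folders.
    flat_dir = Path("./train_flat/")
    if flat_dir.is_dir():
        print(sort_into_class_folders(flat_dir, "./train/"))
```

Pointing ImageFolder at the resulting folder should then report two classes. ImageFolder assigns indices to the sorted subfolder names, so cat would map to 0 and dog to 1, which you can confirm by printing class_to_idx as the question's code already does.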

111

answered Sep 16 '22 17:09

Monster
