 

PyTorch ImageNet dataset

I am unable to download the original ImageNet dataset from the official website. However, I found out that PyTorch has ImageNet as one of its torchvision datasets.

Q1. Is that the original ImageNet dataset?

Q2. How do I get the classes for the dataset, the way it is done for CIFAR-10?

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
reginald asked Mar 09 '20 20:03



People also ask

How do I get dataset from ImageNet?

ImageNet Download: Go to https://www.kaggle.com/c/imagenet-object-localization-challenge and click on the data tab. You can use the Kaggle API to download on a remote computer, or that page to download all the files you want directly. There, they provide both the labels and the image data.

What is the size of ImageNet dataset?

The ImageNet dataset consists of three parts: training data, validation data, and image labels. The training data contains 1000 categories and 1.2 million images, packaged for easy downloading. The validation and test data are not contained in the ImageNet training data (duplicates have been removed).




1 Answer

torchvision.datasets.ImageNet is just a class that lets you work with the ImageNet dataset; it does not include the data itself. You have to download the dataset yourself (e.g. from http://image-net.org/download-images) and pass the path to it as the root argument when constructing the ImageNet object.
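As a rough sketch of what passing the root argument looks like in practice, the helper below first checks that the manually downloaded archives are in place and only then builds the dataset. The function names `missing_archives` and `load_imagenet` are hypothetical, and the archive file names are the ILSVRC2012 names that torchvision's source looks for, so treat them as an assumption for other versions.

```python
import os

# Archive names torchvision's ImageNet class expects to find in `root`
# (ILSVRC2012; assumed from torchvision's source, may differ across versions).
REQUIRED_ARCHIVES = {
    "train": ["ILSVRC2012_devkit_t12.tar.gz", "ILSVRC2012_img_train.tar"],
    "val": ["ILSVRC2012_devkit_t12.tar.gz", "ILSVRC2012_img_val.tar"],
}


def missing_archives(root, split="val"):
    """Return the archive names for `split` that are not present in `root`."""
    return [
        name
        for name in REQUIRED_ARCHIVES[split]
        if not os.path.isfile(os.path.join(root, name))
    ]


def load_imagenet(root, split="val"):
    """Build the dataset from manually downloaded archives in `root`."""
    missing = missing_archives(root, split)
    if missing:
        raise FileNotFoundError(f"Place these files in {root!r}: {missing}")
    # Imported lazily so the check above works without torchvision installed.
    from torchvision import datasets
    return datasets.ImageNet(root=root, split=split)
```

With the archives in place, `dataset = load_imagenet("/path/to/imagenet")` should return the dataset object, and its `classes` and `class_to_idx` attributes hold the class names, which covers Q2. The archive check is only meant for first-time use: if the archives were already extracted and then deleted, call `datasets.ImageNet` directly instead.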

Note that downloading it directly by passing the flag download=True is no longer possible:

if download is True:
    msg = ("The dataset is no longer publicly accessible. You need to "
           "download the archives externally and place them in the root "
           "directory.")
    raise RuntimeError(msg)
elif download is False:
    msg = ("The use of the download flag is deprecated, since the dataset "
           "is no longer publicly accessible.")
    warnings.warn(msg, RuntimeWarning)

(source)

If you only need the class names and their corresponding indices without downloading the whole dataset (e.g. if you are using a pretrained model and want to map its predictions to labels), you can download them separately, e.g. from here or from this GitHub gist.
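A minimal sketch of the prediction-to-label mapping, using only the standard library. It assumes the downloaded file is JSON shaped like {"0": ["n01440764", "tench"], ...}, which is one common layout for such class-index files; the file you download may be structured differently. The function names `load_class_index` and `top_k_labels` are hypothetical.

```python
import json


def load_class_index(path):
    """Read a class-index JSON file and return labels ordered by class index.

    Assumes the layout {"0": ["n01440764", "tench"], ...} with contiguous
    indices starting at 0, so that position i holds the label for class i.
    """
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    # JSON keys are strings, so sort them numerically rather than lexically.
    return [raw[key][1] for key in sorted(raw, key=int)]


def top_k_labels(scores, classes, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(classes[i], scores[i]) for i in ranked[:k]]
```

Here `scores` would be the pretrained model's output for one image converted to a plain list of floats, for example with `output[0].tolist()` on a PyTorch tensor.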

Andreas K. answered Oct 29 '22 09:10
