How can I find Imagenet data labels?

I have two questions about loading ImageNet data. I downloaded the ILSVRC2012 validation set (because the training set is too large), but I have two problems.

  1. I can't figure out how to find the labels. There are only JPEG files with names like "ILSVRC2012_val_00000001.JPEG", but no labels. Where can I find them?

  2. As far as I know, ImageNet models use 224 x 224 pixel images, and the task is just "classification", not "detection", but the ILSVRC2012 images come in many different sizes, most of them larger. So how can I get proper 224 x 224 pixel boxes?

Curious_man asked Nov 22 '16 14:11


People also ask

Where can I find ImageNet?

ImageNet Download: Go to https://www.kaggle.com/c/imagenet-object-localization-challenge and click on the data tab. You can use the Kaggle API to download on a remote computer, or that page to download all the files you want directly. There, they provide both the labels and the image data.

How do I get images from ImageNet?

You can interactively explore available synsets (categories) at http://www.image-net.org/explore. Each synset page has a "Downloads" tab where you can download category image URLs. Alternatively, you can use the ImageNet API to download image URLs for a particular synset using its synset id (wnid).


2 Answers

  1. You will download three tar archives: one for training data, one for validation data, and one for test data.

    Training data is contained in 1000 folders, one folder per class (most folders contain 1,300 JPEG images). Validation data is a single folder with 50k JPEG images; their labels are in the corresponding ILSVRC2012_validation_ground_truth.txt file (as darren1231 mentioned, it needs to be downloaded separately as part of the DevKit).

    Test data is similar to validation data, but it does not have labels (they are not provided because you need to submit your predicted labels as part of the competition).

  2. ImageNet images have variable resolution, 482x415 on average, and it's up to you how to process them to train your model. Most people process them as follows: first, downsize each image so that its shorter side is 256 pixels. Then crop a random 224x224 patch and use those patches for training (you will get different crops each epoch). At test time, do the same, but extract the center 224x224 patch and use that to evaluate classification accuracy. Some people also use multiple patches for testing. Again, it's up to you, and you can use a higher resolution if you like.
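
The resize-then-crop geometry described above can be sketched in plain Python. This is only an illustration, not a reference implementation: the function names are made up for this example, the 256 and 224 values are the ones mentioned in the answer, and boxes use the (left, upper, right, lower) convention so they could be handed to an image library afterwards.

```python
import random


def resized_size(width, height, shorter_side=256):
    """Return (new_width, new_height) so the shorter side equals shorter_side."""
    scale = shorter_side / min(width, height)
    # max() guards against rounding pushing a side just below the target.
    new_width = max(shorter_side, round(width * scale))
    new_height = max(shorter_side, round(height * scale))
    return new_width, new_height


def center_crop_box(width, height, size=224):
    """Box (left, upper, right, lower) of the central size x size patch."""
    left = (width - size) // 2
    upper = (height - size) // 2
    return left, upper, left + size, upper + size


def random_crop_box(width, height, size=224, rng=random):
    """Box of a randomly placed size x size patch (used during training)."""
    left = rng.randint(0, width - size)    # randint is inclusive on both ends
    upper = rng.randint(0, height - size)
    return left, upper, left + size, upper + size


if __name__ == "__main__":
    w, h = resized_size(482, 415)          # the average size quoted above
    print((w, h), center_crop_box(w, h), random_crop_box(w, h))
```

For the 482x415 average size, this gives 297x256 after resizing and a center box of (36, 16, 260, 240). With Pillow, for example, the resized size can be passed to `Image.resize` and the box to `Image.crop`; deep learning frameworks usually ship equivalent transforms.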
MichaelSB answered Sep 30 '22 16:09


It's in the Development kit (Task 1 & 2). The file is called "ILSVRC2012_validation_ground_truth.txt".
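
As a rough sketch of how that file can be used: the code below assumes the ground-truth file holds one integer class ID per line, and that line N belongs to the image whose file name ends in the number N (so line 1 is ILSVRC2012_val_00000001.JPEG). Check the devkit readme to confirm this, and to see how the integer IDs map to WordNet IDs and class names. The function name and the paths are hypothetical.

```python
from pathlib import Path


def load_val_labels(ground_truth_path, image_dir):
    """Map each validation image file name to its integer class ID."""
    # Assumed format: one integer class ID per line, in image-number order.
    with open(ground_truth_path) as f:
        class_ids = [int(line) for line in f if line.strip()]

    labels = {}
    for path in sorted(Path(image_dir).glob("ILSVRC2012_val_*.JPEG")):
        # "ILSVRC2012_val_00000001" -> 1 (image numbers start at 1)
        index = int(path.stem.rsplit("_", 1)[1])
        labels[path.name] = class_ids[index - 1]
    return labels


if __name__ == "__main__":
    # Hypothetical locations; adjust to wherever the archives were extracted.
    labels = load_val_labels(
        "ILSVRC2012_validation_ground_truth.txt", "ILSVRC2012_img_val"
    )
    print(len(labels), "labelled images")
```

Looking the label up by the number in the file name, rather than by position in a directory listing, keeps the pairing correct even if only part of the validation set is on disk.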

darren1231 answered Sep 30 '22 16:09