
Caffe+Opencv without lmdb

When using Caffe, creating a training dataset of images requires building a database in a special format such as LMDB. Is there any option to pass Caffe a batch of images directly, for example as a vector<cv::Mat>?

To clarify, I'm looking for a solution that can handle a large number of images that can't all fit into memory (but assume that one training batch, containing for example 50 images, can be stored in memory).

mrgloom asked Oct 16 '15 15:10



1 Answer

Caffe can take many types of input, depending on the input layer used. Some of the available input layers are:

  1. Data
  2. MemoryData
  3. HDF5Data
  4. ImageData etc.

In most model files, the very first layer you will find is of type Data, which uses LMDB or LevelDB as its input source. Converting a set of images to these databases is fairly easy, because Caffe already provides tools for the conversion (such as convert_imageset).

The MemoryData layer reads data directly from memory, which is extremely helpful when camera frames need to be passed to Caffe as input during the test phase. Using this layer for training is not recommended.

The ImageData layer takes a text file as input. The text file lists every image by its complete path, followed by its class number. Caffe uses OpenCV to read the images in this layer, and it also takes care of all the transformations applied to them. So instead of reading the images with OpenCV yourself and passing them to a MemoryData layer, using ImageData is recommended.
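As a rough illustration of what an ImageData input layer looks like in the model file, here is a small Python sketch that only formats the prototxt text. The field names (source, batch_size, shuffle, new_height, new_width) come from Caffe's image_data_param; the helper name image_data_layer, the layer name "data", and the path /path/to/train.txt are placeholders I made up for the example.

```python
def image_data_layer(source, batch_size, shuffle=True, new_height=0, new_width=0):
    """Return prototxt text for an ImageData input layer (illustrative helper)."""
    lines = [
        "layer {",
        '  name: "data"',          # layer name is an arbitrary choice
        '  type: "ImageData"',
        '  top: "data"',           # image blob
        '  top: "label"',          # class number blob
        "  image_data_param {",
        '    source: "{}"'.format(source),
        "    batch_size: {}".format(batch_size),
        "    shuffle: {}".format("true" if shuffle else "false"),
    ]
    # Resizing is optional; only emit it when both sizes are given.
    if new_height > 0 and new_width > 0:
        lines.append("    new_height: {}".format(new_height))
        lines.append("    new_width: {}".format(new_width))
    lines += ["  }", "}"]
    return "\n".join(lines) + "\n"


# Hypothetical listing path; batch size of 50 as in the question.
print(image_data_layer("/path/to/train.txt", batch_size=50))
```

The printed block can be pasted in place of the Data layer of an existing model file; mean subtraction, cropping and similar settings would still need a transform_param section that this sketch leaves out.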

The .txt file that the ImageData layer reads must contain one line per image, in this format:

/path/to/the/image/imageName.jpg classNumber
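A listing in this format can be generated with a few lines of Python. The sketch below assumes a layout that is not stated in the answer: one sub-directory per class under a root folder, with class numbers assigned in alphabetical order of the folder names. The function names and the extension list are my own choices.

```python
import os

# Assumed set of image extensions; adjust to the dataset at hand.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def build_listing(root):
    """Return '<absolute path> <class number>' lines for root/<class_name>/*."""
    # Sorted class folders give stable labels: first folder -> 0, and so on.
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    lines = []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(root, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(IMAGE_EXTENSIONS):
                path = os.path.abspath(os.path.join(class_dir, file_name))
                lines.append("{} {}".format(path, label))
    return lines


def write_listing(root, out_path):
    """Write the listing to out_path and return the number of images listed."""
    lines = build_listing(root)
    with open(out_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(lines)
```

Calling write_listing with the image root and an output path such as train.txt produces the file that the ImageData layer's source field points to. Only the listing is held in memory, not the images, so this scales to datasets much larger than RAM.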

Using LMDB or LevelDB is still highly recommended, because ImageData may not work well if an image path or name contains spaces, or if any of the images is corrupt.
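If you go with ImageData anyway, it may help to screen the listing before training. The following is a minimal sketch, with a made-up function name, that flags the problems that are cheap to detect: malformed lines, whitespace in the path, missing files and empty files. It does not decode the images, so a truncated or otherwise damaged file with non-zero size will still pass.

```python
import os


def check_listing(listing_path):
    """Split listing entries into usable lines and (line, reason) problems."""
    good, problems = [], []
    with open(listing_path) as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            # The label is the last space-separated token; the rest is the path.
            path, _, label = line.rpartition(" ")
            if not path or not label.isdigit():
                problems.append((line, "expected '<path> <integer label>'"))
            elif any(ch.isspace() for ch in path):
                problems.append((line, "path contains whitespace"))
            elif not os.path.isfile(path):
                problems.append((line, "file not found"))
            elif os.path.getsize(path) == 0:
                problems.append((line, "file is empty"))
            else:
                good.append(line)
    return good, problems
```

The good lines can be written back out as a cleaned listing, while the problems list shows which files need to be renamed or replaced.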

Details of the various layers can be found here.

GPU memory is allocated depending on the model and the batch size. If you run out of memory, try reducing the batch size. Caffe has easily handled training on the ImageNet dataset of 1.2 million images, so with a suitable batch size this approach should work without any issues.
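To get a feel for how the batch size enters the memory budget, here is a back-of-the-envelope sketch for the input blob alone. It assumes 4-byte float values and, purely as an example, 3-channel 227x227 inputs; the function name is hypothetical. Activations, gradients and weights of the network come on top of this and usually dominate, so treat the number as a lower bound.

```python
def input_blob_bytes(batch_size, channels, height, width, bytes_per_value=4):
    """Bytes needed to hold one batch of input images as a dense float blob."""
    return batch_size * channels * height * width * bytes_per_value


# Batch of 50 images as in the question; 3x227x227 is an assumed input size.
size = input_blob_bytes(50, 3, 227, 227)
print("{} bytes ({:.1f} MiB)".format(size, size / (1024.0 * 1024.0)))
```

Because the size grows linearly with batch_size, halving the batch halves this part of the footprint, which is why reducing the batch size is the usual first step when memory runs out.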

Anoop K. Prabhu answered Nov 13 '22 00:11
