LMDB files and how they are used for caffe deep learning network

I am quite new to deep learning and I am having some problems using the Caffe deep learning framework. Basically, I couldn't find any documentation that answers the questions and problems I am dealing with right now.

Please, let me explain my situation first.

I have thousands of images and I must perform a series of pre-processing operations on them. For each pre-processing operation, I have to save the pre-processed images as 4-D matrices and also store a vector with the image labels. I will store this information as LMDB files that will be used as input for the Caffe GoogLeNet network.

I tried to save my images as HDF5 files, but the final file size is 80 GB, which is impossible to process with the memory I have.

So, the other option is using LMDB files, right? I am new to this file format and would appreciate your help in understanding how to create them in MATLAB. Basically, my beginner questions are:

1- These LMDB files have the extension .mdb, right? Is this the same extension used by Microsoft Access, or is the right format .lmdb, and are the two different?

2- I found this solution for creating .mdb files (https://github.com/kyamagu/matlab-leveldb). Does it create the file format needed by Caffe?

3- For Caffe, do I have to create one .mdb file for labels and another for images, or can both be fields of the same .mdb file?

4- When I create an .mdb file I have to label the database fields. Can I label one field as image and another as label? Does Caffe understand what each field means?

5- What do the calls database.put('key1', 'value1') and database.put('key2', 'value2') (from https://github.com/kyamagu/matlab-leveldb) do? Do I have to save my 4-D matrices in one field and the label vector in another?

asked Jun 22 '15 12:06

mad

1 Answer

There is no connection between LMDB files and MS Access files.

As I see it, you have two options:

1- Use the "convert_imageset" tool, located in the tools folder of Caffe, to convert a list of image files and labels to LMDB.
2- Instead of a "data layer", use an "image data layer" as the input to the network. This type of layer takes as its source a file listing image file names and labels, so you don't have to build a database. Another benefit for training is that you can use the shuffle option and get slightly better training results.
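For the first option, the call to convert_imageset can be assembled from a script. The following is only a sketch: it assumes the tool takes its usual arguments (an image root folder, a list file, and the output database path) and that Caffe was built under a hypothetical /opt/caffe. All paths and the resize values are placeholders, not taken from the question.

```python
def build_convert_imageset_cmd(tool, image_root, list_file, db_path,
                               resize=None, shuffle=True):
    """Build the argument list for Caffe's convert_imageset tool.

    Tool usage: convert_imageset [FLAGS] ROOTFOLDER/ LISTFILE DB_NAME
    """
    cmd = [tool, "--backend=lmdb"]
    if shuffle:
        cmd.append("--shuffle")
    if resize is not None:
        height, width = resize
        cmd += ["--resize_height=%d" % height, "--resize_width=%d" % width]
    # ROOTFOLDER is prepended to every path in the list file, so keep the slash.
    root = image_root if image_root.endswith("/") else image_root + "/"
    cmd += [root, list_file, db_path]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_convert_imageset_cmd(
    "/opt/caffe/build/tools/convert_imageset",
    "/data/images", "/data/train_list.txt", "/data/train_lmdb",
    resize=(256, 256))
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The resulting LMDB is a directory, and its path is what the source of a Data layer would point to.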

To use an image data layer, just change the layer type from Data to ImageData. The source is the path to a text file in which each line contains the path of an image file and its label, separated by a space. For example:

/path/to/filename.png 23
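A list file in this format can be generated with a few lines of Python. This is a minimal sketch, assuming (hypothetically) that the images are organised as one subfolder per class under a common root; the function name, the folder layout, and the class-to-label mapping are illustrative and not part of Caffe.

```python
import os


def write_image_list(image_root, label_of_class, list_path,
                     extensions=(".png", ".jpg", ".jpeg")):
    """Write one "<image path> <integer label>" line per image.

    Assumed layout (hypothetical): image_root/<class_name>/<image file>.
    Returns the number of lines written.
    """
    lines = []
    for class_name in sorted(label_of_class):
        class_dir = os.path.join(image_root, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            # Skip anything that is not an image file.
            if file_name.lower().endswith(extensions):
                path = os.path.join(class_dir, file_name)
                lines.append("%s %d" % (path, label_of_class[class_name]))
    with open(list_path, "w") as handle:
        handle.writelines(line + "\n" for line in lines)
    return len(lines)


# Example call (hypothetical paths):
# write_image_list("/data/images", {"cat": 0, "dog": 1}, "/data/train_list.txt")
```

The same file can serve as the list file for convert_imageset. Paths containing spaces may not parse correctly in every Caffe version, so it is safer to avoid them.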

If you want to pre-process the data without saving the pre-processed files to disk, you can use the transformations provided by Caffe (mirroring and cropping; see http://caffe.berkeleyvision.org/tutorial/data.html for details) or implement your own DataTransformer.
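To make the two built-in transformations concrete, here is a plain-Python illustration of what mirroring and cropping do to a single image channel. It is not Caffe's implementation, only a sketch of the idea: the function name and parameters are made up for this example, and the image is a simple list of rows.

```python
import random


def mirror_and_crop(image, crop_size, mirror=False, random_crop=False,
                    rng=random):
    """Crop a square patch from one channel and optionally flip it.

    random_crop=True picks a random patch position (typical for training);
    otherwise the patch is taken from the center.
    mirror=True flips the patch horizontally with probability 0.5.
    """
    height, width = len(image), len(image[0])
    if crop_size > height or crop_size > width:
        raise ValueError("crop_size must fit inside the image")
    if random_crop:
        top = rng.randint(0, height - crop_size)
        left = rng.randint(0, width - crop_size)
    else:
        top = (height - crop_size) // 2
        left = (width - crop_size) // 2
    patch = [row[left:left + crop_size] for row in image[top:top + crop_size]]
    if mirror and rng.random() < 0.5:
        patch = [row[::-1] for row in patch]
    return patch
```

Because these operations are applied on the fly as each batch is loaded, only the original images need to be stored on disk.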

answered Sep 26 '22 04:09

Tal Darom

Tags: image-processing, matlab, deep-learning, computer-vision, caffe
