
Deep learning for image classification [closed]

After reading a few papers on deep learning and deep belief networks, I got a basic idea of how it works. But I'm still stuck on the last step, i.e., the classification step. Most of the implementations I found on the Internet deal with generation (MNIST digits).

Is there some explanation (or code) available somewhere that talks about classifying images (preferably natural images or objects) using DBNs?

Also, some pointers in the right direction would be really helpful.

asked Feb 17 '13 by Nihar Sarangi

2 Answers

The basic idea

These days, the state-of-the-art deep learning models for image classification problems (e.g. ImageNet) are usually "deep convolutional neural networks" (Deep ConvNets). They look roughly like this ConvNet configuration by Krizhevsky et al: [figure: the ConvNet architecture from Krizhevsky et al.]

For inference (classification), you feed an image into the left side (notice that the depth on the left side is 3, for RGB), crunch through a series of convolution filters, and it spits out a 1000-dimensional vector on the right-hand side. This particular architecture is for ImageNet, which focuses on classifying 1000 categories of images, so each entry of the 1000-dimensional vector is a score for how likely it is that the image belongs to that category.
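The forward pass above can be sketched in a few lines of NumPy. This is a toy illustration, not any real ConvNet: the image size, filter count, and weight values are all made up, and a real network stacks many such layers. It just shows the shape of the computation: convolve, apply a nonlinearity, pool, and map to class scores that a softmax turns into probabilities.

```python
import numpy as np

def softmax(z):
    """Convert raw scores to probabilities that sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 8, 8))          # tiny RGB image (channels, H, W)
filters = rng.standard_normal((4, 3, 3, 3)) * 0.1  # 4 conv filters, each 3x3 over 3 channels

# "valid" convolution: slide each filter over the image
H, W = 6, 6
feat = np.zeros((4, H, W))
for f in range(4):
    for i in range(H):
        for j in range(W):
            feat[f, i, j] = np.sum(image[:, i:i+3, j:j+3] * filters[f])

feat = np.maximum(feat, 0)                      # ReLU nonlinearity
pooled = feat.reshape(4, -1).mean(axis=1)       # global average pooling -> 4 features

Wfc = rng.standard_normal((1000, 4)) * 0.1      # final fully connected layer
scores = Wfc @ pooled                           # 1000-dimensional score vector
probs = softmax(scores)                         # per-category probabilities
```

A real ImageNet model has millions of learned parameters and dozens of layers, but every layer is some variation on this convolve/nonlinearity/pool pattern.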

Training the neural net is only slightly more complex. For training, you basically run classification repeatedly, and every so often you do backpropagation (see Andrew Ng's lectures) to improve the convolution filters in the network. Basically, backpropagation asks "what did the network classify correctly/incorrectly? For misclassified stuff, let's fix the network a little bit."
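That "fix the network a little bit" loop can be made concrete with a toy softmax classifier trained by gradient descent. This is a minimal sketch with random made-up data, not real backpropagation through conv layers; the key line is the cross-entropy gradient `p - onehot(label)`, which is exactly the "what did we get wrong?" signal that backpropagation propagates back through a deep network.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 5))        # 30 toy "images" with 5 features each
y = rng.integers(0, 3, size=30)         # 3 made-up classes
W = np.zeros((3, 5))                    # classifier weights, initialized to zero

def avg_loss(W):
    """Average cross-entropy: how badly we misclassify the training set."""
    return -np.mean([np.log(softmax(W @ x)[t]) for x, t in zip(X, y)])

before = avg_loss(W)
lr = 0.1
for _ in range(100):
    grad = np.zeros_like(W)
    for x, t in zip(X, y):
        p = softmax(W @ x)
        p[t] -= 1.0                     # gradient of cross-entropy: p - onehot(t)
        grad += np.outer(p, x)
    W -= lr * grad / len(X)             # nudge the weights to fix mistakes
after = avg_loss(W)
```

In a deep ConvNet the same gradient is chained backwards through every convolution and pooling layer, but the principle is identical: each step moves the weights slightly in the direction that reduces the classification error.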


Implementation

Caffe is a very fast open-source implementation (faster than cuda-convnet from Krizhevsky et al) of deep convolutional neural networks. The Caffe code is pretty easy to read; there's basically one C++ file per type of network layer (e.g. convolutional layers, max-pooling layers, etc).

answered Sep 19 '22 by solvingPuzzles


You should add a softmax layer (http://en.wikipedia.org/wiki/Softmax_activation_function) on top of the network you have used for generation, and use backpropagation to fine-tune the final network.
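A minimal sketch of that idea in NumPy, with a random feature matrix standing in for the output of the pretrained generative network (e.g. a DBN's top hidden layer); the shapes and data are made up. A new softmax layer is stacked on top, and one backpropagation step through it already reduces the cross-entropy loss:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
# stand-in for hidden activations from the pretrained generative network
X = rng.standard_normal((20, 8))        # 20 examples, 8 hidden features
y = rng.integers(0, 4, size=20)         # 4 target classes
W = np.zeros((4, 8))                    # the new softmax layer's weights

def avg_loss(W):
    return -np.mean([np.log(softmax(W @ x)[t]) for x, t in zip(X, y)])

before = avg_loss(W)
# one backpropagation step through the softmax layer
grad = np.zeros_like(W)
for x, t in zip(X, y):
    p = softmax(W @ x)
    p[t] -= 1.0                         # cross-entropy gradient: p - onehot(t)
    grad += np.outer(p, x)
W -= 0.05 * grad / len(X)
after = avg_loss(W)
```

In full fine-tuning, the same gradient would also flow back into the generative layers below, adjusting the pretrained weights for the classification task rather than only training the new top layer.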

answered Sep 23 '22 by elaRosca