Large Scale Image Classifier

I have a large set of plant images labeled with the botanical name. What would be the best algorithm to train on this dataset in order to classify an unlabeled photo? The photos are processed so that 100% of the pixels contain the plant (e.g. close-ups of the leaves or bark), so there are no other objects, empty space, or background that the algorithm would have to filter out.

I've already tried generating SIFT features for all the photos and feeding these (feature,label) pairs to a LibLinear SVM, but the accuracy was a miserable 6%.

I also tried feeding this same data to a few Weka classifiers. The accuracy was a little better (25% with Logistic, 18% with IBk), but Weka isn't designed for scalability (it loads everything into memory). Since the SIFT feature dataset is several million rows, I could only test Weka with a random 3% slice, so the result is probably not representative.

EDIT: Some sample images:

Pachira aquatica, Fagus grandifolia

Cerin asked Apr 18 '11 12:04



4 Answers

Normally, you would not train on the SIFT features directly. Cluster them (using k-means) and then train on the histogram of cluster membership identifiers (i.e., a k-dimensional vector, which counts, at position i, how many features were assigned to the i-th cluster).

This way, you obtain a single output per image (and a single, k-dimensional, feature vector).
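To make the histogram concrete, here is a tiny illustration using NumPy only. The helper name `membership_histogram` and the five cluster assignments are made up for the example; in practice the assignments would come from the clustering step.

```python
import numpy as np

def membership_histogram(assignments, k):
    """Count how many local features were assigned to each of the k clusters."""
    return np.bincount(np.asarray(assignments, dtype=int), minlength=k)

# Hypothetical image: five local features assigned to clusters 0, 2, 2, 1, 2 (k = 4).
hist = membership_histogram([0, 2, 2, 1, 2], k=4)
print(hist.tolist())  # [1, 1, 3, 0]
```

However many features an image produces, its vector always has length k, which is what lets an ordinary classifier consume it.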

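Here's the quasi-code (using mahotas and milk in Python):

import numpy as np
from mahotas.surf import surf  # newer mahotas releases expose this as mahotas.features.surf
from milk.unsupervised.kmeans import kmeans, assign_centroids
import milk

# First load your data:
images = ...
labels = ...

# One array of SURF descriptors per image
local_features = [surf(im, 6, 4, 2) for im in images]
# Pool every descriptor and cluster them into 100 visual words
allfeatures = np.concatenate(local_features)
_, centroids = kmeans(allfeatures, k=100)
# Describe each image by its histogram of cluster assignments
histograms = []
for ls in local_features:
    hist = assign_centroids(ls, centroids, histogram=True)
    histograms.append(hist)

# Cross-validate a classifier on the per-image histograms
cmatrix, _ = milk.nfoldcrossvalidation(histograms, labels)
print("Accuracy:", (100.0 * cmatrix.trace()) / cmatrix.sum())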

luispedro answered Nov 15 '22 03:11


This is a fairly hard problem.

You can give the bag-of-words (BoW) model a try.

Basically, you extract SIFT features from all the images, then use k-means to cluster the features into visual words. After that, use the BoW vectors to train your classifiers.

See the Wikipedia article on the BoW model and the papers it references for more details.
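
The steps above could be sketched roughly as follows, using NumPy only. This is a minimal illustration, not a tuned implementation: the random arrays stand in for real SIFT descriptors, the function names (`build_vocabulary`, `nearest_word`, `bow_vector`) are made up for the example, and the k-means is a plain Lloyd's loop with a fixed iteration count.

```python
import numpy as np

def nearest_word(descriptors, centroids):
    """Index of the closest centroid (Euclidean distance) for each descriptor."""
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_vocabulary(descriptors, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (the "visual words")."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[start].copy()
    for _ in range(n_iter):
        nearest = nearest_word(descriptors, centroids)
        for j in range(k):
            members = descriptors[nearest == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def bow_vector(descriptors, centroids):
    """Histogram of visual-word counts for one image."""
    words = nearest_word(descriptors, centroids)
    return np.bincount(words, minlength=len(centroids))

# Synthetic stand-ins for SIFT descriptors: 3 images with 40, 55 and 30 keypoints.
rng = np.random.default_rng(1)
per_image = [rng.random((n, 128)) for n in (40, 55, 30)]

vocab = build_vocabulary(np.concatenate(per_image), k=8)
X = np.stack([bow_vector(d, vocab) for d in per_image])
print(X.shape)  # (3, 8)
```

Each row of `X` is one image's BoW vector; together with the labels, these rows are what you would hand to the classifier in place of the raw (feature, label) pairs.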

Zifei Tong answered Nov 15 '22 05:11


You probably need better alignment, and probably not more features. There is no way you can get acceptable performance unless you have correspondences. You need to know what points in one leaf correspond to points on another leaf. This is one of the "holy grail" problems in computer vision.

People have used shape context for this problem. You should probably look at this link. This paper describes the basic system behind leafsnap.

carlosdc answered Nov 15 '22 05:11


You can implement the BoW model by following the tutorial "Bag-of-Features Descriptor on SIFT Features with OpenCV". It is a very good walkthrough of implementing the BoW model in OpenCV.

Hosein Bitaraf answered Nov 15 '22 05:11