
Spatial pyramid matching (SPM) for SIFT then input to SVM in C++

I am trying to classify MRI images of brain tumors as benign or malignant using C++ and OpenCV. I am planning to use the bag-of-words (BoW) method after clustering SIFT descriptors with k-means. That is, I will represent each image as a histogram with the entries of the whole "codebook"/dictionary on the x-axis and their occurrence counts in the image on the y-axis. These histograms will then be the input to my SVM classifier (with an RBF kernel).
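A minimal sketch of that histogram step, assuming the SIFT descriptors and the k-means centres are already available as plain float vectors. The names `Descriptor`, `nearestWord` and `bowHistogram` are hypothetical and not part of OpenCV, and the final L1 normalisation is an added assumption so that images with different numbers of keypoints stay comparable:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical alias: one local descriptor (e.g. a 128-D SIFT vector).
using Descriptor = std::vector<float>;

// Index of the codebook centre closest to d (squared Euclidean distance).
std::size_t nearestWord(const Descriptor& d,
                        const std::vector<Descriptor>& codebook) {
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t k = 0; k < codebook.size(); ++k) {
        assert(codebook[k].size() == d.size());
        float dist = 0.0f;
        for (std::size_t j = 0; j < d.size(); ++j) {
            const float diff = d[j] - codebook[k][j];
            dist += diff * diff;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = k;
        }
    }
    return best;
}

// One bin per visual word; each descriptor votes for its nearest word.
std::vector<float> bowHistogram(const std::vector<Descriptor>& descriptors,
                                const std::vector<Descriptor>& codebook) {
    std::vector<float> hist(codebook.size(), 0.0f);
    for (const Descriptor& d : descriptors) {
        hist[nearestWord(d, codebook)] += 1.0f;
    }
    // Assumption: divide by the descriptor count so the bins sum to 1.
    if (!descriptors.empty()) {
        for (float& h : hist) {
            h /= static_cast<float>(descriptors.size());
        }
    }
    return hist;
}
```

Calling `bowHistogram(descriptors, codebook)` once per image yields one fixed-length vector per image, which is the shape of input an SVM expects.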

However, the disadvantage of BoW is that it ignores the spatial information of the descriptors in the image. Someone suggested using SPM instead. I read about it and came across this link, which gives the following steps:

  1. Compute K visual words from the training set and map each local feature to its visual word.
  2. For each image, initialize K multi-resolution coordinate histograms to zero. Each coordinate histogram consists of L levels, and each level i has 4^i cells that evenly partition the current image.
  3. For each local feature in this image (let's say its visual word ID is k), pick out the k-th coordinate histogram, and then add one count to each of the L corresponding cells in this histogram, according to the coordinates of the local feature. The L cells are the cells that the local feature falls into at the L different resolutions.
  4. Concatenate the K multi-resolution coordinate histograms to form a final "long" histogram of the image. When concatenating, the k-th histogram is weighted by the probability of the k-th visual word.
  5. To compute the kernel value over two images, sum up all the cells of the intersection of their "long" histograms.
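Steps 2, 3 and 5 above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Feature` struct and all function names are hypothetical, feature coordinates are assumed to lie within the image bounds, and the per-word weighting from step 4 is left out because the steps do not say how that probability is obtained:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: a local feature after quantisation, i.e. its pixel
// position plus the ID of the visual word it was mapped to (step 1).
struct Feature {
    float x;
    float y;
    std::size_t word;
};

// Cells in one multi-resolution histogram: 4^0 + 4^1 + ... + 4^(levels-1).
std::size_t cellsPerWord(std::size_t levels) {
    std::size_t total = 0;
    std::size_t cells = 1;
    for (std::size_t i = 0; i < levels; ++i) {
        total += cells;
        cells *= 4;
    }
    return total;
}

// Steps 2-4 without the weighting: the "long" histogram, laid out
// word by word, then level by level, then cell by cell (row-major).
std::vector<float> pyramidHistogram(const std::vector<Feature>& features,
                                    float width, float height,
                                    std::size_t numWords, std::size_t levels) {
    const std::size_t perWord = cellsPerWord(levels);
    std::vector<float> hist(numWords * perWord, 0.0f);
    for (const Feature& f : features) {
        assert(f.word < numWords);
        std::size_t levelOffset = 0;
        for (std::size_t i = 0; i < levels; ++i) {
            const std::size_t grid = std::size_t{1} << i;  // 2^i cells per side
            const float g = static_cast<float>(grid);
            // Assumes 0 <= x <= width and 0 <= y <= height; the clamp keeps
            // points on the right/bottom border inside the last cell.
            const std::size_t cx =
                std::min(grid - 1, static_cast<std::size_t>(f.x / width * g));
            const std::size_t cy =
                std::min(grid - 1, static_cast<std::size_t>(f.y / height * g));
            hist[f.word * perWord + levelOffset + cy * grid + cx] += 1.0f;
            levelOffset += grid * grid;
        }
    }
    return hist;
}

// Step 5: histogram intersection, the sum of bin-wise minima.
float intersectionKernel(const std::vector<float>& a,
                         const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::min(a[i], b[i]);
    }
    return sum;
}
```

The position of a feature enters only through which cell it increments at each level, which is how a plain count histogram ends up carrying coarse coordinate information.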

Now, I have the following questions:

  1. What is a coordinate histogram? Doesn't a histogram just show the counts for each group on the x-axis? How can it provide information about the coordinates of a point?
  2. How would I compute the probability of the k-th visual word?
  3. What is the use of the "kernel value" that I will get? How will I use it as input to the SVM? If I understand it right, the kernel value is used in the testing phase and not in the training phase? If so, how will I train my SVM?
  4. Or do you think I don't need to burden myself with the spatial info and should just stick with normal BoW for my situation (benign vs. malignant tumors)?

Someone please help this poor little undergraduate. You'll have my eternal gratitude if you do. If you need any clarification, please don't hesitate to ask.

noobalert asked Sep 28 '22 01:09


1 Answer

Here is the link to the actual paper: http://www.csd.uwo.ca/~olga/Courses/Fall2014/CS9840/Papers/lazebnikcvpr06b.pdf

MATLAB code is provided here: http://web.engr.illinois.edu/~slazebni/research/SpatialPyramid.zip

A coordinate histogram (mentioned in your post) is just a histogram computed over one sub-region of the image. These slides explain it visually: http://web.engr.illinois.edu/~slazebni/slides/ima_poster.pdf

You have multiple histograms here, one for each region of the image. The probability (or the number of items) would depend on the SIFT points in that sub-region.

I think you need to define your pyramid kernel as mentioned in the slides.
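For reference, the pyramid match kernel in the paper linked above has, as far as I can reconstruct it, the following form. Here $H_X^l(i)$ is the count in cell $i$ of image $X$ at level $l$ for a single visual word, $D = 4^l$ is the number of cells at that level, the levels run from $0$ to $L$, and $M$ is the number of visual words. Please check the notation against the paper itself before relying on it:

```latex
\begin{align}
  I^l &= \sum_{i=1}^{D} \min\bigl(H_X^l(i),\, H_Y^l(i)\bigr) \\
  \kappa^L(X, Y) &= I^L + \sum_{l=0}^{L-1} \frac{1}{2^{L-l}} \bigl(I^l - I^{l+1}\bigr)
                  = \frac{1}{2^L}\, I^0 + \sum_{l=1}^{L} \frac{1}{2^{L-l+1}}\, I^l \\
  K^L(X, Y) &= \sum_{m=1}^{M} \kappa^L(X_m, Y_m)
\end{align}
```

Because $\min(wa, wb) = w \min(a, b)$ for $w > 0$, scaling every bin of the concatenated histogram by its level weight and then taking a plain histogram intersection gives the same value. The kernel is needed in training as well as in testing: training uses the kernel values between all pairs of training images, and prediction uses the values between a test image and the support vectors. In OpenCV, `cv::ml::SVM` lists a histogram-intersection kernel type (`cv::ml::SVM::INTER`), so feeding it the weighted long histograms may be enough, though I have not verified that combination on this task.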

A Convolutional Neural Network may be better suited for your task if you have enough training samples. You can probably have a look at Torch or Caffe.

Bharat answered Sep 29 '22 17:09