I can't train the SVM to recognize my object. I'm trying to do this using SURF + Bag Of Words + SVM. My problem is that the classifier does not detect anything. All the results are 0.
Here is my code:
Ptr<FeatureDetector> detector = FeatureDetector::create("SURF");
Ptr<DescriptorExtractor> descriptors = DescriptorExtractor::create("SURF");
string to_string(const int val) {
int i = val;
std::string s;
std::stringstream out;
out << i;
s = out.str();
return s;
}
Mat compute_features(Mat image) {
vector<KeyPoint> keypoints;
Mat features;
detector->detect(image, keypoints);
KeyPointsFilter::retainBest(keypoints, 1500);
descriptors->compute(image, keypoints, features);
return features;
}
BOWKMeansTrainer addFeaturesToBOWKMeansTrainer(String dir, BOWKMeansTrainer& bowTrainer) {
DIR *dp;
struct dirent *dirp;
struct stat filestat;
dp = opendir(dir.c_str());
Mat features;
Mat img;
string filepath;
#pragma loop(hint_parallel(4))
for (; (dirp = readdir(dp));) {
filepath = dir + dirp->d_name;
cout << "Reading... " << filepath << endl;
if (stat( filepath.c_str(), &filestat )) continue;
if (S_ISDIR( filestat.st_mode )) continue;
img = imread(filepath, 0);
features = compute_features(img);
bowTrainer.add(features);
}
return bowTrainer;
}
void computeFeaturesWithBow(string dir, Mat& trainingData, Mat& labels, BOWImgDescriptorExtractor& bowDE, int label) {
DIR *dp;
struct dirent *dirp;
struct stat filestat;
dp = opendir(dir.c_str());
vector<KeyPoint> keypoints;
Mat features;
Mat img;
string filepath;
#pragma loop(hint_parallel(4))
for (;(dirp = readdir(dp));) {
filepath = dir + dirp->d_name;
cout << "Reading: " << filepath << endl;
if (stat( filepath.c_str(), &filestat )) continue;
if (S_ISDIR( filestat.st_mode )) continue;
img = imread(filepath, 0);
detector->detect(img, keypoints);
bowDE.compute(img, keypoints, features);
trainingData.push_back(features);
labels.push_back((float) label);
}
cout << string( 100, '\n' );
}
int main() {
initModule_nonfree();
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("FlannBased");
TermCriteria tc(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 10, 0.001);
int dictionarySize = 1000;
int retries = 1;
int flags = KMEANS_PP_CENTERS;
BOWKMeansTrainer bowTrainer(dictionarySize, tc, retries, flags);
BOWImgDescriptorExtractor bowDE(descriptors, matcher);
string dir = "./positive_large", filepath;
DIR *dp;
struct dirent *dirp;
struct stat filestat;
cout << "Add Features to KMeans" << endl;
addFeaturesToBOWKMeansTrainer("./positive_large/", bowTrainer);
addFeaturesToBOWKMeansTrainer("./negative_large/", bowTrainer);
cout << endl << "Clustering..." << endl;
Mat dictionary = bowTrainer.cluster();
bowDE.setVocabulary(dictionary);
Mat labels(0, 1, CV_32FC1);
Mat trainingData(0, dictionarySize, CV_32FC1);
cout << endl << "Extract bow features" << endl;
computeFeaturesWithBow("./positive_large/", trainingData, labels, bowDE, 1);
computeFeaturesWithBow("./negative_large/", trainingData, labels, bowDE, 0);
CvSVMParams params;
params.kernel_type=CvSVM::RBF;
params.svm_type=CvSVM::C_SVC;
params.gamma=0.50625000000000009;
params.C=312.50000000000000;
params.term_crit=cvTermCriteria(CV_TERMCRIT_ITER,100,0.000001);
CvSVM svm;
cout << endl << "Begin training" << endl;
bool res=svm.train(trainingData,labels,cv::Mat(),cv::Mat(),params);
svm.save("classifier.xml");
//CvSVM svm;
svm.load("classifier.xml");
VideoCapture cap(0); // open the default camera
if(!cap.isOpened()) // check if we succeeded
return -1;
Mat featuresFromCam, grey;
vector<KeyPoint> cameraKeyPoints;
namedWindow("edges",1);
for(;;)
{
Mat frame;
cap >> frame; // get a new frame from camera
cvtColor(frame, grey, CV_BGR2GRAY);
detector->detect(grey, cameraKeyPoints);
bowDE.compute(grey, cameraKeyPoints, featuresFromCam);
cout << svm.predict(featuresFromCam) << endl;
imshow("edges", frame);
if(waitKey(30) >= 0) break;
}
return 0;
}
You should know that I got the parameters from an existing project with a good results, so I thought they'll be useful in my code too (but eventually maybe not).
I have 310 positive images and 508 negative images. I tried to use equal numbers of positive and negative images but the result is the same. The object I want to detect is car steering wheel. Here is my dataset.
Do you have any idea what I'm doing wrong? Thank you!
First of all, using same parameters from an existing project doesn't prove that you are using correct parameters. In fact, in my opinion it is a completely nonsense approach (no offense). It is because, SVM parameters are affected from dataset and decriptor extraction method directly. In order to get correct parameters you have to do cross-validation. So if those parameters are obtained from a different recognition task it won't make any sense. For example in my face verification project optimal parameters were 0.0625 and 10 for gamma
and C
respectively.
Other important issue with your approach is test images. As far as I see from your code, you are not using images from disk to test your classifier, so from rest of here I'll do some assumptions. If your test images, that you are acquired from camera are different from your positive images, it will fail. By different I mean this; you have to be sure that your test images are composed only of steering wheels, because your training images contain only steering wheels. If your test image contains, for instance car seat with it, your BoW descriptor for test image will be completely different from your train images BoW descriptor. So, simply, your test images shouldn't contain steering wheels with some other objects, they should only contain steering wheels.
If you satisfy these, using training images for testing your system is the most basic approach. Even in that scenario you are failing you probably have some implenetation issues. Other approach can be this; split your training data into two, such that you have four partitions:
Use only train images for training the system and test it with the test images. And again, you have to specify parameters via cross-validation.
Other than these, you might want to check some specific steps in order to localize the problem, before doing the previous things that I wrote:
I hope I was able to emphasize the importance of the cross-validation. Do the cross-validation!
Good luck!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With