I'm using the SIFT feature detector in OpenCV 4.5.2. By tuning the nOctaveLayers parameter in cv::SIFT::create(), I get these results from detectAndCompute():
| nOctaveLayers | KeyPoints | Time Cost (ms) |
|---|---|---|
| 1 | 1026 | 63.41 |
| 2 | 1795 | 45.07 |
| 3 | 2043 | 45.74 |
| 4 | 2173 | 47.83 |
| 5 | 2224 | 51.86 |
To my understanding, there should be less computation with fewer octave layers, so why does SIFT cost significantly more time with only 1 octave layer?
I also tested detect() and compute() separately, and they both cost more time when nOctaveLayers is 1, which confuses me a lot.
The test image is here (from the TUM open dataset). Thanks in advance for any help.
[Edit for @Micka] My test code:
// img_src is the test image (loaded beforehand); test_num runs are averaged
const int test_num = 100;
const int layers = 5;
cout << "layers: " << layers << endl;
auto sift = SIFT::create(0, layers);
vector<KeyPoint> kps;
Mat descs;
auto t1 = chrono::high_resolution_clock::now();
for (int i = 0; i < test_num; ++i)
    sift->detectAndCompute(img_src, noArray(), kps, descs);
auto t2 = chrono::high_resolution_clock::now();
cout << "num of kps: " << kps.size() << endl;
cout << "avg time cost (ms): " << chrono::duration<double>(t2 - t1).count() * 1e3 / test_num << endl;
For each nOctaveLayers configuration, I change the layers value in the code, recompile, run, and record the result.
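(If you'd rather avoid recompiling for every configuration, the same measurement can be wrapped in a loop over the layer counts; this is just a minor variation of the snippet above, with img_src and test_num as before:)

for (int layers = 1; layers <= 5; ++layers) {
    auto sift = SIFT::create(0, layers);
    vector<KeyPoint> kps;
    Mat descs;
    auto t1 = chrono::high_resolution_clock::now();
    for (int i = 0; i < test_num; ++i)
        sift->detectAndCompute(img_src, noArray(), kps, descs);
    auto t2 = chrono::high_resolution_clock::now();
    cout << layers << " layers: " << kps.size() << " kps, "
         << chrono::duration<double>(t2 - t1).count() * 1e3 / test_num << " ms avg" << endl;
}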
After hours of profiling, I finally found out the reason: GaussianBlur.
The pipeline of the SIFT algorithm is:
1. Create the initial image: convert the source image to gray float, double the resolution, and do GaussianBlur (sigma = 1.56).
2. Build the Gaussian pyramid.
3. Find keypoints: build the DoG pyramid and find scale-space extrema.
4. Calculate descriptors.

The number of octaves is calculated from the image resolution (see here), and nOctaveLayers controls the number of layers (nOctaveLayers + 3 for the Gaussian pyramid) in each octave.
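For reference, here is a rough paraphrase of how the octave count falls out of the image size in the OpenCV source linked above (a sketch, not the exact implementation; variable names approximate):

// Sketch of OpenCV's octave-count computation (paraphrased from SIFT_Impl::detectAndCompute;
// `base` is the doubled-resolution initial image from step 1).
int firstOctave = -1;  // -1 because the resolution was doubled in step 1
int nOctaves = cvRound(std::log((double)std::min(base.cols, base.rows)) / std::log(2.) - 2)
               - firstOctave;
// e.g. a 1280x960 source doubled to 2560x1920 gives round(log2(1920) - 2) + 1 = 10 octaves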
Indeed, when nOctaveLayers increases, the numbers of layers and keypoints both increase, so the time cost of steps 3 and 4 increases. However, with parallel computation, this increase is not very significant (a few milliseconds).
In contrast, step 2 takes more than half of the total time: 25.27 ms (out of 43.49 ms) when nOctaveLayers is 3, and 51.16 ms (out of 63.10 ms) when nOctaveLayers is 1. So why does this happen?
It happens because the sigma passed to GaussianBlur() grows faster across the layers when there are fewer of them, and sigma is critical for the time GaussianBlur() takes: when ksize is left as Size(), the kernel size is derived from sigma, so a larger sigma means a larger kernel and more work per pixel. See the test below:
// Per-layer sigmas computed by cv::SIFT for nOctaveLayers = 1, 3 and 5
vector<double> sig1 = { 1.6, 2.77128, 5.54256, 11.0851 };
vector<double> sig3 = { 1.6, 1.22627, 1.54501, 1.94659, 2.45255, 3.09002 };
vector<double> sig5 = { 1.6, 0.9044, 1.03888, 1.19336, 1.37081, 1.57465, 1.8088, 2.07777 };
auto blurTest = [](const vector<double>& sigs, const string& label) {
    const int test_num = 100;
    auto t1 = chrono::high_resolution_clock::now();
    for (int i = 0; i < test_num; ++i) {
        // build one octave of a Gaussian pyramid with the given incremental sigmas
        vector<Mat> pyr(sigs.size());
        pyr[0] = Mat::zeros(960, 1280, CV_32FC1);
        for (size_t j = 1; j < sigs.size(); ++j)
            GaussianBlur(pyr[j - 1], pyr[j], Size(), sigs[j], sigs[j]);
    }
    auto t2 = chrono::high_resolution_clock::now();
    auto time = chrono::duration<double>(t2 - t1).count() * 1e3 / test_num;
    cout << label << ": " << time << " ms\n";
};
blurTest(sig1, "1");
blurTest(sig3, "3");
blurTest(sig5, "5");
/* output:
1: 45.3958 ms
3: 28.5943 ms
5: 31.4827 ms
*/
The code above simulates a single octave of buildGaussianPyramid() for nOctaveLayers = 1, 3, and 5; the sigma values are taken from cv::SIFT's own calculation. This explains why SIFT costs much more time when nOctaveLayers is 1.
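For completeness, those sigma values come from the per-layer incremental sigmas that buildGaussianPyramid() precomputes, roughly like this (paraphrasing the OpenCV source; treat the details as approximate):

// Sketch of how cv::SIFT precomputes per-layer (incremental) sigmas in buildGaussianPyramid().
// sigma is 1.6 by default; each octave has nOctaveLayers + 3 layers.
vector<double> sig(nOctaveLayers + 3);
double k = std::pow(2., 1. / nOctaveLayers);  // scale step between adjacent layers
sig[0] = sigma;
for (int i = 1; i < nOctaveLayers + 3; i++) {
    double sig_prev  = std::pow(k, (double)(i - 1)) * sigma;  // total blur of layer i-1
    double sig_total = sig_prev * k;                          // total blur wanted for layer i
    sig[i] = std::sqrt(sig_total * sig_total - sig_prev * sig_prev);  // incremental blur to apply
}

With nOctaveLayers = 1, k = 2, so the incremental sigmas roughly double every layer (1.6, 2.77, 5.54, 11.09) and the Gaussian kernels grow correspondingly large; with nOctaveLayers = 3 or 5, k is much closer to 1 and the sigmas stay small.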