For the following code, here is a bit of context.
Mat img0; // 1280x960 grayscale
--
timer.start();
for (int i = 0; i < img0.rows; i++)
{
    vector<double> v;
    uchar* p = img0.ptr<uchar>(i);
    for (int j = 0; j < img0.cols; ++j)
    {
        v.push_back(p[j]);
    }
}
cout << "Single thread " << timer.end() << endl;
and
timer.start();
concurrency::parallel_for(0, img0.rows, [&img0](int i) {
    vector<double> v;
    uchar* p = img0.ptr<uchar>(i);
    for (int j = 0; j < img0.cols; ++j)
    {
        v.push_back(p[j]);
    }
});
cout << "Multi thread " << timer.end() << endl;
The result:
Single thread 0.0458856
Multi thread 0.0329856
The speedup is hardly noticeable.
My processor is Intel i5 3.10 GHz
RAM 8 GB DDR3
EDIT
I tried also a slightly different approach.
vector<Mat> imgs = split(img0, 2,1); // `split` is my custom function that, in this case, splits `img0` into two images, its left and right half
--
timer.start();
concurrency::parallel_for(0, (int)imgs.size(), [imgs](int i) {
    Mat img = imgs[i];
    vector<double> v;
    for (int row = 0; row < img.rows; row++)
    {
        uchar* p = img.ptr<uchar>(row);
        for (int col = 0; col < img.cols; ++col)
        {
            v.push_back(p[col]);
        }
    }
});
cout << " Multi thread Sectored " << timer.end() << endl;
And I get much better result:
Multi thread Sectored 0.0232881
So, it looks like I was creating 960 threads or something when I ran
parallel_for(0, img0.rows, ...
And that didn't work well.
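For anyone who wants to try the "few large chunks" idea without OpenCV or PPL, here is a minimal sketch of it. It is only an illustration under assumptions: the image is a plain row-major `std::vector<std::uint8_t>` instead of a `Mat`, the threads are `std::thread` instead of `concurrency::parallel_for`, and the per-pixel work is a simple sum so that the result can be checked. The name `sumPixelsInBands` and its parameters are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Sums every pixel of a row-major 8-bit image, giving each worker thread one
// contiguous band of rows. Hypothetical helper, not OpenCV or PPL API.
std::uint64_t sumPixelsInBands(const std::vector<std::uint8_t>& pixels,
                               int rows, int cols, int bands)
{
    assert(rows >= 0 && cols >= 0 && bands > 0);
    assert(pixels.size() == static_cast<std::size_t>(rows) * cols);

    // One result slot per band, so threads never write to the same element.
    std::vector<std::uint64_t> partial(static_cast<std::size_t>(bands),
                                       std::uint64_t{0});
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(bands));

    for (int b = 0; b < bands; ++b)
    {
        // Band b covers rows [first, last); the bands tile the image exactly.
        const int first = static_cast<int>(static_cast<long long>(rows) * b / bands);
        const int last  = static_cast<int>(static_cast<long long>(rows) * (b + 1) / bands);
        workers.emplace_back([&pixels, &partial, cols, first, last, b] {
            std::uint64_t local = 0; // accumulate locally, store once at the end
            const std::uint8_t* p =
                pixels.data() + static_cast<std::size_t>(first) * cols;
            const std::size_t count = static_cast<std::size_t>(last - first) * cols;
            for (std::size_t k = 0; k < count; ++k)
                local += p[k];
            partial[static_cast<std::size_t>(b)] = local;
        });
    }
    for (std::thread& t : workers)
        t.join();

    std::uint64_t total = 0;
    for (std::uint64_t s : partial)
        total += s;
    return total;
}
```

Calling it with `bands` set to the number of cores (for example `std::thread::hardware_concurrency()`) gives one task per core rather than one task per row. Whether that is actually faster on a given machine is something to measure, not assume.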
(I must add that Kenney's comment is correct. Do not read too much into the specific numbers I stated here. When measuring intervals as small as these, the variation from run to run is high. In general, though, splitting the image in half as described in the edit improved performance compared to the old approach.)
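Since single measurements this short are noisy, one common way to get more stable numbers is to repeat the run several times and report the median. The following is a small sketch of that, assuming `std::chrono::steady_clock` as the clock. `medianSeconds` is a hypothetical helper written for this example, not the `timer` used above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs `work` `repeats` times and returns the middle sample of the sorted
// wall-clock times, in seconds (the upper median when `repeats` is even).
template <typename Work>
double medianSeconds(Work work, int repeats)
{
    assert(repeats > 0);

    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(repeats));
    for (int r = 0; r < repeats; ++r)
    {
        const auto t0 = std::chrono::steady_clock::now();
        work();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

It can wrap either loop as a lambda, for example `medianSeconds([&] { /* single-thread loop */ }, 25)`, so that both variants are compared over the same number of repetitions.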
I think your problem is that you are limited by memory bandwidth. Your second snippet is basically reading the whole of the image, and all of that has to come out of main memory into cache (or out of L2 cache into L1 cache).
You need to arrange your code so that all four cores are working on the same bit of memory at once (I presume you are not actually trying to optimize this code - it is just a simple example).
Edit: Insert crucial "not" in last parenthetical remark.