I have a simple algorithm which converts a Bayer image channel (BGGR, RGGB, GBRG, GRBG) to RGB (demosaicing, but without neighbors). In my implementation I have pre-set offset vectors which help me translate the Bayer channel index to its corresponding RGB channel indices. The only problem is that I'm getting awful performance in debug mode with MSVC11. Under release, for an input of size 3264x2540, the function completes in ~60ms. For the same input in debug, the function completes in ~20,000ms. That's more than a 300x difference, and since some developers are running my application in debug, it's unacceptable.
My code:
void ConvertBayerToRgbImageDemosaic(int* BayerChannel, int* RgbChannel, int Width, int Height, ColorSpace ColorSpace)
{
int rgbOffsets[4]; //translates color location in Bayer block to its location in RGB block. So R->0, G->1, B->2
std::vector<int> bayerToRgbOffsets[4]; //the offsets from every color in the Bayer block to (bayer) indices it will be copied to (R,B are copied to all indices, Gr to R and Gb to B).
//calculate offsets according to color space
switch (ColorSpace)
{
case ColorSpace::BGGR:
/*
B G
G R
*/
rgbOffsets[0] = 2; //B->2
rgbOffsets[1] = 1; //G->1
rgbOffsets[2] = 1; //G->1
rgbOffsets[3] = 0; //R->0
//B is copied to every pixel in its block
bayerToRgbOffsets[0].push_back(0);
bayerToRgbOffsets[0].push_back(1);
bayerToRgbOffsets[0].push_back(Width);
bayerToRgbOffsets[0].push_back(Width + 1);
//Gb is copied to its neighbouring B
bayerToRgbOffsets[1].push_back(-1);
bayerToRgbOffsets[1].push_back(0);
//Gr is copied to its neighbouring R
bayerToRgbOffsets[2].push_back(0);
bayerToRgbOffsets[2].push_back(1);
//R is copied to every pixel in its block
bayerToRgbOffsets[3].push_back(-Width - 1);
bayerToRgbOffsets[3].push_back(-Width);
bayerToRgbOffsets[3].push_back(-1);
bayerToRgbOffsets[3].push_back(0);
break;
//... other color spaces
}
for (auto row = 0; row < Height; row++)
{
for (auto col = 0, bayerIndex = row * Width; col < Width; col++, bayerIndex++)
{
auto colorIndex = (row%2)*2 + (col%2); //0...3, For example in BGGR: 0->B, 1->Gb, 2->Gr, 3->R
//iteration over bayerToRgbOffsets is O(1) since it is either sized 2 or 4.
std::for_each(bayerToRgbOffsets[colorIndex].begin(), bayerToRgbOffsets[colorIndex].end(),
[&](int colorOffset)
{
auto rgbIndex = (bayerIndex + colorOffset) * 3 + rgbOffsets[colorIndex]; //rgbOffsets is indexed by the color's position in the Bayer block
RgbChannel[rgbIndex] = BayerChannel[bayerIndex];
});
}
}
}
What I've tried:
I tried turning on optimization (/O2) for the debug build, with no significant difference.
I tried replacing the inner for_each statement with a plain old for loop, but to no avail.
I have a very similar algorithm which converts Bayer to "green" RGB (without copying the data to neighboring pixels in the block) in which I'm not using std::vector, and there I see the expected runtime difference between debug and release (2x-3x). So, could the std::vector be the problem? If so, how do I overcome it?
Lots of your code could be completely removed or rewritten in Release mode. The resulting executable will most likely not match up with your written code. Because of this release mode will run faster than debug mode due to the optimizations.
But the reality is debug build is much faster than release build. The release build normally takes more than 0.30ms, while the debug build takes under 0.3.
One of the main reasons that the debug version is significantly slower is these extra diagnostics. As to why you would want to run in Debug: those extra diagnostics do lots of useful work that helps you catch bugs in your program, so that the release build has a better chance of working.
It's because the optimizer schedules registers completely differently, trying to make code run fast, while the debug compiler tries to preserve values of temporary variables so you can read them from the debugger.
As you use std::vector, it will help to disable iterator debugging.
MSDN shows how to do it.
In simple terms, make this #define before you include any STL headers:
#define _HAS_ITERATOR_DEBUGGING 0
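A minimal sketch of where the define would sit in a source file. SumOffsets is a hypothetical helper used only for illustration. The setting generally has to be identical in every translation unit and library that is linked together, so a project-wide preprocessor definition may be safer than a per-file define. Newer MSVC versions document _ITERATOR_DEBUG_LEVEL as the replacement macro, so it is worth checking which one your toolchain expects.

```cpp
// The define has to come before the first standard library header
// in every translation unit that is compiled into the program.
#define _HAS_ITERATOR_DEBUGGING 0

#include <cassert>
#include <vector>

// Hypothetical helper (not from the question): walks a vector with
// iterators, which is the kind of access that iterator debugging checks.
int SumOffsets(const std::vector<int>& offsets)
{
    int sum = 0;
    for (std::vector<int>::const_iterator it = offsets.begin(); it != offsets.end(); ++it)
    {
        sum += *it;
    }
    return sum;
}
```

On other compilers the macro is simply ignored, so the file still builds there.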
In my experience, this gives a major boost in performance of Debug builds, although of course you do lose some Debugging functionality.
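If losing iterator debugging is not an option, another approach (a sketch of mine, not something the answer above proposes) is to keep std::vector out of the hot loop by storing the offsets in fixed-size arrays. OffsetList and ConvertBggrToRgb are hypothetical names, only the BGGR case is shown, and Width and Height are assumed to be even, as the question's offsets already imply.

```cpp
#include <cassert>

// Hypothetical fixed-size replacement for std::vector<int>:
// the question only ever stores 2 or 4 offsets per color.
struct OffsetList
{
    int count;       // number of valid entries in offsets
    int offsets[4];  // Bayer-index offsets, unused entries are 0
};

// Sketch of the BGGR case only. Assumes width and height are even.
void ConvertBggrToRgb(const int* bayer, int* rgb, int width, int height)
{
    // Position in the Bayer block -> channel in the RGB triple (R=0, G=1, B=2).
    const int rgbOffsets[4] = { 2, 1, 1, 0 }; // B, Gb, Gr, R

    const OffsetList bayerToRgbOffsets[4] =
    {
        { 4, { 0, 1, width, width + 1 } },     // B  -> every pixel in its block
        { 2, { -1, 0, 0, 0 } },                // Gb -> neighbouring B and itself
        { 2, { 0, 1, 0, 0 } },                 // Gr -> itself and neighbouring R
        { 4, { -width - 1, -width, -1, 0 } }   // R  -> every pixel in its block
    };

    for (int row = 0; row < height; ++row)
    {
        for (int col = 0, bayerIndex = row * width; col < width; ++col, ++bayerIndex)
        {
            const int colorIndex = (row % 2) * 2 + (col % 2); // 0..3
            const OffsetList& list = bayerToRgbOffsets[colorIndex];
            for (int i = 0; i < list.count; ++i)
            {
                const int rgbIndex = (bayerIndex + list.offsets[i]) * 3 + rgbOffsets[colorIndex];
                rgb[rgbIndex] = bayer[bayerIndex];
            }
        }
    }
}
```

Plain arrays involve no checked iterators, so the debug build should not pay the per-element bookkeeping that std::vector iteration incurs. Whether that closes the whole gap would need to be measured.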