I have an OpenCV image-processing function that is called four times, on four different Mat objects.
void processBinary(Mat& binaryMat) {
//image processing
}
I want to multi-thread it so that all 4 method calls complete at the same time, but have the main thread wait until each thread is done.
Ex:
int main() {
    Mat m1, m2, m3, m4;
    //perform each of these methods simultaneously, but have main thread wait for all processBinary() calls to finish
    processBinary(m1);
    processBinary(m2);
    processBinary(m3);
    processBinary(m4);
}
What I hope to accomplish is to call processBinary() as many times as I need and have it take about as long as calling the method only once. I have looked up multithreading, but I am a little confused about creating threads and then joining or detaching them. I believe I need to instantiate each thread and then call join() on each one so that the main thread waits for all of them to finish, but there doesn't seem to be a significant improvement in execution time. Can anyone explain how I should go about multi-threading my program? Thanks!
EDIT: What I have tried:
//this does not significantly improve execution time. However, calling processBinary() only once does.
thread p1(&Detector::processBinary, *this, std::ref(m1));
thread p2(&Detector::processBinary, *this, std::ref(m2));
thread p3(&Detector::processBinary, *this, std::ref(m3));
thread p4(&Detector::processBinary, *this, std::ref(m4));
p1.join();
p2.join();
p3.join();
p4.join();
If you use the OpenCV library, beware that it spawns multiple threads for image processing internally. For example, cv::VideoCapture spawns multiple threads internally.
As long as no writes happen simultaneously to the reads, it is safe to have multiple concurrent reads.
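A minimal sketch of that rule, using only the standard library. The Image alias and the sumPixels() and sumConcurrently() helpers are hypothetical stand-ins (not OpenCV API): every thread reads the same shared buffer, and each one writes only to its own result slot, so no write ever overlaps a read.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for an image: a flat buffer of pixel values.
using Image = std::vector<unsigned char>;

// Only reads the shared image; writes only to its own output slot.
void sumPixels(const Image& shared, long& out) {
    out = std::accumulate(shared.begin(), shared.end(), 0L);
}

// Starts `workers` threads that all read the same image concurrently.
std::vector<long> sumConcurrently(const Image& shared, std::size_t workers) {
    std::vector<long> results(workers, 0);  // one slot per thread, sized up front
    std::vector<std::thread> threads;
    threads.reserve(workers);
    for (std::size_t i = 0; i < workers; ++i) {
        threads.emplace_back(sumPixels, std::cref(shared), std::ref(results[i]));
    }
    for (auto& t : threads) {
        t.join();  // the caller blocks here until every worker is done
    }
    return results;
}
```

The results vector is sized before any thread starts, so it never reallocates while the workers hold references into it.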
The slick way to achieve this is not to do the thread housekeeping yourself but use a library that provides micro-parallelization.
OpenCV itself can use Intel Threading Building Blocks (TBB) for exactly this task: running loops in parallel.
In your case, your loop has just four iterations. With C++11, you can write it down very easily using a lambda expression. In your example:
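#include <tbb/parallel_for.h>
// cv::Mat copies are shallow, so in-place changes are visible through m1..m4.
std::vector<cv::Mat> input = { m1, m2, m3, m4 };
// Capture by reference: processBinary() takes a non-const Mat&.
tbb::parallel_for(size_t(0), input.size(), size_t(1), [&](size_t i) {
    processBinary(input[i]);
});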
For this example I took code from here.
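If adding TBB is not an option, roughly the same pattern can be sketched with std::async from the standard library. This is only an illustration under assumptions: Image is a hypothetical stand-in for cv::Mat, processBinary() here is a placeholder that inverts pixels, and processAll() is a made-up helper name.

```cpp
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for cv::Mat: a flat buffer of pixel values.
using Image = std::vector<unsigned char>;

// Placeholder for the real processBinary(): inverts every pixel in place.
void processBinary(Image& img) {
    for (auto& px : img) {
        px = static_cast<unsigned char>(255 - px);
    }
}

// Launches one asynchronous task per image and returns once all have finished.
void processAll(std::vector<Image>& images) {
    std::vector<std::future<void>> tasks;
    tasks.reserve(images.size());
    for (Image& img : images) {
        // std::launch::async requests a separate thread for each task.
        tasks.push_back(std::async(std::launch::async, processBinary, std::ref(img)));
    }
    for (auto& t : tasks) {
        t.get();  // waits for the task and rethrows any exception it raised
    }
}
```

Each task touches a different image, so no locking is needed. Whether this actually runs faster depends on how many cores are free and on whether the work inside processBinary() is already parallelized internally.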