
Is it thread-safe to access a Mat with multiple threads in OpenCV?

I want to speed up an algorithm (complete local binary pattern with circular neighbours) in which I iterate through all pixels and compute values from each pixel's neighbours, so I need access to neighbouring pixels.

Currently I do this by iterating over all pixels with one thread/process. I want to parallelize the task by dividing the input image into multiple ROIs and processing each ROI separately (with multiple threads).

The problem is that the ROIs overlap (to calculate a pixel I sometimes need to look at neighbours far away), so multiple threads may access the same pixel data (READING) at the same time. Is it a problem if two or more threads read the same Mat at the same indices at the same time?

Is it also a problem if I write to the same Mat in parallel, but at different indices?

asked Aug 08 '19 20:08 by Math Library


2 Answers

Generally, parallel reading is not a problem, because a cv::Mat is just a nice wrapper around an array, much like std::vector (there are differences, but I don't see how they would affect this question, so I'm going to ignore them). However, parallelization doesn't automatically give you a performance boost. There are quite a few things to consider here:
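To make the read-only case concrete, here is a minimal sketch that uses a plain row-major buffer as a stand-in for a single-channel 8-bit cv::Mat. The `Image` struct and both function names are made up for illustration and are not OpenCV API. Two threads read row ranges that overlap, and each thread writes only to its own result variable:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for a single-channel 8-bit cv::Mat: a row-major pixel buffer.
struct Image {
    int rows;
    int cols;
    std::vector<std::uint8_t> data;
};

// Sums the pixels of rows [rowBegin, rowEnd). This function only reads from img.
std::uint64_t sumRows(const Image& img, int rowBegin, int rowEnd) {
    assert(0 <= rowBegin && rowBegin <= rowEnd && rowEnd <= img.rows);
    const auto first = img.data.begin() + static_cast<std::ptrdiff_t>(rowBegin) * img.cols;
    const auto last = img.data.begin() + static_cast<std::ptrdiff_t>(rowEnd) * img.cols;
    return std::accumulate(first, last, std::uint64_t{0});
}

// Splits the image into two bands that overlap by `overlap` rows on each side of the
// middle, so both threads read some of the same pixels at the same time.
std::vector<std::uint64_t> sumOverlappingBands(const Image& img, int overlap) {
    const int mid = img.rows / 2;
    const int lo = std::max(0, mid - overlap);
    const int hi = std::min(img.rows, mid + overlap);
    std::uint64_t upper = 0;
    std::uint64_t lower = 0;
    std::thread t1([&] { upper = sumRows(img, 0, hi); });         // reads rows [0, hi)
    std::thread t2([&] { lower = sumRows(img, lo, img.rows); });  // reads rows [lo, rows)
    t1.join();
    t2.join();
    return {upper, lower};
}
```

Nothing modifies `img.data` while the threads run, so the shared reads need no locking. The only writes go to `upper` and `lower`, and each of those is touched by exactly one thread.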

Creating a thread is resource-heavy and can have a large negative impact if the task is relatively short (in terms of computation time), so thread pooling has to be considered.
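As a rough illustration of thread pooling, the sketch below creates its worker threads once and reuses them for every submitted task. `ThreadPool` is a hypothetical minimal class written for this answer, not something OpenCV provides. OpenCV does ship cv::parallel_for_, which is worth checking before rolling your own.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal thread pool: workers are created once and reused for all tasks.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t workerCount) {
        for (std::size_t i = 0; i < workerCount; ++i) {
            workers_.emplace_back([this] { workerLoop(); });
        }
    }

    // Lets the workers drain the queue, then joins them.
    ~ThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    ThreadPool(const ThreadPool&) = delete;
    ThreadPool& operator=(const ThreadPool&) = delete;

    // Queues a task; one idle worker is woken to run it.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++pending_;
        }
        wake_.notify_one();
    }

    // Blocks until every task submitted so far has finished.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so other workers can proceed
            {
                std::lock_guard<std::mutex> lock(mutex_);
                --pending_;
            }
            idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    std::queue<std::function<void()>> tasks_;
    std::size_t pending_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

For an image you would typically submit one task per band of rows rather than one per pixel, so that each task is long enough to outweigh the queueing overhead.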

If you write high-performance code (whether multi- or single-threaded), you should have a grasp of how your hardware works. In this case: memory and CPU. There is a very good talk by Timur Doumler at CppCon 2016 on that topic. It should help you avoid cache misses.
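One small example of what "knowing your memory" means in practice: a cv::Mat stores each row contiguously, so the order of your loops decides whether you walk memory sequentially or jump by a full row on every step. The sketch below uses a plain row-major vector as a stand-in, and the function names are made up for illustration. Both functions return the same sum. The first one is typically the cache-friendly one, but measure on your own data before relying on that.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major traversal: the inner loop visits consecutive addresses.
std::uint64_t sumRowMajor(const std::vector<std::uint8_t>& px, std::size_t rows, std::size_t cols) {
    std::uint64_t sum = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            sum += px[r * cols + c];
        }
    }
    return sum;
}

// Column-major traversal of the same buffer: the inner loop jumps `cols` bytes per step,
// so on large images it tends to touch a different cache line on every access.
std::uint64_t sumColumnMajor(const std::vector<std::uint8_t>& px, std::size_t rows, std::size_t cols) {
    std::uint64_t sum = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            sum += px[r * cols + c];
        }
    }
    return sum;
}
```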

Also worth mentioning is compiler optimization. Turn it on. I know this sounds super obvious, but a lot of people on SO ask questions about performance without knowing what compiler optimization is.

Finally, there is the OpenCV Transparent API (TAPI), which basically runs the work on the GPU (via OpenCL) instead of the CPU. Almost all built-in OpenCV algorithms support the TAPI; you just have to pass a cv::UMat instead of a cv::Mat. The two types are convertible to each other. However, the conversion is time-intensive, because a UMat is basically an array in GPU memory (VRAM), which means the data has to be copied each time you convert it. Accessing VRAM also takes longer than accessing RAM (for the CPU, that is). Keep in mind that the CPU cannot access VRAM data without copying it to RAM, which means you cannot iterate over your pixels if you use cv::UMat. That is only possible if you write your own OpenCL or CUDA code so that your algorithm runs on the GPU.

On most consumer-grade PCs, using the GPU is usually by far the fastest method for sliding-window algorithms (basically anything that iterates over the pixels and performs a calculation around each pixel), but it also requires the most effort to implement. Of course, this only holds if the data buffer (your image) is large enough to make copying it to and from VRAM worthwhile.

For parallel writing: it's generally safe as long as the areas being written do not overlap. However, cache misses and false sharing (as pointed out by NathanOliver) are problems to consider.
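Putting the reading and writing rules together, here is a minimal sketch of the pattern from the question: every thread reads from a shared source buffer, including rows that belong to a neighbouring band, but writes only to its own band of rows in the destination. A simple window maximum stands in for the actual LBP computation, plain vectors stand in for cv::Mat, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

using Pixels = std::vector<std::uint8_t>;  // row-major, single channel

// Writes the (2*radius+1)^2 window maximum for rows [rowBegin, rowEnd) of dst.
// Reads src up to `radius` rows outside the band; writes only inside the band.
void localMaxRows(const Pixels& src, Pixels& dst, int rows, int cols, int radius,
                  int rowBegin, int rowEnd) {
    for (int r = rowBegin; r < rowEnd; ++r) {
        for (int c = 0; c < cols; ++c) {
            std::uint8_t best = 0;
            for (int rr = std::max(0, r - radius); rr <= std::min(rows - 1, r + radius); ++rr) {
                for (int cc = std::max(0, c - radius); cc <= std::min(cols - 1, c + radius); ++cc) {
                    best = std::max(best, src[static_cast<std::size_t>(rr) * cols + cc]);
                }
            }
            dst[static_cast<std::size_t>(r) * cols + c] = best;
        }
    }
}

// Splits the rows into contiguous bands and processes one band per thread.
Pixels localMaxParallel(const Pixels& src, int rows, int cols, int radius, int threadCount) {
    assert(threadCount > 0 && rows >= 0 && cols >= 0);
    assert(src.size() == static_cast<std::size_t>(rows) * cols);
    Pixels dst(src.size());
    std::vector<std::thread> threads;
    const int band = (rows + threadCount - 1) / threadCount;  // rows per thread, rounded up
    for (int begin = 0; begin < rows; begin += band) {
        const int end = std::min(rows, begin + band);
        threads.emplace_back([&src, &dst, rows, cols, radius, begin, end] {
            localMaxRows(src, dst, rows, cols, radius, begin, end);
        });
    }
    for (std::thread& t : threads) t.join();
    return dst;
}
```

Using whole-row bands keeps each thread's writes in one contiguous block of memory, so only the cache lines at band boundaries can be shared between threads. That limits false sharing without any explicit padding.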

answered Sep 30 '22 15:09 by Timo


As long as no writes happen simultaneously to the reads, it is safe to have multiple concurrent reads.

That holds for any sane system.

Consider the alternative:

If there were a race condition, it would mean that the memory storing your object gets modified during the read operation. If no memory storing the object is written to during the read, there is no possible interaction between the threads.

Lastly, if you look at the documentation:

https://docs.opencv.org/3.1.0/d3/d63/classcv_1_1Mat.html

You'll see two mentions of thread-safety:

Thus, it is safe to operate on the same matrices asynchronously in different threads.

They mention it in the context of reference counting, which is performed during matrix assignment. So, at the very least, assigning from the same matrix to two others can be done safely from multiple threads. This pretty much guarantees that simple read access is also thread-safe.
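As an analogy only (this is not OpenCV code), std::shared_ptr behaves in a similar way: copying the handle bumps an atomic reference count, and the data itself is never touched. The sketch below has several threads copy the same handle and then read through their own copy. `SharedPixels` and the function name are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <thread>
#include <vector>

using SharedPixels = std::shared_ptr<const std::vector<std::uint8_t>>;

// Each thread copies the handle (an atomic reference-count increment, not a pixel copy),
// reads every pixel through its copy, and stores the sum in its own slot.
std::vector<std::uint64_t> sumViaCopiedHandles(const SharedPixels& original, std::size_t threadCount) {
    std::vector<std::uint64_t> sums(threadCount, 0);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < threadCount; ++i) {
        threads.emplace_back([&original, &sums, i] {
            SharedPixels local = original;  // shares the buffer, bumps the count
            sums[i] = std::accumulate(local->begin(), local->end(), std::uint64_t{0});
        });  // local is destroyed here, which decrements the count again
    }
    for (std::thread& t : threads) t.join();
    return sums;
}
```

After all threads have joined, the reference count is back to where it started, and no thread ever wrote to the shared pixels.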

answered Sep 30 '22 16:09 by Jeffrey