
Are GPUs good for case-based image filtering?

I am trying to figure out whether a certain problem is a good candidate for using CUDA to put the problem on a GPU.

I am essentially doing a box filter that changes based on some edge detection. There are basically 8 cases that are tested for each pixel, and then the rest of the operations happen: typical mean calculations and such. Is the presence of these switch statements in my loop going to make this problem a bad candidate for the GPU?

I am not sure really how to avoid the switch statements, because this edge detection has to happen at every pixel. I suppose the entire image could have the edge detection part split out from the processing algorithm, and you could store a buffer corresponding to which filter to use for each pixel, but that seems like it would add a lot of pre-processing to the algorithm.
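The split described above can be sketched as two passes: one that writes a per-pixel case index into a buffer, and one that filters using that buffer. This is only a rough illustration. The function names, the two-case gradient test, and the 3-tap mean are assumptions standing in for the real 8-case edge detection and box filter.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pass 1: store one case index per pixel (row-major image, w*h floats).
// Hypothetical stand-in for the real 8-case edge test: case 1 means the
// horizontal gradient exceeds the threshold, case 0 means it does not.
std::vector<std::uint8_t> classifyPixels(const std::vector<float>& img,
                                         int w, int h, float threshold) {
    assert(img.size() == static_cast<std::size_t>(w) * h);
    std::vector<std::uint8_t> cases(img.size(), 0);
    for (int y = 0; y < h; ++y) {
        for (int x = 1; x + 1 < w; ++x) {
            const float gx = img[y * w + x + 1] - img[y * w + x - 1];
            cases[y * w + x] = std::fabs(gx) > threshold ? 1 : 0;
        }
    }
    return cases;
}

// Pass 2: filter using the precomputed case buffer. Case 0 pixels get a
// 3-tap horizontal mean; case 1 pixels (edges) and border columns are copied.
std::vector<float> applyFilters(const std::vector<float>& img,
                                const std::vector<std::uint8_t>& cases,
                                int w, int h) {
    assert(img.size() == cases.size());
    std::vector<float> out(img);
    for (int y = 0; y < h; ++y) {
        for (int x = 1; x + 1 < w; ++x) {
            const int i = y * w + x;
            if (cases[i] == 0) {
                out[i] = (img[i - 1] + img[i] + img[i + 1]) / 3.0f;
            }
        }
    }
    return out;
}
```

The extra cost is one byte per pixel plus one additional sweep over the image, which gives a rough idea of how much pre-processing the split would add.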

Edit: Just to give some context - this algorithm is already written, and OpenMP has been used to pretty good effect at speeding it up. However, the 8 cores on my development box pale in comparison to the 512 in the GPU.

Derek asked Jan 14 '11 15:01


2 Answers

Edge detection, mean calculations and cross-correlation can be implemented as 2D convolutions. Convolutions can be implemented very effectively on a GPU (speed-ups of more than 10x, and up to 100x, relative to a CPU), especially for large kernels. So yes, it may make sense to rewrite the image filtering for the GPU.

That said, I wouldn't use the GPU as the development platform for such a method.
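To make the convolution formulation concrete, here is a minimal CPU-side sketch of a 2D convolution with a box (mean) kernel. The names `convolve2d` and `boxKernel` and the clamp-to-edge border handling are assumptions chosen for illustration, not something taken from the question's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Apply a k*k kernel (k odd) to a row-major w*h image. Coordinates outside
// the image are clamped to the nearest valid pixel. The kernel is not
// flipped, which makes no difference for symmetric kernels such as a box.
std::vector<float> convolve2d(const std::vector<float>& img, int w, int h,
                              const std::vector<float>& kernel, int k) {
    assert(k % 2 == 1);
    assert(img.size() == static_cast<std::size_t>(w) * h);
    assert(kernel.size() == static_cast<std::size_t>(k) * k);
    const int r = k / 2;
    std::vector<float> out(img.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sum = 0.0f;
            for (int j = -r; j <= r; ++j) {
                for (int i = -r; i <= r; ++i) {
                    const int yy = std::max(0, std::min(y + j, h - 1));
                    const int xx = std::max(0, std::min(x + i, w - 1));
                    sum += kernel[(j + r) * k + (i + r)] * img[yy * w + xx];
                }
            }
            out[y * w + x] = sum;
        }
    }
    return out;
}

// Box (mean) kernel: k*k equal weights that sum to 1.
std::vector<float> boxKernel(int k) {
    const std::size_t n = static_cast<std::size_t>(k) * k;
    return std::vector<float>(n, 1.0f / static_cast<float>(n));
}
```

Every output pixel is computed independently from the same fixed pattern of reads, which is the property that lets convolutions map well onto a GPU.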

sastanin answered Sep 29 '22 12:09


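Typically, unless you are on the new CUDA architecture, you will want to avoid branching. Because GPUs are basically SIMD machines, threads that execute together (a warp, in CUDA terms) share one instruction stream. When threads in the same warp take different branches, the hardware runs each path in turn with the other threads masked off, so heavily divergent code can lose much of its throughput. The cost comes from this divergence rather than from branch misprediction.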

If you think that there are significant benefits to be gained by using a GPU, do some preliminary benchmarks to get a rough idea.

If you want to learn a bit about how to write non-branching code, head over to http://cellperformance.beyond3d.com/ and have a look.
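As one illustration of the idea, a per-pixel switch can sometimes be replaced by a table lookup, so that every pixel executes the same instructions and only the data differs. The sketch below assumes a 3-tap filter and made-up weights; `kWeights` and `filterTap` are hypothetical names, and the real eight kernels would replace the table contents.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical weight table: one row of 3-tap weights per case. These values
// are placeholders, not the weights of the algorithm in the question.
constexpr float kWeights[8][3] = {
    {0.25f, 0.5f, 0.25f},  // case 0: smooth across both neighbours
    {0.0f, 1.0f, 0.0f},    // case 1: pass the pixel through unchanged
    {0.5f, 0.5f, 0.0f},    // case 2: average with the left neighbour only
    {0.0f, 0.5f, 0.5f},    // case 3: average with the right neighbour only
    {0.0f, 1.0f, 0.0f},    // cases 4-7: pass-through placeholders
    {0.0f, 1.0f, 0.0f},
    {0.0f, 1.0f, 0.0f},
    {0.0f, 1.0f, 0.0f},
};

// Same arithmetic for every case; the case index only selects the weights,
// so no switch or if is needed in the per-pixel work.
float filterTap(float left, float center, float right, std::uint8_t caseId) {
    assert(caseId < 8);
    const float* wgt = kWeights[caseId];
    return wgt[0] * left + wgt[1] * center + wgt[2] * right;
}
```

This trades the branch for a few wasted multiplications by zero, which is often a good exchange on SIMD hardware but is worth measuring for the actual kernels.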

Further, running this problem on multiple CPU cores might also be worth investigating, in which case you will probably want to look into either OpenCL or Intel's parallelism libraries (such as TBB).

Another go-to source for problems targeting the GPU, be it graphics, computational geometry or otherwise, is IDAV, the Institute for Data Analysis and Visualization: http://idav.ucdavis.edu

awdz9nld answered Sep 29 '22 13:09