Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Depth-first search in CUDA / OpenCL

I'm half-way through implementing parallel depth-first search algorithm in MPI and I'm thinking about trying to also do it in CUDA / OpenCL, just for fun / out of curiosity. The algorithm is simple but not trivial. The single-core version in C is about 200 lines of code.

How much is GPGPU suitable for this kind of problem?

like image 235
fhucho Avatar asked Oct 01 '12 10:10

fhucho


1 Answers

Tree search operations are not so simple to be implemented in CUDA. There are some papers, like the one

  • "Lessons Learned from Exploring the Backtracking Paradigm on the GPU" http://dl.acm.org/citation.cfm?id=2033458

And another rather simple implementation (not quite a massively parallelized implementation in my opinion)

  • "Accelerating Large Graph Algorithms on the GPU Using CUDA" Pawan Harish and P. J. Narayanan

The difficulty comes from the fact that, tree operations generally involve decision making and according to the decisions different branches are taken. So massively parallelizing the operations without overlapping and making redundant operations is quite hard.

There are some approaches which use Stack and Queue implementations to traverse Trees.

You may find a similar question in here: Error: BFS on CUDA Synchronization

like image 84
phoad Avatar answered Sep 19 '22 12:09

phoad