Depth-first search in CUDA / OpenCL

Question

I'm half-way through implementing parallel depth-first search algorithm in MPI and I'm thinking about trying to also do it in CUDA / OpenCL, just for fun / out of curiosity. The algorithm is simple but not trivial. The single-core version in C is about 200 lines of code.

How much is GPGPU suitable for this kind of problem?

phoad · Accepted Answer

Tree search operations are not so simple to be implemented in CUDA. There are some papers, like the one

"Lessons Learned from Exploring the Backtracking Paradigm on the GPU" http://dl.acm.org/citation.cfm?id=2033458

And another rather simple implementation (not quite a massively parallelized implementation in my opinion)

"Accelerating Large Graph Algorithms on the GPU Using CUDA" Pawan Harish and P. J. Narayanan

The difficulty comes from the fact that, tree operations generally involve decision making and according to the decisions different branches are taken. So massively parallelizing the operations without overlapping and making redundant operations is quite hard.

There are some approaches which use Stack and Queue implementations to traverse Trees.

You may find a similar question in here: Error: BFS on CUDA Synchronization

Depth-first search in CUDA / OpenCL

Tags:

parallel-processing

cuda

depth-first-search

gpgpu

opencl

fhucho

1 Answers

phoad

Recent Activity

Donate For Us

Depth-first search in CUDA / OpenCL

Tags:

parallel-processing

cuda

depth-first-search

gpgpu

opencl

fhucho

1 Answers

phoad

Related questions

Recent Activity

Donate For Us