Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Creating a linked list using CUDA

Is it possible to create a linked list on a GPU using CUDA?
I am trying to do this and I am encoutering some difficulties.
If I can't allocate dynamic memory in a CUDA kernel, then how can I create a new node and add it to the linked list?

like image 203
scatman Avatar asked Oct 20 '10 06:10

scatman


2 Answers

You really don't want to do this if you can help it - the best thing you can do if you can't get away from linked lists is to emulate them via arrays and use array indices rather than pointers for your links.

like image 106
Paul R Avatar answered Sep 18 '22 02:09

Paul R


There are some valid use cases for linked lists on a GPU. Consider using a Skip List as an alternative as they provide faster operations. There are examples of highly concurrent Skip List algorithms available via Google searches.

Check out this link http://www.cse.iitk.ac.in/users/mainakc/lockfree.html/ for CUDA code a PDF and PPT presentation on a number of lock free CUDA data structures.

Link Lists can be constructed in parallel using a reduction algorithm approach. This assumes that ALL members are known at construction time. Each thread starts by connecting 2 nodes. Then half the threads connect the 2 node segments together and so on, reducing the number threads by 2 each iteration. This will build a list in log2 N time.

Memory allocation is a constraint. Pre-allocate all the nodes in an array on the host. Then you can use array subscripts in place of pointers. That has the advantage that the list traversal is valid on the GPU and the host.

For concurrency you need to use CUDA atomic operations. Atomic add/increment to count the nodes used from the node array and Compare and Swap to to set the links between nodes.

Again carefully consider the use case and access patterns. Using one large linked list is very serial. Using 100 - 100's of small Linked list is more parallel. I expect the memory access be uncoalesced unless care is taken to allocate connected nodes in adjacent memory locations.

like image 35
Tim Child Avatar answered Sep 19 '22 02:09

Tim Child