using thrust::sort inside a thread

Tags: cuda, thrust

I would like to know whether thrust::sort() can be used inside a CUDA thread:

__global__
void mykernel(float* array, int arrayLength)
{
    int threadID = blockIdx.x * blockDim.x + threadIdx.x;
    // array is a vector in device global memory
    // is it possible to call thrust::sort from inside the thread?
    thrust::sort(array, array+arrayLength);
    // do something else with the array
}

If yes, does the sort launch other kernels to parallelize the sort?

Boraxis asked May 01 '14 07:05


2 Answers

Yes, thrust::sort can be combined with the thrust::seq execution policy to sort numbers sequentially within a single CUDA thread (or sequentially within a single CPU thread):

#include <thrust/sort.h>
#include <thrust/execution_policy.h>

__global__
void mykernel(float* array, int arrayLength)
{
  int threadID = blockIdx.x * blockDim.x + threadIdx.x;

  // each thread sorts array
  // XXX note this causes a data race
  thrust::sort(thrust::seq, array, array + arrayLength);
}

Note that your example causes a data race because each CUDA thread attempts to sort the same data in parallel. A correct race-free program would partition array according to thread index.

The thrust::seq execution policy, which this feature requires, is only available in Thrust v1.8 or later.
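The race-free partitioning mentioned above could look roughly like the following sketch, in which each thread sorts only its own contiguous chunk. The names sortChunks, sortChunksOnDevice, and chunkLength are hypothetical and not part of the question; the host helper exists only to make the sketch self-contained, and Thrust 1.8 or later is assumed.

```cuda
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <thrust/execution_policy.h>
#include <thrust/device_vector.h>
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Hypothetical kernel: thread i sorts elements [i * chunkLength, (i + 1) * chunkLength),
// clipped to arrayLength, so no two threads ever touch the same element.
__global__
void sortChunks(float* array, int arrayLength, int chunkLength)
{
  int threadID = blockIdx.x * blockDim.x + threadIdx.x;

  long long begin = (long long)threadID * chunkLength;
  if (begin >= arrayLength) return;  // this thread has no chunk

  long long end = begin + chunkLength;
  if (end > arrayLength) end = arrayLength;  // last chunk may be shorter

  // sequential sort, executed entirely by the calling thread
  thrust::sort(thrust::seq, array + begin, array + end);
}

// Hypothetical host helper: copies the input to the device, sorts each chunk
// independently, and returns the result.
std::vector<float> sortChunksOnDevice(const std::vector<float>& input, int chunkLength)
{
  assert(chunkLength > 0);

  thrust::device_vector<float> d_array(input.begin(), input.end());
  int arrayLength = (int)input.size();
  int numChunks = (arrayLength + chunkLength - 1) / chunkLength;

  if (numChunks > 0)
  {
    int threadsPerBlock = 128;  // assumed launch configuration
    int blocks = (numChunks + threadsPerBlock - 1) / threadsPerBlock;
    sortChunks<<<blocks, threadsPerBlock>>>(
        thrust::raw_pointer_cast(d_array.data()), arrayLength, chunkLength);
    cudaError_t err = cudaDeviceSynchronize();
    assert(err == cudaSuccess);
    (void)err;
  }

  std::vector<float> output(input.size());
  thrust::copy(d_array.begin(), d_array.end(), output.begin());
  return output;
}
```

For example, sortChunksOnDevice({3, 1, 2, 9, 7, 8, 5, 4}, 3) sorts the chunks {3, 1, 2}, {9, 7, 8} and {5, 4} independently; the array as a whole is not globally sorted.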

Jared Hoberock answered Sep 22 '22 22:09


@aland already referred you to an earlier answer about calling Thrust's parallel algorithms on the GPU. In that case the asker was simply trying to sort data that was already on the GPU; Thrust called from the CPU can handle GPU-resident data if you wrap the raw device pointers (for example in thrust::device_ptr) or use device vectors.
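A minimal sketch of that host-side approach follows, assuming the data already lives in a raw device allocation. The function names sortDeviceArray and sortViaDevice are hypothetical; the second one only does the round trip from and to host memory so that the sketch is self-contained.

```cuda
#include <thrust/device_ptr.h>
#include <thrust/sort.h>
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Called from host code. Wrapping the raw pointer in thrust::device_ptr tells
// Thrust that the data is on the GPU, so the parallel device sort is used.
void sortDeviceArray(float* d_array, int arrayLength)
{
  thrust::device_ptr<float> first(d_array);
  thrust::sort(first, first + arrayLength);
}

// Hypothetical round trip: host -> device, sort on the device, device -> host.
std::vector<float> sortViaDevice(const std::vector<float>& input)
{
  std::vector<float> output(input);
  if (input.empty()) return output;

  size_t bytes = input.size() * sizeof(float);
  float* d_array = nullptr;

  cudaError_t err = cudaMalloc(&d_array, bytes);
  assert(err == cudaSuccess);
  err = cudaMemcpy(d_array, input.data(), bytes, cudaMemcpyHostToDevice);
  assert(err == cudaSuccess);

  sortDeviceArray(d_array, (int)input.size());

  err = cudaMemcpy(output.data(), d_array, bytes, cudaMemcpyDeviceToHost);
  assert(err == cudaSuccess);
  (void)err;

  cudaFree(d_array);
  return output;
}
```

With this pattern the sort happens between kernel launches rather than inside a kernel, which is usually the simpler option when the whole array needs sorting.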

Assuming your question is different and you really want to call a parallel sort in the middle of your kernel (as opposed to breaking the kernel into multiple smaller kernels and calling sort in between), you should consider CUB, which provides a variety of primitives suitable for this purpose.
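As a rough illustration of the CUB route, the sketch below uses cub::BlockRadixSort so that all threads of a block cooperate to sort one tile of the array in the middle of a kernel. The tile size, the kernel name sortTiles, and the host helper sortTilesOnDevice are assumptions made for this example, and the input length is assumed to be a multiple of the tile size.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

constexpr int BLOCK_THREADS = 128;   // assumed block size
constexpr int ITEMS_PER_THREAD = 4;  // assumed keys per thread
constexpr int TILE_SIZE = BLOCK_THREADS * ITEMS_PER_THREAD;

// Each block cooperatively sorts its own tile of TILE_SIZE floats.
__global__
void sortTiles(float* data)
{
  using BlockRadixSort = cub::BlockRadixSort<float, BLOCK_THREADS, ITEMS_PER_THREAD>;
  __shared__ typename BlockRadixSort::TempStorage tempStorage;

  float* tile = data + blockIdx.x * TILE_SIZE;

  // Load keys in a blocked arrangement: thread t owns ITEMS_PER_THREAD consecutive items.
  float keys[ITEMS_PER_THREAD];
  for (int i = 0; i < ITEMS_PER_THREAD; ++i)
    keys[i] = tile[threadIdx.x * ITEMS_PER_THREAD + i];

  // Block-wide parallel sort; must be called by every thread of the block.
  BlockRadixSort(tempStorage).Sort(keys);

  // Store the sorted keys back using the same blocked arrangement.
  for (int i = 0; i < ITEMS_PER_THREAD; ++i)
    tile[threadIdx.x * ITEMS_PER_THREAD + i] = keys[i];

  // ... the kernel could continue working on the sorted tile here ...
}

// Hypothetical host helper; input.size() must be a multiple of TILE_SIZE.
std::vector<float> sortTilesOnDevice(const std::vector<float>& input)
{
  assert(input.size() % TILE_SIZE == 0);
  std::vector<float> output(input);
  if (input.empty()) return output;

  size_t bytes = input.size() * sizeof(float);
  float* d_data = nullptr;

  cudaError_t err = cudaMalloc(&d_data, bytes);
  assert(err == cudaSuccess);
  err = cudaMemcpy(d_data, input.data(), bytes, cudaMemcpyHostToDevice);
  assert(err == cudaSuccess);

  sortTiles<<<(int)(input.size() / TILE_SIZE), BLOCK_THREADS>>>(d_data);

  err = cudaMemcpy(output.data(), d_data, bytes, cudaMemcpyDeviceToHost);
  assert(err == cudaSuccess);
  (void)err;

  cudaFree(d_data);
  return output;
}
```

This sorts each tile independently; sorting a whole array that spans many blocks from inside one kernel would need more than this block-level primitive.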

Update: Also see @Jared's answer, in which he explains that you can call Thrust's sequential algorithms on the GPU as of Thrust 1.8.

Tom answered Sep 19 '22 22:09