grid_group not found in CUDA 9

Question

I tried using Cooperative Groups in CUDA 9, but I get a compile error.
Does anyone know how to fix it?

The development environment is as follows:

CUDA 9
Kepler K80
Compute Capability: 3.7

#include <cstdint>
#include <iostream>
#include <vector>

#include <cooperative_groups.h>

__global__
void kernel(uint32_t values[])
{
    using namespace cooperative_groups;

    grid_group g = this_grid();
}

int main(void)
{
    constexpr uint32_t kNum = 1 << 24; 
    std::vector<uint32_t> h_values(kNum);
    uint32_t *d_values;

    cudaMalloc(&d_values, sizeof(uint32_t) * kNum);
    cudaMemcpy(d_values, h_values.data(), sizeof(uint32_t) * kNum, cudaMemcpyHostToDevice);

    const uint32_t thread_num = 256;
    const dim3 block(thread_num);
    const dim3 grid((kNum + block.x - 1) / block.x);
    void *params[] = {&d_values};

    cudaLaunchCooperativeKernel((void *)kernel, grid, block, params);

    cudaMemcpy(h_values.data(), d_values, sizeof(uint32_t) * kNum, cudaMemcpyDeviceToHost);

    cudaFree(d_values);

    return 0;
}

$ nvcc -arch=sm_37 test.cu --std=c++11 -o test
test.cu(12): error: identifier "grid_group" is undefined
test.cu(12): error: identifier "this_grid" is undefined

havogt · Accepted Answer

grid_group and this_grid() are only supported on the Pascal architecture (compute capability 6.0) and later. You can check this by compiling for, e.g., sm_60: the code then compiles, but of course the resulting executable won't run on your K80. In addition, you need to enable relocatable device code (-rdc=true).
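As a rough sketch of how this could look in practice, the program below checks at runtime whether the device reports cooperative-launch support before it relies on grid_group. The kernel body, the helper names supports_cooperative_launch and max_cooperative_blocks, and the occupancy-based grid sizing are assumptions added for illustration and are not part of the original question. The build flags in the first comment are the ones suggested in this answer.

```cuda
// Build with the flags suggested in the answer (needs a Pascal-or-later target):
//   nvcc -arch=sm_60 -rdc=true --std=c++11 test.cu -o test
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Illustrative kernel (not from the question): increment every element,
// then wait until all threads in the grid have reached the barrier.
__global__ void kernel(unsigned int *values, unsigned int n)
{
    cg::grid_group g = cg::this_grid();
    // Grid-stride loop: the cooperative grid is smaller than n.
    for (unsigned long long i = g.thread_rank(); i < n; i += g.size())
        values[i] += 1;
    g.sync();  // grid-wide barrier
}

// Hypothetical helper: true if the device reports cooperative-launch support.
bool supports_cooperative_launch(int device)
{
    int supported = 0;
    cudaError_t err = cudaDeviceGetAttribute(
        &supported, cudaDevAttrCooperativeLaunch, device);
    return err == cudaSuccess && supported != 0;
}

// Hypothetical helper: largest grid whose blocks can all be resident at once.
int max_cooperative_blocks(int device, int threads)
{
    int blocks_per_sm = 0, sm_count = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocks_per_sm, kernel, threads, 0);
    cudaDeviceGetAttribute(&sm_count, cudaDevAttrMultiProcessorCount, device);
    return blocks_per_sm * sm_count;
}

int main()
{
    const int device = 0;
    if (!supports_cooperative_launch(device)) {
        std::printf("device %d does not support cooperative launch\n", device);
        return 1;
    }

    unsigned int n = 1u << 24;
    unsigned int *d_values = nullptr;
    cudaMalloc(&d_values, sizeof(unsigned int) * n);
    cudaMemset(d_values, 0, sizeof(unsigned int) * n);

    const int threads = 256;
    const dim3 block(threads);
    const dim3 grid(max_cooperative_blocks(device, threads));
    void *params[] = {&d_values, &n};

    cudaError_t err = cudaLaunchCooperativeKernel((void *)kernel, grid, block, params);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    std::printf("cooperative launch: %s\n", cudaGetErrorString(err));

    cudaFree(d_values);
    return err == cudaSuccess ? 0 : 1;
}
```

The grid is sized from the occupancy query because a cooperative launch requires all blocks to be resident on the device at the same time. A grid of kNum / 256 blocks, as in the question, would most likely be rejected as too large for that reason, even on a Pascal GPU.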

Unfortunately, the Programming Guide is not very clear about this; I couldn't find the information there. However, it is mentioned in some posts on devblogs.nvidia.com:

From https://devblogs.nvidia.com/cuda-9-features-revealed/

While Cooperative Groups works on all GPU architectures, certain functionality is inevitably architecture-dependent as GPU capabilities have evolved. Basic functionality, such as synchronizing groups smaller than a thread block down to warp granularity, is supported on all architectures, while Pascal and Volta GPUs enable new grid-wide and multi-GPU synchronizing groups.

Or at the very end of https://devblogs.nvidia.com/cooperative-groups/

New features in Pascal and Volta GPUs help Cooperative Groups go farther, by enabling creation and synchronization of thread groups that span an entire kernel launch running on one or even multiple GPUs.
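To illustrate the "basic functionality" that the first quote says is supported on all architectures, here is a minimal sketch that uses only thread_block and a warp-sized tile, so it should also build for the K80 (sm_37) without -rdc=true. The kernel tile_sum, the host helper sum_of_ones, and the file name tiles.cu are hypothetical names chosen for this example.

```cuda
// Should build for Kepler as well, e.g.:
//   nvcc -arch=sm_37 --std=c++11 tiles.cu -o tiles
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Each 32-thread tile sums its inputs; the tile's rank-0 thread stores the sum.
__global__ void tile_sum(const int *in, int *out)
{
    cg::thread_block block = cg::this_thread_block();
    cg::thread_block_tile<32> tile = cg::tiled_partition<32>(block);

    const unsigned int gid = blockIdx.x * blockDim.x + block.thread_rank();
    int v = in[gid];
    // Shuffle-based tree reduction inside the tile.
    for (int offset = tile.size() / 2; offset > 0; offset /= 2)
        v += tile.shfl_down(v, offset);

    if (tile.thread_rank() == 0)
        out[gid / 32] = v;
}

// Hypothetical host helper: runs tile_sum over n ones (n a multiple of 256)
// and returns the total of the per-tile sums, or -1 on a CUDA error.
long long sum_of_ones(int n)
{
    std::vector<int> h_in(n, 1), h_out(n / 32, 0);
    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(int) * n);
    cudaMalloc(&d_out, sizeof(int) * (n / 32));
    cudaMemcpy(d_in, h_in.data(), sizeof(int) * n, cudaMemcpyHostToDevice);

    tile_sum<<<n / 256, 256>>>(d_in, d_out);
    cudaError_t err = cudaGetLastError();  // catches launch failures
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out.data(), d_out, sizeof(int) * (n / 32),
                         cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    long long total = 0;
    for (int s : h_out) total += s;
    return total;
}

int main()
{
    std::printf("total = %lld\n", sum_of_ones(1024));
    return 0;
}
```

Since every input is 1, each tile should produce 32 and the total should equal n. Unlike grid_group, nothing here needs a cooperative launch, so an ordinary <<<...>>> launch is enough.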
