CUDA: illegal combination of memory qualifiers

Tags: c++, cuda

I have the following code:

main.cu:

#include "class.h"
int main () {}

class.h:

class Class {
    __global__
    void Function() {};
};

When I compile this code using the command nvcc -c main.cu -o main.o, I get the following warning and error:

class.h(3): warning: inline qualifier ignored for "global" function
class.h(3): error: illegal combination of memory qualifiers

I have a question about each of these messages. Why is an inline qualifier being "ignored" when I never wrote inline (presumably the in-class definition makes the member function implicitly inline), and why is the __global__ memory qualifier illegal in this context? I have read in the documentation that

E.2.10.2. Function Members
Static member functions cannot be __global__ functions.

However, my function is not a static member, as far as I know. Removing the __global__ line allows the code to compile, and so does moving the __global__ and void Function(); lines out of the class and into main.cu. If this actually ISN'T allowed, why does CUDA impose this limitation, and how can I work around it while still maintaining structured code?
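
For reference, here is a minimal sketch of the file-scope variant that does compile for me; the kernel body and the launch configuration are only illustrative:

// main.cu: kernel moved to file scope
__global__
void Function() {}

int main () {
    // launch the kernel with a trivial configuration
    Function<<<1, 1>>>();
    cudaDeviceSynchronize();
}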

To clarify, I know of no other way to write classes whose member functions can launch GPU kernels. It seems to me that kernels can only be created from functions at global (file) scope in main.cu. I am fairly new to CUDA programming, so I may just be missing some CUDA conventions that were unclear to me. If that is the case, please let me know so I can keep up with proper programming practice.

Asked Nov 12 '16 by Simon Ewing


1 Answer

My understanding is that you want to use CUDA kernels in an OOP fashion. If that is the case, the class structure below should work:

// myclass.h
class MyClass {
    public:
        void call_kernel( ... );
};

// myclass.cu
__global__
void my_kernel( ... ) {
    // do some work
}

void MyClass::call_kernel( ... ) {
    // prepare data for the kernel, e.g. allocating memory, copying from host to device, etc.

    // run kernel
    my_kernel <<< ... >>>( ... );

    // copy results from device to host, clean up, etc.
}
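
To make this concrete, here is one compilable fleshing-out of the sketch above. The kernel name, the float data, and the element-doubling operation are illustrative choices on my part, not part of the pattern itself:

// myclass.h
#pragma once

class MyClass {
    public:
        // copies host data to the device, runs the kernel, copies results back
        void call_kernel(float* host_data, int n);
};

// myclass.cu
#include "myclass.h"
#include <cuda_runtime.h>

// illustrative kernel: doubles each element in place
__global__
void my_kernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

void MyClass::call_kernel(float* host_data, int n) {
    // allocate device memory and copy input from host to device
    // (error checking omitted for brevity)
    float* dev_data = nullptr;
    cudaMalloc((void**)&dev_data, n * sizeof(float));
    cudaMemcpy(dev_data, host_data, n * sizeof(float), cudaMemcpyHostToDevice);

    // run kernel
    int block = 256;
    int grid = (n + block - 1) / block;
    my_kernel<<<grid, block>>>(dev_data, n);

    // copy results from device to host and clean up
    cudaMemcpy(host_data, dev_data, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev_data);
}

A caller in main.cu then only needs to #include "myclass.h" and call call_kernel(); it never sees any CUDA-specific syntax, which keeps the device code contained in the .cu file.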

Please note that if you have multiple classes containing kernel code, their source files should all use the .cu extension, and you should enable separate compilation (relocatable device code).
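
For example, with the two files above plus a main.cu, the build could look like this (a sketch; -dc compiles with relocatable device code, and nvcc performs the device link when it links the objects):

nvcc -dc main.cu myclass.cu        # produces main.o and myclass.o
nvcc main.o myclass.o -o app       # device link + host link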

Answered Sep 25 '22 by yhf8377