Since Nov. 2016 it has been possible to compile CUDA code which references Eigen 3.3 - see this answer.
That answer is not quite what I'm looking for, and it may now be "outdated" in the sense that there might be an easier way, since the following is written in the docs:
Starting from Eigen 3.3, it is now possible to use Eigen's objects and algorithms within CUDA kernels. However, only a subset of features are supported to make sure that no dynamic allocation is triggered within a CUDA kernel.
See also here. Unfortunately, I was not able to find any example of what this might look like.
Is it now possible to write a kernel such as the following, which should simply calculate a bunch of dot products?
__global__ void cu_dot(Eigen::Vector3d *v1, Eigen::Vector3d *v2, double *out, size_t N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if(idx < N)
    {
        out[idx] = v1[idx].dot(v2[idx]);
    }
}
I can compile this, but it does not seem to work: when I try to copy the data back to the host, I get an illegal memory access error. Note that I originally store the Vector3d's in a std::vector<Eigen::Vector3d> and then use
cudaMalloc((void **)&p_device_v1, sizeof(Eigen::Vector3d)*n);
cudaMemcpy(p_device_v1, v1.data(), sizeof(Eigen::Vector3d)*n, cudaMemcpyHostToDevice);
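For context, the full host-side flow I have in mind looks roughly like this. This is only a sketch under my assumptions: the names p_device_v1, p_device_v2, p_device_out and n are placeholders matching the snippet above, error checking is omitted, and it relies on Eigen::Vector3d being a fixed-size type with no dynamic allocation so that a raw cudaMemcpy of the std::vector's storage is valid:

```
// Hypothetical host-side setup, assuming v1 and v2 are
// std::vector<Eigen::Vector3d> of length n.
Eigen::Vector3d *p_device_v1, *p_device_v2;
double *p_device_out;

// Allocate device memory for both inputs and the per-pair results.
cudaMalloc((void **)&p_device_v1,  sizeof(Eigen::Vector3d)*n);
cudaMalloc((void **)&p_device_v2,  sizeof(Eigen::Vector3d)*n);
cudaMalloc((void **)&p_device_out, sizeof(double)*n);

// Copy the host vectors to the device.
cudaMemcpy(p_device_v1, v1.data(), sizeof(Eigen::Vector3d)*n, cudaMemcpyHostToDevice);
cudaMemcpy(p_device_v2, v2.data(), sizeof(Eigen::Vector3d)*n, cudaMemcpyHostToDevice);

// Launch one thread per dot product.
cu_dot<<<(n+1023)/1024, 1024>>>(p_device_v1, p_device_v2, p_device_out, n);

// Copy the per-pair results back to the host.
std::vector<double> out(n);
cudaMemcpy(out.data(), p_device_out, sizeof(double)*n, cudaMemcpyDeviceToHost);

cudaFree(p_device_v1);
cudaFree(p_device_v2);
cudaFree(p_device_out);
```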
I have set up an MWE project using CMake at https://github.com/GPMueller/eigen-cuda
In the MWE project on github, you wrote:
double dot(std::vector<Eigen::Vector3d> v1, std::vector<Eigen::Vector3d> v2)
{
    ...
    // Dot product
    cu_dot<<<(n+1023)/1024, 1024>>>(v1.data(), v2.data(), dev_ret, n);
The v1.data() and v2.data() pointers are in the CPU memory. You need to use the pointers in the GPU memory, i.e.
// Dot product
cu_dot<<<(n+1023)/1024, 1024>>>(dev_v1, dev_v2, dev_ret, n);
The CPU and GPU results are not identical, but that is an issue with your code: you did not perform a reduction over the multiple dot products.