
PyCUDA: Pow within device code tries to use std::pow, fails

Question more or less says it all.

calling a host function("std::pow<int, int> ") from a __device__/__global__ function("_calc_psd") is not allowed

From my understanding, this should be using the CUDA pow function instead, but it isn't.

Bolster asked Apr 13 '11 22:04



1 Answer

The error is exactly what the compiler reported. You can't use host functions in device code, and that includes the whole host C++ std library. CUDA includes its own standard library, described in the programming guide, but you should use either pow or powf (taken from the C standard library, no C++ or namespaces). nvcc will overload the function with the correct CUDA device function and inline the resulting code. Something like the following will work:

#include <math.h>

__device__ float func(float x) {
    // powf is the single-precision C math function; nvcc resolves it to the device version
    return x * x * powf(x, 0.123456f);
}

EDIT: The bit I missed the first time is the template specifier reported in the error, std::pow<int, int>. Are you sure that you are passing either float or double arguments to pow? If you are passing integers, there is no overloaded function for them in the CUDA standard library, which is why it might be failing. If you need an integer pow function, you will have to roll your own (or cast the arguments, but pow is a rather expensive function and I am fairly certain some cascaded integer multiplication will be faster).
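As a rough illustration of "rolling your own", here is a minimal sketch of an integer power function built from cascaded multiplications (exponentiation by squaring). The name ipow and the choice of an unsigned exponent are assumptions made for this example, not part of CUDA or of the original kernel. The macro guard is only there so the same code also compiles with a plain host C++ compiler; it does not check for overflow.

```cpp
// Hypothetical helper, not part of the CUDA library: integer power by
// repeated squaring, callable from both host and device code.
#ifndef __CUDACC__
// Outside nvcc these qualifiers do not exist, so define them away.
#define __host__
#define __device__
#endif

__host__ __device__ inline int ipow(int base, unsigned int exp)
{
    int result = 1;
    while (exp != 0) {
        if (exp & 1u) {
            result *= base;      // this bit of the exponent is set
        }
        exp >>= 1;
        if (exp != 0) {
            base *= base;        // square only if more bits remain
        }
    }
    return result;               // no overflow checking is attempted
}
```

Inside a kernel such as _calc_psd, a call like ipow(n, 2) would then take the place of pow(n, 2) when both arguments are integers. For a small fixed exponent, writing n * n directly is simpler still.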

talonmies answered Nov 10 '22 00:11
