The following simple program never exits if the cudaMalloc call is executed. Commenting out just the cudaMalloc causes it to exit normally.
#include <iostream>
using std::cout;
using std::cin;
#include "cuda.h"
#include "cutil_inline.h"

void PrintCudaVersion(int version, const char *name)
{
    int versionMaj = version / 1000;
    int versionMin = (version - (versionMaj * 1000)) / 10;
    cout << "CUDA " << name << " version: " << versionMaj << "." << versionMin << "\n";
}

void ReportCudaVersions()
{
    int version = 0;
    cudaDriverGetVersion(&version);
    PrintCudaVersion(version, "Driver");
    cudaRuntimeGetVersion(&version);
    PrintCudaVersion(version, "Runtime");
}

int main(int argc, char **argv)
{
    //CUresult r = cuInit(0);               << These two lines were in original post
    //cout << "Init result: " << r << "\n"; << but have no effect on the problem
    ReportCudaVersions();
    void *ptr = NULL;
    cudaError_t err = cudaSuccess;
    err = cudaMalloc(&ptr, 1024*1024);
    cout << "cudaMalloc returned: " << err << " ptr: " << ptr << "\n";
    err = cudaFree(ptr);
    cout << "cudaFree returned: " << err << "\n";
    return(0);
}
This is running on Windows 7 with the CUDA 4.1 driver and the CUDA 3.2 runtime. I've traced the return from main through the CRT to ExitProcess(), which never returns (as expected), but the process never ends either. From VS2008 I can stop debugging OK. From the command line, I must kill the console window.
Program output:
Init result: 0
CUDA Driver version: 4.1
CUDA Runtime version: 3.2
cudaMalloc returned: 0 ptr: 00210000
cudaFree returned: 0
I tried making the allocation amount so large that cudaMalloc would fail. It did and reported an error, but the program still would not exit. So it apparently has to do with merely calling cudaMalloc, not the existence of allocated memory.
Any ideas as to what is going on here?
EDIT: I was wrong in the second sentence. I have to eliminate both the cudaMalloc and the cudaFree calls to get the program to exit. Leaving either one in causes the hang.
EDIT: Although there are many references to the fact that CUDA driver versions are backward compatible, this problem went away when I reverted the driver to V3.2.
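Since the hang went away once the driver matched the runtime, a cheap diagnostic is to flag a driver/runtime mismatch at startup. The following is only a minimal sketch: the names CudaVersion, DecodeCudaVersion and VersionsDiffer are hypothetical helpers, not part of the CUDA toolkit, and the code relies only on the 1000 * major + 10 * minor encoding that PrintCudaVersion above already assumes. A mismatch is not an error in itself, so this is meant as a hint for debugging, not a check that should abort the program.

```cpp
#include <iostream>

// Hypothetical helper type, not part of CUDA: a decoded version number.
struct CudaVersion {
    int versionMaj;
    int versionMin;
};

// Decode the integer reported by cudaDriverGetVersion / cudaRuntimeGetVersion,
// assuming the encoding 1000 * major + 10 * minor (e.g. 4010 for 4.1).
CudaVersion DecodeCudaVersion(int version)
{
    CudaVersion v;
    v.versionMaj = version / 1000;
    v.versionMin = (version % 1000) / 10;
    return v;
}

// Hypothetical helper: true when driver and runtime report different
// major or minor versions.
bool VersionsDiffer(int driverVersion, int runtimeVersion)
{
    CudaVersion d = DecodeCudaVersion(driverVersion);
    CudaVersion r = DecodeCudaVersion(runtimeVersion);
    return d.versionMaj != r.versionMaj || d.versionMin != r.versionMin;
}

// Print a warning when the two versions differ; does nothing otherwise.
void WarnIfVersionsDiffer(int driverVersion, int runtimeVersion)
{
    if (VersionsDiffer(driverVersion, runtimeVersion)) {
        CudaVersion d = DecodeCudaVersion(driverVersion);
        CudaVersion r = DecodeCudaVersion(runtimeVersion);
        std::cout << "Warning: driver " << d.versionMaj << "." << d.versionMin
                  << " differs from runtime " << r.versionMaj << "." << r.versionMin
                  << "\n";
    }
}
```

In the program above, ReportCudaVersions could keep the two integers it gets from cudaDriverGetVersion and cudaRuntimeGetVersion and pass them to WarnIfVersionsDiffer before the first cudaMalloc.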
It seems like you're mixing the driver API (cuInit) with the runtime API (cudaMalloc).
I don't know if anything funny happens (or should happen) behind the scenes, but one thing you could try is to remove the cuInit call and see what happens.