 

Efficiently transfer large file (up to 2GB) to CUDA GPU?

I'm working on a GPU-accelerated program that needs to read an entire file of variable size. My question: what is the optimal number of bytes to read from the file and transfer to the coprocessor (a CUDA device) at a time?

These files could be as large as 2GiB, so creating a buffer of that size doesn't seem like the best idea.

sj755 asked Mar 16 '12 03:03



1 Answer

Use cudaMalloc to allocate the largest buffer you can on your device. Then copy the input data from host to device in chunks of at most that size: copy a chunk, process it, copy back the results, and continue with the next chunk.

// Your input data on host
int hostBufNum = 5600000;
int* hostBuf   = ...;

// Assume this is the largest device buffer you can allocate
int devBufNum = 1000000;
int* devBuf;

cudaMalloc( &devBuf, sizeof( int ) * devBufNum );

int* hostChunk  = hostBuf;
int hostLeft    = hostBufNum;

do
{
    // Recompute the chunk size on every pass: the last chunk may be
    // smaller than the device buffer and must not read past hostBuf
    int chunkNum = ( hostLeft < devBufNum ) ? hostLeft : devBufNum;

    cudaMemcpy( devBuf, hostChunk, chunkNum * sizeof( int ) , cudaMemcpyHostToDevice);
    doSomethingKernel<<< >>>( devBuf, chunkNum );

    // Advance to the next chunk and update the remaining element count
    hostChunk   = hostChunk + chunkNum;
    hostLeft    = hostBufNum - ( hostChunk - hostBuf );
} while( hostLeft > 0 );

// Release the device buffer once all chunks are processed
cudaFree( devBuf );
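
The chunk arithmetic in the loop can be checked on the host without a GPU. Below is a minimal sketch in plain C++, assuming two hypothetical helpers, forEachChunk and chunkSizes, that are not part of the code above. The callback stands in for the cudaMemcpy and kernel launch. The point it illustrates is that the element count has to be recomputed on every pass, because the final chunk is usually smaller than the device buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration): walks `total`
// elements in chunks of at most `capacity` elements and calls
// process(offset, count) once per chunk. In a CUDA program, the callback
// is where the host-to-device copy and the kernel launch would go.
template <typename Process>
std::size_t forEachChunk(std::size_t total, std::size_t capacity, Process process)
{
    assert(capacity > 0);
    std::size_t chunks = 0;
    for (std::size_t offset = 0; offset < total; ) {
        // Never take more than what is left in the input.
        std::size_t count = std::min(capacity, total - offset);
        process(offset, count);
        offset += count;
        ++chunks;
    }
    return chunks;
}

// Hypothetical helper: returns the size of every chunk, in order.
std::vector<std::size_t> chunkSizes(std::size_t total, std::size_t capacity)
{
    std::vector<std::size_t> sizes;
    forEachChunk(total, capacity,
                 [&](std::size_t, std::size_t count) { sizes.push_back(count); });
    return sizes;
}
```

With the numbers used in this answer, chunkSizes(5600000, 1000000) yields five chunks of 1000000 elements followed by one chunk of 600000.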
Ashwin Nanjappa answered Oct 02 '22 05:10
