I'm working on a GPU-accelerated program that requires reading an entire file of variable size. My question: what is the optimal number of bytes to read from a file and transfer to a coprocessor (CUDA device)?
These files could be as large as 2GiB, so creating a buffer of that size doesn't seem like the best idea.
You can cudaMalloc a buffer of the largest size your device allows. After that, copy chunks of your input data of that size from host to device, process each chunk, copy back the results, and continue.
// Your input data on host
int hostBufNum = 5600000;
int* hostBuf = ...;

// Assume this is the largest device buffer you can allocate
int devBufNum = 1000000;
int* devBuf;
cudaMalloc( &devBuf, sizeof( int ) * devBufNum );

int* hostChunk = hostBuf;
int hostLeft = hostBufNum;

do
{
    // Recompute the chunk size each iteration so the final,
    // partial chunk doesn't read past the end of hostBuf
    int chunkNum = ( hostLeft < devBufNum ) ? hostLeft : devBufNum;

    cudaMemcpy( devBuf, hostChunk, chunkNum * sizeof( int ), cudaMemcpyHostToDevice );

    int threads = 256;
    int blocks = ( chunkNum + threads - 1 ) / threads;
    doSomethingKernel<<< blocks, threads >>>( devBuf, chunkNum );

    // Copy results back here before reusing devBuf, if needed

    hostChunk += chunkNum;
    hostLeft  -= chunkNum;
} while( hostLeft > 0 );
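Since the data comes from a file, you can apply the same idea one step earlier and read the file itself in device-buffer-sized chunks, so the full 2 GiB never has to sit in host memory at once. Here is a minimal sketch in plain C++; the function name `readInChunks` and the input path are hypothetical, and the CUDA calls are left as comments so the sketch stands alone:

```cpp
#include <cstdio>
#include <cstddef>
#include <vector>

// Read a file in fixed-size chunks; returns the total bytes consumed.
// In the real program each chunk would be copied to devBuf and processed
// by the kernel before the next chunk is read.
size_t readInChunks(const char* path, size_t chunkBytes)
{
    std::vector<char> hostChunk(chunkBytes);  // reusable host staging buffer

    FILE* f = std::fopen(path, "rb");
    if (!f) return 0;

    size_t total = 0;
    size_t got;
    while ((got = std::fread(hostChunk.data(), 1, chunkBytes, f)) > 0)
    {
        // cudaMemcpy(devBuf, hostChunk.data(), got, cudaMemcpyHostToDevice);
        // doSomethingKernel<<<blocks, threads>>>(devBuf, got / sizeof(int));
        total += got;
    }

    std::fclose(f);
    return total;
}
```

With this structure the chunk size is just `devBufNum * sizeof(int)` from the snippet above, and the last `fread` naturally returns a short count for the final partial chunk.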