
Dynamically Allocating Memory on the GPU

Tags:

cuda

Is it possible to dynamically allocate memory in a GPU's global memory from inside a kernel?
I don't know in advance how big my answer will be, so I need a way to allocate memory for each part of the answer as it is produced. CUDA 4.0 also allows us to use the host's RAM directly; is that a good idea, or will it reduce the speed?

linda asked Mar 09 '11
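Regarding the "use the RAM" part of the question: CUDA 4.x lets kernels read and write host RAM through mapped (zero-copy) pinned memory. A minimal sketch, assuming a device whose `canMapHostMemory` property is true; note that every kernel access to such memory crosses the PCIe bus, so for data touched repeatedly it is usually much slower than global memory:

```cuda
#include <cstdio>

__global__ void fill(int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;   // each access goes over the PCIe bus
}

int main()
{
    const int n = 256;

    // Must be called before any CUDA context is created.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    int* h_buf;               // pinned host RAM, mapped into device space
    cudaHostAlloc(&h_buf, n * sizeof(int), cudaHostAllocMapped);

    int* d_buf;               // device-side alias of the same memory
    cudaHostGetDevicePointer(&d_buf, h_buf, 0);

    fill<<<(n + 127) / 128, 128>>>(d_buf, n);
    cudaDeviceSynchronize();

    printf("h_buf[42] = %d\n", h_buf[42]);  // host sees the result directly
    cudaFreeHost(h_buf);
    return 0;
}
```

Zero-copy works well for data the kernel touches once (streaming reads or writes); for anything reused, copying into device global memory first is normally faster.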



2 Answers

It is possible to use malloc inside a kernel. Check the following example, adapted from the NVIDIA CUDA Programming Guide:

#include <cstdio>
#include <cstdlib>

__global__ void mallocTest()
{
    // Device-side malloc draws from the device heap (size set below).
    char* ptr = (char*)malloc(123);
    printf("Thread %d got pointer: %p\n", threadIdx.x, ptr);
    free(ptr);
}

int main()
{
    // Set the device heap to 128 MB before the first kernel launch.
    // (cudaThreadSetLimit/cudaThreadSynchronize are the older, now
    // deprecated CUDA 4.0-era names for these two calls.)
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 128 * 1024 * 1024);
    mallocTest<<<1, 5>>>();
    cudaDeviceSynchronize();
    return 0;
}

This will output something like (the exact addresses will vary):
Thread 0 got pointer: 00057020 
Thread 1 got pointer: 0005708c 
Thread 2 got pointer: 000570f8 
Thread 3 got pointer: 00057164 
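One caveat worth adding (my own sketch, not from the guide): when the device heap limit is exceeded, device-side malloc returns NULL rather than failing loudly, so kernels should check the pointer before using it:

```cuda
#include <cstdio>

__global__ void safeMalloc(size_t bytes)
{
    // Device-side malloc returns NULL when the device heap is
    // exhausted, so the result must be checked before use.
    char* ptr = (char*)malloc(bytes);
    if (ptr == NULL) {
        printf("Thread %d: allocation of %llu bytes failed\n",
               threadIdx.x, (unsigned long long)bytes);
        return;
    }
    ptr[0] = 0;   // safe to use
    free(ptr);
}
```

Dereferencing a failed allocation in a kernel typically corrupts memory or kills the launch, so this check is cheap insurance.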
scatman answered Oct 14 '22


From CUDA 4.0 onward you can also use the C++ new and delete operators in device code instead of C's malloc and free.

kokosing answered Oct 14 '22
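To illustrate the answer above, here is a small sketch of device-side new and delete (my example, not from the CUDA guide; the Point struct is made up for illustration). They allocate from the same device heap as malloc/free, but additionally run constructors and destructors:

```cuda
#include <cstdio>

struct Point {
    int x, y;
    __device__ Point(int x_, int y_) : x(x_), y(y_) {}
};

__global__ void newDeleteTest()
{
    // Device-side new draws from the malloc heap and also runs
    // the constructor; like malloc, it can yield NULL on exhaustion.
    Point* p = new Point(threadIdx.x, blockIdx.x);
    if (p != NULL) {
        printf("Thread %d made Point(%d, %d)\n", threadIdx.x, p->x, p->y);
        delete p;   // runs the destructor, then frees the heap block
    }
}

int main()
{
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 8 * 1024 * 1024);
    newDeleteTest<<<1, 4>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

Array forms (new[] and delete[]) work the same way, and pointers allocated this way stay valid across kernel launches until freed.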