"register" keyword in CUDA

Tags:

cuda

I have a large program that uses all the registers I allocated per thread (64) and spills to local memory. I would like to be able to tell the compiler which variables should remain in registers at all cost, and which ones I don't really care about. Does the "register" C/C++ keyword work in nvcc? Is there a different mechanism perhaps?

Thanks!


asked Feb 18 '15 21:02

Slava P

1 Answer

You can use register in CUDA C/C++ if you want to. In any context, it is only a hint to the compiler. It may be ignored. There is no stated guarantee that it does anything at all.

I think these statements are pretty much true for most language implementations of register.

I also think it's quite likely that the compiler can do a better job than you can of deciding what should be in registers, and with what priority.

The typical CUDA C/C++ mechanisms for controlling register usage work at a higher level. They are:

the -maxrregcount compile switch
the launch bounds directive (__launch_bounds__).
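
As a rough illustration of the two mechanisms above, here is a minimal sketch. The kernel name scale, the bounds (256, 2), and the file name in the build comment are made up for the example and are not from the question. The sketch leaves out the register keyword on purpose: it was deprecated in C++11 and removed in C++17, so depending on the language standard your toolchain uses it may draw a warning or be rejected. The host code asks the runtime how many registers and how much local memory the compiled kernel uses, which is one way to see whether a change affected spilling.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Build sketch (file name is hypothetical):
//   nvcc -Xptxas -v -maxrregcount=64 example.cu
// -Xptxas -v prints per-kernel register and spill statistics at compile time;
// -maxrregcount caps registers per thread for the kernels in the file.

// Hypothetical kernel, used only for illustration.
// __launch_bounds__(256, 2) promises the compiler that this kernel is never
// launched with more than 256 threads per block and asks for at least
// 2 resident blocks per multiprocessor. The compiler uses that information
// when choosing a per-thread register budget for this kernel.
__global__ void __launch_bounds__(256, 2)
scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

int main()
{
    // Ask the runtime what the compiler actually produced for this kernel.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, scale);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaFuncGetAttributes: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // localSizeBytes is the per-thread local memory, which is where
    // spilled registers end up (along with other thread-local storage).
    std::printf("registers per thread:    %d\n", attr.numRegs);
    std::printf("local memory per thread: %zu bytes\n", attr.localSizeBytes);
    std::printf("max threads per block:   %d\n", attr.maxThreadsPerBlock);
    return 0;
}
```

Both mechanisms set a budget for the whole kernel rather than for individual variables, so they cannot pin one particular variable to a register. They only change how many registers the compiler has to work with.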

answered Oct 10 '22 22:10

Robert Crovella
