I want to measure time inside a kernel on the GPU. How can I measure it in NVIDIA CUDA? e.g.

    __global__ void kernelSample()
    {
        // some code here
        // get start time
        // some code here
        // get stop time
        // some code here
    }
You can use the Compute Visual Profiler, which will be great for your purpose. It measures the time of every CUDA function and tells you how many times you called it.
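If all you need is the total runtime of the kernel (rather than timing a region inside it), CUDA events are a common host-side approach. A minimal sketch, reusing the `kernelSample` name from the question; the launch configuration here is an arbitrary example:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void kernelSample()
{
    // some work here
}

int main()
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);        // record just before the launch
    kernelSample<<<4, 256>>>();
    cudaEventRecord(stop);         // record just after the launch
    cudaEventSynchronize(stop);    // wait until the kernel has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("kernel time: %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}
```

Unlike `clock()` inside the kernel, this measures the whole launch from the host, including all blocks.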
Figure 1 shows that a CUDA kernel is a function that gets executed on the GPU. The parallel portion of your application is executed K times in parallel by K different CUDA threads, as opposed to only once like a regular C/C++ function. Figure 1. The kernel is a function executed on the GPU.
There is a maximum number of CUDA instructions per kernel: 2 million before compute capability 2.0, and 512 million from 2.0 onward. OK, thank you.
In order to run a kernel on the CUDA threads, we need two things. First, in the main() function of the program, we call the function to be executed by each thread on the GPU. This invocation is called a kernel launch, and with it we need to provide the number of threads and their grouping.
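As a sketch, the launch configuration goes in triple angle brackets between the kernel name and its argument list; the block and grid sizes below are arbitrary examples, not values from the question:

```cuda
// Launch kernelSample with 4 blocks of 256 threads each.
// <<<numBlocks, threadsPerBlock>>> is the launch configuration.
kernelSample<<<4, 256>>>();

// Kernel launches are asynchronous with respect to the host,
// so wait for completion before using the results.
cudaDeviceSynchronize();
```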
You can do something like this:
    __global__ void kernelSample(int *runtime)
    {
        // compute a global thread index so each thread writes its own slot
        int tidx = blockIdx.x * blockDim.x + threadIdx.x;
        // ....
        clock_t start_time = clock();
        // some code here
        clock_t stop_time = clock();
        // ....
        runtime[tidx] = (int)(stop_time - start_time);
    }
Which gives the number of clock cycles between the two calls. Be a little careful, though: the timer will overflow after a couple of seconds, so you should be sure that the duration of the code between successive calls is quite short. You should also be aware that the compiler and assembler do perform instruction re-ordering, so you might want to check that the clock() calls don't wind up getting placed next to each other in the SASS output (use cuobjdump to check).
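To get the per-thread cycle counts back on the host, you can allocate the `runtime` array on the device, copy it back after the launch, and convert cycles to time with the device clock rate. A minimal sketch under those assumptions (the dummy loop in the kernel just gives `clock()` something to measure):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void kernelSample(int *runtime)
{
    int tidx = blockIdx.x * blockDim.x + threadIdx.x;
    clock_t start_time = clock();
    // some work to time (dummy loop as a placeholder)
    for (volatile int i = 0; i < 1000; ++i) { }
    clock_t stop_time = clock();
    runtime[tidx] = (int)(stop_time - start_time);
}

int main()
{
    const int numThreads = 256;
    int *d_runtime;
    cudaMalloc(&d_runtime, numThreads * sizeof(int));

    kernelSample<<<1, numThreads>>>(d_runtime);
    cudaDeviceSynchronize();

    int h_runtime[numThreads];
    cudaMemcpy(h_runtime, d_runtime, sizeof(h_runtime), cudaMemcpyDeviceToHost);

    // clockRate is reported in kHz, i.e. cycles per millisecond,
    // so cycles / clockRate gives milliseconds.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("thread 0: %d cycles (~%f ms)\n",
           h_runtime[0], h_runtime[0] / (float)prop.clockRate);

    cudaFree(d_runtime);
    return 0;
}
```

Note that each streaming multiprocessor has its own clock, so comparing counts between threads in different blocks is only meaningful if they ran on the same SM.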