
Miscellaneous and Inter-Thread Communication Instructions in CUDA

I've been playing around with the NVIDIA profiler (nvprof) and there are two particular metrics which I do not understand:

inst_inter_thread_communication
    Number of inter-thread communication instructions executed by non-predicated threads
inst_misc
    Number of miscellaneous instructions executed by non-predicated threads

Which instructions count as inter-thread communication instructions, and which fall under miscellaneous?

Reference: http://docs.nvidia.com/cuda/profiler-users-guide/#metrics-reference

squirem asked Sep 04 '14 16:09


1 Answer

The SASS instructions that fall into the two categories are as follows:

inst_inter_thread_communication

  • SHFL
  • VOTE
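
To make this category concrete, here is a minimal sketch of a kernel that should increment inst_inter_thread_communication. It assumes CUDA 9 or later (for the `_sync` intrinsics) and a single full warp; `__shfl_down_sync` and `__ballot_sync` are expected to lower to SHFL and VOTE respectively, but the exact SASS depends on the target architecture and compiler version. The names `warp_demo` and `run_warp_demo` are made up for illustration.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// One warp: sum the lane ids with shuffles and collect a predicate with a ballot.
__global__ void warp_demo(int *sum_out, unsigned *mask_out)
{
    int lane = threadIdx.x & 31;
    int value = lane;

    // Tree reduction: each step reads a register from another lane (SHFL candidate).
    for (int offset = 16; offset > 0; offset >>= 1)
        value += __shfl_down_sync(0xffffffffu, value, offset);

    // Every lane contributes one predicate bit to a 32-bit mask (VOTE candidate).
    unsigned odd_mask = __ballot_sync(0xffffffffu, lane & 1);

    if (lane == 0) {
        *sum_out = value;
        *mask_out = odd_mask;
    }
}

// Hypothetical host helper: launches one warp and copies both results back.
// Returns false if any CUDA call fails.
bool run_warp_demo(int *sum, unsigned *mask)
{
    int *d_sum = nullptr;
    unsigned *d_mask = nullptr;
    if (cudaMalloc(&d_sum, sizeof(int)) != cudaSuccess)
        return false;
    if (cudaMalloc(&d_mask, sizeof(unsigned)) != cudaSuccess) {
        cudaFree(d_sum);
        return false;
    }

    warp_demo<<<1, 32>>>(d_sum, d_mask);

    // cudaMemcpy waits for the kernel and also reports launch failures.
    bool ok = cudaMemcpy(sum, d_sum, sizeof(int), cudaMemcpyDeviceToHost) == cudaSuccess
           && cudaMemcpy(mask, d_mask, sizeof(unsigned), cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(d_sum);
    cudaFree(d_mask);
    return ok;
}

int main()
{
    int sum = 0;
    unsigned mask = 0;
    bool ok = run_warp_demo(&sum, &mask);
    assert(ok && "a CUDA call failed");
    printf("lane-id sum = %d, odd-lane mask = 0x%08x\n", sum, mask);
    return 0;
}
```

Profiling the resulting binary with `nvprof --metrics inst_inter_thread_communication` and dumping it with `cuobjdump -sass` is one way to check which instructions were actually emitted.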

inst_misc

  • NOP
  • S2R, B2R, R2B, P2R
  • LEPC
  • CSET[P], PSET[P]
  • MOV
  • SEL
  • PRMT
  • BAR, DEPBAR (Maxwell only)
  • Several infrequently used, undocumented instructions also increment this category.
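
As a rough illustration of where some of these come from, the sketch below uses source constructs that often map to instructions in the list above: reading `threadIdx`/`blockIdx` (S2R), `__byte_perm` (PRMT) and a conditional select (SEL). This is an assumption about typical code generation, not a guarantee; the compiler is free to pick other instructions, so the SASS dump is the only reliable check. The names `misc_demo` and `run_misc_demo` are hypothetical.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

__global__ void misc_demo(const unsigned *in, unsigned *out, int n)
{
    // Thread and block indices live in special registers (S2R candidates).
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n)
        return;

    unsigned x = in[i];

    // Byte permutation; selector 0x0123 reverses the four bytes (PRMT candidate).
    unsigned swapped = __byte_perm(x, 0u, 0x0123);

    // Odd threads store the swapped word, even threads the original (SEL candidate).
    out[i] = (i & 1) ? swapped : x;
}

// Hypothetical host helper: runs the kernel on two input words.
// Returns false if any CUDA call fails.
bool run_misc_demo(const unsigned in[2], unsigned out[2])
{
    const size_t bytes = 2 * sizeof(unsigned);
    unsigned *d_in = nullptr;
    unsigned *d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess)
        return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }

    bool ok = cudaMemcpy(d_in, in, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        misc_demo<<<1, 32>>>(d_in, d_out, 2);
        // cudaMemcpy waits for the kernel and also reports launch failures.
        ok = cudaMemcpy(out, d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return ok;
}

int main()
{
    const unsigned in[2] = {0x11223344u, 0x11223344u};
    unsigned out[2] = {0u, 0u};
    bool ok = run_misc_demo(in, out);
    assert(ok && "a CUDA call failed");
    printf("even thread: 0x%08x, odd thread: 0x%08x\n", out[0], out[1]);
    return 0;
}
```

Running `cuobjdump -sass` on the compiled binary shows whether S2R, PRMT and SEL actually appear for a given architecture, and `nvprof --metrics inst_misc` reports the resulting count.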

The Instruction Set Reference section of the CUDA Binary Utilities document contains a brief description of each SASS instruction. SASS and PTX instructions correspond nearly 1:1, so the PTX ISA manual is also worth reviewing.

Greg Smith answered Sep 20 '22 14:09