I have CUDA 4.0 installed, and a device with Compute Capability 2.0 (a GTX 460 card).
What is the difference between the 'cubin' and the 'ptx' file?
I think the cubin is native code for the GPU, so it is micro-architecture specific, and the ptx is an intermediate language that runs on Fermi devices (e.g. GeForce GTX 460) via JIT compilation. When I compile a .cu source file, I can choose between the ptx and cubin targets. If I want the cubin file, I choose code=sm_20. But if I want a ptx file, I use code=compute_20.
Is it correct?
A CUDA binary (also referred to as cubin) file is an ELF-formatted file which consists of CUDA executable code sections as well as other sections containing symbols, relocators, debug info, etc. By default, the CUDA compiler driver nvcc embeds cubin files into the host executable file.
NVCC is a compiler driver which works by invoking all the necessary tools and compilers like cudacc, g++, cl, etc. NVCC can output C code (CPU code) that must then be compiled with the rest of the application using another tool, or it can output PTX or object code directly.
nvcc is the compiler driver used to compile both .cu and .cpp files. For host code it uses whichever cl.exe (on Windows) or gcc (on Linux) executable it can find.
You have mixed up the options that select a compilation phase (-ptx and -cubin) with the options that control which devices to target (-code), so you should revisit the documentation.
NVCC is the NVIDIA compiler driver. The -ptx and -cubin options select specific phases of compilation. By default, without any phase-specific options, nvcc will attempt to produce an executable from the inputs. Most people use the -c option to make nvcc produce an object file, which is later linked into an executable by the default platform linker. The -ptx and -cubin options are only really useful if you are using the Driver API. For more information on the intermediate stages, check out the nvcc manual, which is installed when you install the CUDA Toolkit.
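As a rough sketch of how the phase options differ, the commands below compile the same source three ways. The file name kernel.cu and the output names are hypothetical, and sm_20 is chosen only to match the question's setup (recent toolkits no longer accept it). The script prints each command and runs it only if nvcc and the source file are actually present.

```shell
#!/bin/sh
# Hypothetical source file name, used only for illustration.
SRC=kernel.cu

# Each command selects a different compilation phase for the same source.
PTX_CMD="nvcc -ptx $SRC -o kernel.ptx"                     # stop after emitting PTX text
CUBIN_CMD="nvcc -cubin -arch=sm_20 $SRC -o kernel.cubin"   # device-only binary for one architecture
OBJ_CMD="nvcc -c $SRC -o kernel.o"                         # host object file with embedded device code

# Print every command; execute it only when nvcc and the source file exist.
for cmd in "$PTX_CMD" "$CUBIN_CMD" "$OBJ_CMD"; do
    echo "$cmd"
    if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
        $cmd
    fi
done
```

The first two outputs are the kind of file a Driver API program would load at run time, while the object file from -c is what a typical Runtime API build links into the executable.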
-ptx generates a plain-text PTX file. PTX is an intermediate assembly language for NVIDIA GPUs which has not yet been fully optimised and will later be assembled into device-specific code (different devices have different register counts, for example, so fully optimising at the PTX level would be wrong).
-cubin generates a device-specific binary (cubin) for a single target architecture. This is distinct from a fat binary, which may contain one or more device-specific binary images as well as (optionally) PTX.
The -code argument you refer to has a different purpose entirely. I'd encourage you to check out the nvcc documentation, which contains several examples. In general I would advise using the -gencode option instead, since it gives more control and lets you target multiple devices in one binary. As a quick example:
-gencode arch=compute_xx,code=\'compute_xx,sm_yy,sm_zz\'
causes nvcc to target all devices with compute capability xx (that's the arch= bit) and to embed PTX (code=compute_xx) as well as device-specific binaries for sm_yy and sm_zz into the final fat binary.
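To make the placeholders concrete, here is a minimal sketch that substitutes compute_20 for compute_xx and sm_20 and sm_21 for sm_yy and sm_zz. These values and the file name kernel.cu are assumptions chosen for illustration. Instead of one quoted list, it repeats -gencode once per code value, which produces the same set of embedded images and sidesteps shell quoting. The command is only printed, not executed.

```shell
#!/bin/sh
# Illustrative values only: compute_20 stands in for compute_xx,
# sm_20 and sm_21 stand in for sm_yy and sm_zz.
VIRTUAL_ARCH=compute_20

# Build one -gencode clause per embedded image.
# The clause with code=compute_20 embeds PTX; the sm_* clauses embed device binaries.
GENCODE_FLAGS=""
for target in "$VIRTUAL_ARCH" sm_20 sm_21; do
    GENCODE_FLAGS="$GENCODE_FLAGS -gencode arch=$VIRTUAL_ARCH,code=$target"
done
GENCODE_FLAGS="${GENCODE_FLAGS# }"   # drop the leading space

# kernel.cu is a hypothetical file name; print the command rather than run it.
echo "nvcc -c kernel.cu -o kernel.o $GENCODE_FLAGS"
```

Embedding the PTX alongside the device binaries lets the driver JIT-compile the kernel for devices that have no matching binary in the file.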