
What is the difference between CUDA cores and Tensor cores?

Tags: cuda, gpu, nvidia

I am completely new to HPC terminology, but I just saw that AWS EC2 released a new instance type powered by the new Nvidia Tesla V100, which has both kinds of "cores": CUDA cores (5,120) and Tensor cores (640). What is the difference between the two?

asked Nov 16 '17 by Simon Ernesto Cardenas Zarate



2 Answers

Currently only the Tesla V100 and Titan V have Tensor cores. Both GPUs have 5120 CUDA cores, where each core can perform up to one single-precision multiply-accumulate operation (e.g. in fp32: x += y * z) per GPU clock (the Tesla V100 PCIe clock frequency is 1.38 GHz).
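As a rough illustration (my own sketch, not part of the original answer), the per-core operation described above is just an ordinary scalar fused multiply-accumulate issued by each thread of a plain CUDA kernel; the kernel and buffer names are placeholders:

```cuda
// Each thread (executed on a CUDA core) performs one fp32
// fused multiply-accumulate per element: x[i] += y[i] * z[i].
__global__ void fma_kernel(float *x, const float *y, const float *z, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        x[i] = fmaf(y[i], z[i], x[i]);  // single-precision multiply-accumulate
    }
}
```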

Each Tensor core operates on small 4x4 matrices. A Tensor core can perform one matrix multiply-accumulate operation per GPU clock: it multiplies two 4x4 fp16 matrices and adds the resulting fp32 product matrix (size 4x4) to an accumulator (which is also a 4x4 fp32 matrix).

This is called mixed precision because the input matrices are fp16, while the multiplication result and the accumulator are fp32 matrices.
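For illustration, here is a minimal sketch (my addition, not from the original answer) of how this mixed-precision operation is exposed to CUDA programmers through the warp-level WMMA API introduced with Volta. The API works on 16x16 tiles per warp, which the hardware internally decomposes into the 4x4 steps described above; the kernel name and pointers are placeholders, and it must be compiled for sm_70 or newer and launched with at least one full warp (32 threads):

```cuda
#include <mma.h>
#include <cuda_fp16.h>
using namespace nvcuda;

// Warp-level tile operation on Tensor cores: D = A(fp16) * B(fp16) + C(fp32)
__global__ void wmma_example(const half *a, const half *b, const float *c, float *d) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> c_frag;

    // Load one 16x16 tile of each operand (leading dimension 16).
    wmma::load_matrix_sync(a_frag, a, 16);
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::load_matrix_sync(c_frag, c, 16, wmma::mem_row_major);

    // The whole warp cooperatively issues the mixed-precision
    // matrix multiply-accumulate on its Tensor cores.
    wmma::mma_sync(c_frag, a_frag, b_frag, c_frag);

    // Write the fp32 result tile back to memory.
    wmma::store_matrix_sync(d, c_frag, 16, wmma::mem_row_major);
}
```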

Arguably, a more accurate name would be "4x4 matrix cores", but NVIDIA's marketing team decided on "Tensor cores".

answered Sep 19 '22 by Artur


GPUs have always been well suited to machine learning. GPU cores were originally designed for physics and graphics computation, which involves heavy matrix work. General-purpose computing does not require many matrix operations, so CPUs are much slower at them. Physics and graphics are also far easier to parallelise than general computing tasks, which is why GPUs have such high core counts.

Due to the matrix-heavy nature of machine learning (neural networks), GPUs were a great fit. Tensor cores are simply more heavily specialised for the types of computation involved in machine learning software (such as TensorFlow).

Nvidia has written a detailed blog post here, which goes into far more detail on how Tensor cores work and the performance improvements over CUDA cores.

answered Sep 20 '22 by MikeS159