Is it fair to compare SSE/AVX units to GPU cores?

I have to give a presentation to people who have (almost) no idea how a GPU works. I think that saying a GPU has a thousand cores where a CPU has only four to eight is nonsense. But I want to give my audience a point of comparison.

After a few months working with NVIDIA's Kepler and AMD's GCN architectures, I'm tempted to compare a GPU "core" to a CPU's SIMD ALU (I don't know if Intel has a name for that). Is that fair? After all, at the assembly level, the two programming models have much in common (at least with GCN; take a look at p2-6 of the ISA manual).

This article states that a Haswell processor can do 32 single-precision operations per cycle, but I suppose pipelining or other mechanisms are involved in achieving that rate. In NVIDIA parlance, how many CUDA cores does this processor have? I would say 8 per CPU core for 32-bit operations, but this is just a guess based on the SIMD width.
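
To make that guess concrete, here is a rough sketch of the arithmetic behind the 32 operations per cycle figure. It assumes the commonly cited Haswell layout of two 256-bit FMA units per core and counts one fused multiply-add as two operations; neither assumption is stated in the article quoted above, and the function and parameter names are hypothetical.

```python
def simd_lanes(vector_bits: int, element_bits: int) -> int:
    """Number of elements that fit in one SIMD register."""
    return vector_bits // element_bits


def peak_flops_per_cycle(vector_bits: int, element_bits: int,
                         fma_units: int, flops_per_fma: int = 2) -> int:
    """Peak floating-point operations per cycle for one CPU core.

    Assumes every FMA unit issues one full-width fused multiply-add
    per cycle, which is a best case and not a sustained rate.
    """
    return simd_lanes(vector_bits, element_bits) * fma_units * flops_per_fma


# Assumed figures for one Haswell core: AVX2 registers are 256 bits wide,
# single precision is 32 bits, and there are two FMA units.
lanes = simd_lanes(256, 32)
fma_lanes = lanes * 2  # lanes across both assumed FMA units
flops = peak_flops_per_cycle(256, 32, fma_units=2)

print(f"lanes per register:   {lanes}")
print(f"FMA lanes per core:   {fma_lanes}")
print(f"peak flops per cycle: {flops}")
```

Under these assumptions the SIMD width gives 8 lanes per register, and counting one lane per FMA unit gives 16 per CPU core. Which of the two is closer to a "CUDA core" depends on how the term is defined, so treat either number as a loose analogy.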

Of course there are many other things to take into account when comparing CPU and GPU hardware, but that is not what I'm trying to do. I just have to explain how the thing works.

PS: Any pointers to CPU hardware documentation or CPU/GPU presentations are greatly appreciated!

EDIT: Thanks for your answers. Sadly, I had to choose only one of them. I marked Igor's answer because it sticks most closely to my initial question and gave me enough information to justify why this comparison shouldn't be taken too far, but CaptainObvious provided very good articles.

883

asked Jul 02 '13 13:07

Simon

1 Answer

I'd be very cautious about making this kind of comparison. After all, even in the GPU world the term "core" means very different things depending on the context: the new AMD GCN is quite different from the old VLIW4 design, which is itself quite different from a CUDA core.
Besides, you will bring your audience more puzzlement than understanding if you make just one small comparison with a CPU and leave it at that. If I were you I'd still go for a more detailed (it can still be quick) comparison.
For instance, someone used to CPUs and with little knowledge of GPUs might wonder how a GPU can have so many registers when they are so expensive (in the CPU world). An explanation is given at the end of this post, along with some more GPU vs. CPU comparisons.

This other article gives a nice comparison between the two kinds of processing units by explaining how GPUs work, how they evolved, and how they differ from CPUs. It addresses topics like data flow and memory hierarchy, but also what kinds of applications a GPU is useful for. After all, the power a GPU can deliver is accessible (efficiently) only for some types of problems.
Personally, if I had to make a presentation about GPUs and could make only one reference to CPUs, it would be this: presenting the problems a GPU can solve efficiently versus those a CPU handles better.
As a bonus, even though it's not directly related to your presentation, here is an article that puts GPGPU in perspective, showing that some of the speedups people claim are overstated (this is linked to my last point, btw :))

142

answered Oct 25 '22 00:10

CaptainObvious


Tags:

cuda

gpu

hardware

sse

opencl
