NVIDIA vs AMD: GPGPU performance

Tags: cuda, gpgpu, nvidia, opencl, ati

I'd like to hear from people with experience of coding for both. Myself, I only have experience with NVIDIA.

NVIDIA CUDA seems to be a lot more popular than the competition. (Just counting question tags on this forum, 'cuda' outnumbers 'opencl' 3:1, 'nvidia' outnumbers 'ati' 15:1, and there's no tag for 'ati-stream' at all.)

On the other hand, according to Wikipedia, ATI/AMD cards should have a lot more potential, especially per dollar. The fastest NVIDIA card on the market as of today, GeForce 580 ($500), is rated at 1.6 single-precision TFlops. AMD Radeon 6970 can be had for $370 and it is rated at 2.7 TFlops. The 580 has 512 execution units at 772 MHz. The 6970 has 1536 execution units at 880 MHz.
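For reference, here is a rough sketch of the arithmetic behind those rated figures. It assumes the usual convention of 2 flops per execution unit per cycle (one fused multiply-add), and it assumes the 580's execution units run at a shader clock of twice the 772 MHz core clock quoted above; neither assumption comes from the Wikipedia numbers themselves. The function name is just illustrative.

```python
def peak_gflops(units: int, clock_mhz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical single-precision peak in GFLOPS.

    Assumes every unit retires `flops_per_cycle` flops on every cycle,
    which real code almost never sustains.
    """
    return units * clock_mhz * flops_per_cycle / 1000.0


# GTX 580: 512 units; ASSUMPTION: shader clock is 2 x the 772 MHz core clock.
gtx580 = peak_gflops(512, 2 * 772)
# HD 6970: 1536 units at 880 MHz.
hd6970 = peak_gflops(1536, 880)

for name, gflops, price_usd in [("GTX 580", gtx580, 500), ("HD 6970", hd6970, 370)]:
    print(f"{name}: {gflops / 1000:.2f} TFLOPS, {gflops / price_usd:.1f} GFLOPS per dollar")
```

Under those assumptions the results land on the rated 1.6 and 2.7 TFlops after rounding, so the paper gap is purely a units-times-clock figure and says nothing about how much of it real code can reach.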

How realistic is that paper advantage of AMD over NVIDIA, and is it likely to be realized in most GPGPU tasks? What happens with integer tasks?


asked Jan 09 '11 08:01

Eugene Smith

2 Answers

Metaphorically speaking, ATI has a good engine compared to NVIDIA. But NVIDIA has a better car :D

This is mostly because NVIDIA has invested a good amount of its resources (in money and people) to develop important libraries required for scientific computing (BLAS, FFT), and then did a good job again in promoting them. This may be the reason CUDA dominates the tags over here compared to ATI (or OpenCL).

As for the advantage being realized in GPGPU tasks in general, that ends up depending on other issues (which vary by application), such as memory transfer bandwidth, a good compiler, and probably even the driver. NVIDIA having a more mature compiler and a more stable driver on Linux (Linux because its use is widespread in scientific computing) tilts the balance in favor of CUDA (at least for now).

EDIT Jan 12, 2013

It's been two years since I made this post and it still seems to attract views sometimes, so I have decided to clarify a few things:

AMD has stepped up their game. They now have both BLAS and FFT libraries. Numerous third party libraries are also cropping up around OpenCL.
Intel has introduced Xeon Phi into the wild, supporting both OpenMP and OpenCL. It also has the ability to use existing x86 code (as noted in the comments, limited x86 without SSE for now).
NVIDIA and CUDA still have the edge in the range of libraries available. However they may not be focusing on OpenCL as much as they did before.

In short, OpenCL has closed the gap in the past two years. There are new players in the field. But CUDA is still a bit ahead of the pack.


answered Oct 16 '22 02:10

Pavan Yalamanchili

I don't have any strong feelings about CUDA vs. OpenCL; presumably OpenCL is the long-term future, just by dint of being an open standard.

But current-day NVIDIA vs ATI cards for GPGPU (not graphics performance, but GPGPU), that I do have a strong opinion about. And to lead into that, I'll point out that on the current Top 500 list of big clusters, NVIDIA leads AMD 4 systems to 1, and on gpgpu.org, search results (papers, links to online resources, etc) for NVIDIA outnumber results for AMD 6:1.

A huge part of this difference is the amount of online information available. Check out the NVIDIA CUDA Zone versus AMD's GPGPU Developer Central. The amount of stuff there for developers starting up doesn't even come close to comparing. On NVIDIA's site you'll find tonnes of papers - and contributed code - from people probably working on problems like yours. You'll find tonnes of online classes, from NVIDIA and elsewhere, and very useful documents like the developers' best practice guide, etc. The availability of free devel tools - the profiler, cuda-gdb, etc - overwhelmingly tilts NVIDIA's way.

(Editor: the information in this paragraph is no longer accurate.) And some of the difference is also hardware. AMD's cards have better specs in terms of peak flops, but to get a significant fraction of that, you not only have to break your problem up onto many completely independent stream processors; each work item also needs to be vectorized. Given that GPGPUing one's code is hard enough, that extra architectural complexity is enough to make or break some projects.

And the result of all of this is that the NVIDIA user community continues to grow. Of the three or four groups I know thinking of building GPU clusters, none of them are seriously considering AMD cards. And that will mean still more groups writing papers, contributing code, etc on the NVIDIA side.

I'm not an NVIDIA shill; I wish it weren't this way, and that there were two (or more!) equally compelling GPGPU platforms. Competition is good. Maybe AMD will step up its game very soon - and the upcoming Fusion products look very compelling. But in giving someone advice about which cards to buy today, and where to spend their time putting effort in right now, I can't in good conscience say that both development environments are equally good.

Edited to add: I guess the above is a little elliptical in terms of answering the original question, so let me make it a bit more explicit. The performance you can get from a piece of hardware is, in an ideal world with infinite time available, dependent only on the underlying hardware and the capabilities of the programming language; but in reality, the amount of performance you can get in a fixed amount of time invested is also strongly dependent on devel tools and existing community code bases (e.g., publicly available libraries). Those considerations all point strongly to NVIDIA.

(Editor: the information in this paragraph is no longer accurate.) In terms of hardware, the requirement for vectorization within SIMD units in the AMD cards also makes achieving paper performance even harder than with NVIDIA hardware.
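To put a rough number on that vectorization penalty for the cards in the question, here is a back-of-the-envelope sketch. It assumes the Radeon 6970 groups its 1536 execution units into 4-wide bundles that the compiler has to fill from a single work item, and that throughput scales linearly with the number of slots filled; both are simplifying assumptions of mine, not measurements, and memory bandwidth is ignored. The peak figures also assume 2 flops per unit per cycle and a 1544 MHz shader clock for the 580. The function name is hypothetical.

```python
def effective_gflops(peak_gflops: float, slots_filled: float, slots_per_unit: int = 4) -> float:
    """Scale a paper peak by the fraction of vector slots that do useful work.

    ASSUMPTION: throughput is linear in slot occupancy; memory bandwidth,
    latency and occupancy effects are all ignored.
    """
    if not 0 <= slots_filled <= slots_per_unit:
        raise ValueError("slots_filled must be between 0 and slots_per_unit")
    return peak_gflops * slots_filled / slots_per_unit


# Paper peaks in GFLOPS: units * clock (MHz) * 2 flops per cycle / 1000.
HD6970_PEAK = 1536 * 880 * 2 / 1000.0
GTX580_PEAK = 512 * 1544 * 2 / 1000.0  # ASSUMPTION: 1544 MHz shader clock

for filled in (1, 2, 3, 4):
    print(f"{filled}/4 slots filled: {effective_gflops(HD6970_PEAK, filled):.0f} GFLOPS")

# Average slots per bundle the 6970 needs just to match the 580's paper peak.
break_even = 4 * GTX580_PEAK / HD6970_PEAK
print(f"break-even: {break_even:.2f} of 4 slots")
```

In this simple model, purely scalar work items leave the 6970 at a quarter of its paper peak, well below the 580's, and the AMD card only pulls ahead once more than about 2.3 of every 4 slots are doing useful work.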

answered Oct 16 '22 03:10

Jonathan Dursi
