 

Options for GPU computing in Julia

I am considering buying a GPU card to experiment with GPU computing in Julia. As I see it now there are basically two options: NVIDIA or AMD chipsets.

My question is: is there a recommended option for use with Julia? As I am new to GPU computing, my focus is more on ease of use than on performance, so I imagine the current Julia packages that serve as GPU interfaces will basically determine the answer.

I use a Windows 7-based system. Any help is appreciated.

asked by InkPen


1 Answer

A few points:

1) ArrayFire is a pretty easy-to-use GPU platform with a Julia interface (https://github.com/JuliaGPU/ArrayFire.jl), and it works with both NVIDIA and AMD GPUs.
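
For a sense of what that looks like, here is a minimal sketch based on ArrayFire.jl's documented usage (exact function names may differ between package versions):

    using ArrayFire

    # generate data directly on the GPU (Float32 is the GPU-friendly default)
    a = rand(AFArray{Float32}, 1000, 1000)
    b = randn(AFArray{Float32}, 1000, 1000)

    # ordinary Julia syntax; the work is dispatched to the GPU backend
    c = a * b
    s = sum(c)

    # move an existing CPU array to the GPU and back
    x  = rand(Float32, 1000)
    xg = AFArray(x)        # host -> device
    y  = Array(sort(xg))   # device -> host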

2) If you want things that go beyond what ArrayFire offers, there is generally more support for NVIDIA cards through CUDA C, the language proprietary to NVIDIA. You can see a list of GPU packages for Julia in the JuliaGPU organization on GitHub (https://github.com/JuliaGPU). As you'll see, many more of them target CUDA than OpenCL, the open alternative whose kernels run on either NVIDIA or AMD hardware. But know that if you go this route, you'll need to start writing your own kernels in C.

In my opinion, CUDA C has some convenient features that automatically handle certain aspects of distributing work across the cores efficiently. CUDA certainly seems to be the more prevalent choice in scientific computing.

But I don't think there's much that can't be done in OpenCL, and it's probably not too much harder to learn. Furthermore, OpenCL has the advantage of being applicable to a wide variety of high-performance platforms beyond GPUs (e.g. programming Intel's Xeon Phi).
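
To make the kernel-writing point concrete, here is a minimal vector-add sketch using OpenCL.jl, adapted from that package's example code; it should run on either an NVIDIA or an AMD device, though the exact API may shift between package versions:

    import OpenCL
    const cl = OpenCL

    # the kernel itself is written in OpenCL C
    const vadd_source = "
        __kernel void vadd(__global const float *a,
                           __global const float *b,
                           __global float *c) {
            int gid = get_global_id(0);
            c[gid] = a[gid] + b[gid];
        }"

    a = rand(Float32, 50_000)
    b = rand(Float32, 50_000)

    device, ctx, queue = cl.create_compute_context()

    # copy the inputs to the device and allocate space for the output
    a_buff = cl.Buffer(Float32, ctx, (:r, :copy), hostbuf=a)
    b_buff = cl.Buffer(Float32, ctx, (:r, :copy), hostbuf=b)
    c_buff = cl.Buffer(Float32, ctx, :w, length(a))

    # compile the kernel, then launch one work item per element
    prog = cl.Program(ctx, source=vadd_source) |> cl.build!
    k    = cl.Kernel(prog, "vadd")
    queue(k, size(a), nothing, a_buff, b_buff, c_buff)

    result = cl.read(queue, c_buff)
    @assert isapprox(result, a + b)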

3) Pay careful attention to whether you need single- or double-precision floating point operations. This makes a big difference when choosing a GPU from either manufacturer. For instance, NVIDIA has some GPUs specifically designed for double-precision work (mainly the Tesla line, but also the Titan Black). If you choose an NVIDIA GPU outside of these, you'll typically get about 1/32 of the single-precision performance when running in double precision. AMD chips tend to be a bit less specialized, performing more comparably between single and double precision. I presume there are use cases where NVIDIA cards will be the better value and others where AMD will be more cost-effective.
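
If you want a rough sense of where a particular card falls, here's a sketch of a crude check using ArrayFire.jl as above (the ratio you see will depend heavily on the card and the operation, and it requires a device with double-precision support):

    using ArrayFire

    n = 2_000
    a32 = rand(AFArray{Float32}, n, n)
    a64 = rand(AFArray{Float64}, n, n)

    # warm up once so setup/compilation isn't included in the timing
    Array(a32 * a32); Array(a64 * a64)

    # copying back with Array() forces the GPU work to actually finish
    t32 = @elapsed Array(a32 * a32)
    t64 = @elapsed Array(a64 * a64)
    println("double/single runtime ratio: ", t64 / t32)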

4) GPUs can get quite pricey (though there are often decent used options on eBay, etc.). Their joy is that they can do (certain) computations orders of magnitude faster than CPUs. But to get this advantage, you're often going to spend thousands of dollars at a minimum, particularly if you need to buy a new system to support a powerful GPU, since many basic consumer-grade computers just don't support them well. If at all possible, it will really be to your advantage to do some trial work first to figure out exactly what you will need. For instance, NVIDIA has a test-drive program you can apply to; I've never used it, so I can't say much one way or the other, and AMD probably has something similar. Alternatively, if you're affiliated with a company or research institution that has GPUs available, or if you have a friend who will let you ssh into their machine and try theirs out, that can be very helpful for figuring out what you need ahead of time.

5) When looking at different cards, you'll want to pay careful attention not only to how many flops per dollar they deliver (at your desired precision level), but also to things like how much GPU RAM you'll need and, potentially, how efficiently they support communication between multiple GPUs and between the GPU and the CPU. As far as I know, the gold standard for GPU-GPU and CPU-GPU communication is the new NVIDIA P100 card. That is extremely expensive, though, and right now only available as part of a $100k+ system (containing 8 of them) bought from NVIDIA. Towards the end of the year, the P100s should become available from other manufacturers. They can do incredible things in terms of transfer speed between CPU and GPU, but there's a hefty price to pay for that, and they won't justify the price if all you're looking for is simply the most flops per dollar.
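
On the RAM point, a quick back-of-envelope check against your own problem sizes goes a long way. For example (illustrative numbers only):

    # say you want C = A * B with three n-by-n Float64 matrices resident on the GPU at once
    n = 20_000
    bytes_needed = 3 * n^2 * sizeof(Float64)
    println(bytes_needed / 2^30, " GiB")   # ~8.9 GiB, so an 8 GB card would not fit it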

answered by Michael Ohlrogge