Using Theano with GPU on Ubuntu 14.04 on AWS g2

I'm having trouble getting Theano to use the GPU on my machine.

When I run:

/usr/local/lib/python2.7/dist-packages/theano/misc$ THEANO_FLAGS=floatX=float32,device=gpu python check_blas.py

I get:

WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)

I've also checked that the NVIDIA driver is installed with:

lspci -vnn | grep -i VGA -A 12

with result:

Kernel driver in use: nvidia

However, when I run nvidia-smi, the result is:

NVIDIA: could not open the device file /dev/nvidiactl (No such file or directory).
NVIDIA-SMI has failed because it couldn't communicate with NVIDIA driver. Make sure that latest NVIDIA driver is installed and running.

Also, /dev/nvidiactl doesn't exist. What's going on?

UPDATE: /nvidia-smi works with result:

+------------------------------------------------------+
| NVIDIA-SMI 4.304...   Driver Version: 304.116        |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GRID K520                | 0000:00:03.0     N/A |                  N/A |
| N/A   39C  N/A     N/A /  N/A |   0%   10MB / 4095MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+

After compiling the NVIDIA_CUDA-6.0_Samples and running deviceQuery, I get:

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

user3822367 asked Mar 19 '23 02:03

2 Answers

CUDA GPUs in a Linux system are not usable until certain "device files" have been properly established.

There is a note to this effect in the documentation.

In general, there are several ways these device files can be established:

  1. By running an X server.
  2. By initiating GPU activity as the root user (such as running nvidia-smi or any CUDA application).
  3. Via startup scripts (refer to the documentation linked above for an example).

If none of these steps is taken, the GPUs will not be functional for non-root users. Note that the files do not persist through reboots and must be re-established on each boot cycle through one of the 3 methods above. If you use method 2 and reboot, the GPUs will not be available until you use method 2 again.
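As a rough illustration, here is a minimal Python sketch for checking whether the device files are present before starting a GPU job. It is not taken from the NVIDIA documentation. It assumes the usual node names (/dev/nvidiactl for the control device and /dev/nvidia0, /dev/nvidia1, ... for the individual GPUs); the function name and its parameters are hypothetical.

```python
import os


def missing_nvidia_device_files(dev_dir="/dev", gpu_count=1):
    """Return the expected NVIDIA device files that are absent from dev_dir.

    Assumed naming: 'nvidiactl' is the control node and 'nvidia0',
    'nvidia1', ... are the per-GPU nodes.
    """
    expected = ["nvidiactl"] + ["nvidia%d" % i for i in range(gpu_count)]
    paths = [os.path.join(dev_dir, name) for name in expected]
    return [path for path in paths if not os.path.exists(path)]


if __name__ == "__main__":
    missing = missing_nvidia_device_files()
    if missing:
        print("Missing device files: " + ", ".join(missing))
    else:
        print("All expected NVIDIA device files are present.")
```

If the script reports missing files, one of the three methods above (for example, running nvidia-smi once as root) should create them.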

I suggest reading the Linux getting started guide (linked above) in its entirety if you are having trouble setting up a Linux system for CUDA GPU usage.

Robert Crovella answered Mar 24 '23 18:03


If you are using CUDA 7.5, make sure to follow the official instructions: CUDA 7.5 doesn't support the default g++ version. Install a supported version and make it the default.

sudo apt-get install g++-4.9

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10

sudo update-alternatives --install /usr/bin/cc cc /usr/bin/gcc 30
sudo update-alternatives --set cc /usr/bin/gcc

sudo update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++ 30
sudo update-alternatives --set c++ /usr/bin/g++
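To confirm that the switch took effect, a small Python sketch like the one below can compare the default g++ version against the newest one CUDA 7.5 accepts. It assumes that limit is 4.9, as the commands above imply; the function names are hypothetical, and `g++ -dumpversion` is used to read the version.

```python
import re
import subprocess


def parse_gcc_version(text):
    """Extract (major, minor) from a version string such as '4.9.3' or '5'."""
    match = re.match(r"\s*(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return int(match.group(1)), int(match.group(2) or 0)


def is_supported_by_cuda_7_5(version_text, max_supported=(4, 9)):
    """Return True if the version is no newer than max_supported.

    The (4, 9) default is an assumption based on the g++-4.9 install above.
    """
    return parse_gcc_version(version_text) <= max_supported


if __name__ == "__main__":
    try:
        version = subprocess.check_output(["g++", "-dumpversion"]).decode()
    except (OSError, subprocess.CalledProcessError):
        print("Could not run g++; is it installed?")
    else:
        verdict = "ok" if is_supported_by_cuda_7_5(version) else "too new"
        print("default g++ is %s: %s" % (version.strip(), verdict))
```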

If the Theano GPU test code fails with this error:

ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcublas.so.7.5: cannot open shared object file: No such file or directory
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: cuda unavilable)

Run ldconfig on the CUDA 7.5 library directory so the dynamic linker can find the shared objects:

sudo ldconfig /usr/local/cuda-7.5/lib64
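
One way to verify the result without rerunning Theano is to ask the dynamic linker for the library directly. This is only a sketch: the helper name is hypothetical, and it relies on Python's standard ctypes module, which raises OSError when a shared object cannot be found or loaded.

```python
import ctypes


def can_load_shared_library(name):
    """Return True if the dynamic linker can locate and load the library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the Theano error message above.
    lib = "libcublas.so.7.5"
    if can_load_shared_library(lib):
        print(lib + " can be loaded")
    else:
        print(lib + " cannot be loaded; check the ldconfig step")
```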
maroon912 answered Mar 24 '23 17:03