I would like to set up an AMD Radeon GPU for deep learning on Ubuntu. The main libraries for my work are Keras and PyTorch. I followed the ROCm installation guideline here strictly, but it failed at the 3rd step, the command sudo apt install rocm-dkms. The error messages were as follows.
Setting up dkms (2.8.1-5ubuntu1) ...
Setting up hip-rocclr (4.0.20496.5685.40000-23) ...
Setting up rock-dkms (1:4.0-23) ...
Loading new amdgpu-4.0-23 DKMS files...
Building for 5.8.0-41-generic
Building for architecture x86_64
Building initial module for 5.8.0-41-generic
Error! Bad return status for module build on kernel: 5.8.0-41-generic (x86_64)
Consult /var/lib/dkms/amdgpu/4.0-23/build/make.log for more information.
dpkg: error processing package rock-dkms (--configure):
installed rock-dkms package post-installation script subprocess returned error
exit status 10
Setting up g++-9 (9.3.0-17ubuntu1~20.04) ...
Setting up g++ (4:9.3.0-1ubuntu2) ...
update-alternatives: using /usr/bin/g++ to provide /usr/bin/c++ (c++) in auto mode
Setting up build-essential (12.8ubuntu1.1) ...
dpkg: dependency problems prevent configuration of rocm-dkms:
rocm-dkms depends on rock-dkms; however:
Package rock-dkms is not configured yet.
dpkg: error processing package rocm-dkms (--configure):
dependency problems - leaving unconfigured
Setting up gcc-multilib (4:9.3.0-1ubuntu2) ...
No apport report written because the error message indicates its a followup error from a previous failure.
Setting up g++-9-multilib (9.3.0-17ubuntu1~20.04) ...
Setting up g++-multilib (4:9.3.0-1ubuntu2) ...
Processing triggers for sgml-base (1.29.1) ...
Setting up x11proto-dev (2019.2-1ubuntu1) ...
Setting up libxau-dev:amd64 (1:1.0.9-0ubuntu1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
Processing triggers for man-db (2.9.1-1) ...
Setting up libxdmcp-dev:amd64 (1:1.1.3-0ubuntu1) ...
Setting up x11proto-core-dev (2019.2-1ubuntu1) ...
Setting up libxcb1-dev:amd64 (1.14-2) ...
Setting up libx11-dev:amd64 (2:1.6.9-2ubuntu1.1) ...
Setting up libglx-dev:amd64 (1.3.2-1~ubuntu0.20.04.1) ...
Setting up libgl-dev:amd64 (1.3.2-1~ubuntu0.20.04.1) ...
Setting up mesa-common-dev:amd64 (20.2.6-0ubuntu0.20.04.1) ...
Setting up rocm-opencl-dev (3.6Beta-17-g875c1f8-rocm-rel-4.0-23) ...
Setting up rocm-clang-ocl (0.5.0.64-rocm-rel-4.0-23-50fb51a) ...
Setting up rocm-utils (4.0.0.40000-23) ...
Setting up rocm-dev (4.0.0.40000-23) ...
Processing triggers for libc-bin (2.31-0ubuntu9.2) ...
Errors were encountered while processing:
rock-dkms
rocm-dkms
E: Sub-process /usr/bin/dpkg returned an error code (1)
My kernel version is 5.8.0-41-generic. My graphics card is a Gigabyte Radeon RX 6900 XT and my CPU is an AMD Ryzen 9 3900XT. I tried several solutions suggested in previous posts, but none of them solved my problem. Do you have any suggestions for fixing this?
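The DKMS message points at make.log for the actual build failure, so the first step is usually to pull the first error lines out of that file. The following is a minimal sketch: the function name first_build_errors is hypothetical, the log path is the one printed in the output above, and the search pattern (gcc-style "error:" lines and make's "***" lines) is an assumption about what the log contains.

```shell
# Print the first few lines of a build log that look like compiler or make
# errors, prefixed with their line numbers.
# $1 = log file, $2 = maximum number of matches (default 5).
first_build_errors() {
    grep -n -m "${2:-5}" -E 'error:|\*\*\*' "$1"
}

# Path taken from the DKMS message in the output above.
log=/var/lib/dkms/amdgpu/4.0-23/build/make.log
if [ -r "$log" ]; then
    first_build_errors "$log" || echo "no error lines found in $log"
fi
```

The earliest match is normally the most useful one, since later errors tend to be follow-ups of the first.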
I've been having the same issue as well. The only way I found to fix it is to roll back to the 5.6.0-1042-oem kernel. The AMD drivers don't seem to support any kernel past this one.
Edit: This is also a way to get the amdgpu-pro drivers to install without a problem.
WARNING: I'm writing all this after the fact and I might have missed a step or something along the way. Please be very careful, especially when removing kernels and when working in your boot directory. If you're uncomfortable with the idea of wrecking your system, you can always set grub's default selection instead, which is a lot safer than removing an initramfs.
Here's how I got ROCm working:
sudo apt install linux-image-5.6.0-1042-oem linux-headers-5.6.0-1042-oem && reboot
Make sure you boot into the 5.6 kernel by accessing the Ubuntu advanced options in grub.
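If you would rather set grub's default selection than pick the entry by hand on every boot, you first need the exact menu titles. The following is a rough sketch that lists them from the generated config; grub_titles is a hypothetical helper name, and /boot/grub/grub.cfg is assumed to be where Ubuntu writes the config.

```shell
# List the titles of the menu entries and submenus in a GRUB config file.
# The title is the first single-quoted string on each menuentry/submenu line.
grub_titles() {
    awk -F"'" '/^[ \t]*(menuentry|submenu) / { print $2 }' "$1"
}

cfg=/boot/grub/grub.cfg   # assumed location of the generated config
if [ -r "$cfg" ]; then
    grub_titles "$cfg"
fi
```

With the titles in hand, a line such as GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.6.0-1042-oem" in /etc/default/grub, followed by sudo update-grub, should select the 5.6 kernel by default. The titles in that line are an assumption, so copy them exactly from the output on your own system.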
sudo apt remove linux-image-5.8.0-41-generic linux-headers-5.8.0-41-generic && sudo apt autoremove && reboot
Again, you'll have to reboot into 5.6 through the advanced options. (Hold the shift key after the BIOS finishes loading to get the Ubuntu Advanced Options menu.) Once you're back in, it's a good idea to mark your headers and image as held back, because a kernel update will most likely break ROCm.
sudo apt-mark hold linux-image-generic linux-headers-generic
Now we're going to try and flush out the 5.8 kernel. Start by flushing out the temporary files.
sudo rm -rv ${TMPDIR:-/var/tmp}/mkinitramfs-*
Now list all of the kernels installed.
dpkg -l | tail -n +6 | grep -E 'linux-image-[0-9]+'
And try to remove the 5.8 kernel. Do this for any kernel you have above the 5.6 one we installed.
sudo update-initramfs -d -k 5.8.0-41-generic
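To find every kernel above the 5.6 one without reading the list by eye, the versions can be compared with GNU sort's version ordering. This is only a sketch: kernels_newer_than is a hypothetical helper, and the dpkg-query call may also list packages that were removed but still have config files left, so check the result before deleting anything.

```shell
# Read kernel versions (one per line) on stdin and print those that sort
# after the reference version given as $1, using GNU sort's version order.
kernels_newer_than() {
    local ref="$1" v
    while IFS= read -r v; do
        [ -n "$v" ] || continue
        [ "$v" = "$ref" ] && continue
        # Under sort -V the larger of the two versions ends up last.
        if [ "$(printf '%s\n%s\n' "$ref" "$v" | sort -V | tail -n 1)" = "$v" ]; then
            printf '%s\n' "$v"
        fi
    done
}

# Feed it the kernel image packages known to dpkg, if dpkg is available.
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='${Package}\n' 'linux-image-[0-9]*' 2>/dev/null \
        | sed 's/^linux-image-//' \
        | kernels_newer_than 5.6.0-1042-oem
fi
```

Each version it prints is a candidate for the update-initramfs -d -k command above.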
The kernel image (vmlinuz), System.map, and config files are still present in the boot dir, so we need to clear those out to get grub working properly again.
cd /boot/
sudo rm vmlinuz-5.8.0-41-generic System.map-5.8.0-41-generic config-5.8.0-41-generic
Now you should finally be ready to update grub.
sudo update-grub && reboot
Now when you load back in, you should be able to install ROCm.
sudo apt install rocm-dkms
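Once the install goes through, it may be worth confirming that the amdgpu module was really built for the kernel you are running. The sketch below parses the output of dkms status; amdgpu_installed_for is a hypothetical helper, and the line format it expects (module name first, then version and kernel, ending in "installed") is an assumption that may differ between dkms versions.

```shell
# Read `dkms status` output on stdin and succeed if an amdgpu module is
# reported as installed for the kernel release given as $1.
amdgpu_installed_for() {
    grep -E '^amdgpu[,/]' | grep -F -- "$1" | grep -q 'installed'
}

if command -v dkms >/dev/null 2>&1; then
    if dkms status | amdgpu_installed_for "$(uname -r)"; then
        echo "amdgpu DKMS module is installed for $(uname -r)"
    else
        echo "amdgpu DKMS module is not installed for $(uname -r)"
    fi
fi
```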
As per the official notes in this link, the AMD ROCm platform is designed to support Ubuntu 20.04.1 (kernels 5.4 and 5.6-oem) and 18.04.5 (kernel 5.4).
So kernel version 5.8 is not supported. Downgrading is an option, but before rushing into that, you can simply boot into an older kernel version.
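A quick way to see whether the running kernel falls into one of those series is to compare the major.minor part of uname -r against the list. This is a minimal sketch: the function names are hypothetical, and the list of series is hard-coded from the support note above, so it would need updating for other ROCm releases.

```shell
# Kernel series taken from the support note above (5.4 and 5.6-oem).
SUPPORTED_SERIES="5.4 5.6"

# Print the major.minor series of a kernel release string,
# e.g. "5.8.0-41-generic" becomes "5.8".
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Return 0 if the release belongs to one of the listed series.
kernel_supported() {
    local series s
    series=$(kernel_series "$1")
    for s in $SUPPORTED_SERIES; do
        [ "$s" = "$series" ] && return 0
    done
    return 1
}

release=$(uname -r)
if kernel_supported "$release"; then
    echo "$release: series is in the supported list"
else
    echo "$release: series is not in the supported list"
fi
```

The check only looks at the series number, so it does not distinguish the oem flavour of 5.6 from other 5.6 builds.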
Try the following steps:
[Image: advanced options for Ubuntu]