Docker supports user namespace remapping, so that the user namespace is completely separated from the host.
The current default behavior ensures that containers get their own user and group management, i.e. their own version of /etc/passwd and /etc/group, but container processes run under the identical UIDs on the host system. This means that if your container runs with UID 0 (root), it will also run as root on the host. By the same token, if your container has user "john" with UID 1001 installed and starts its main process as that user, on the host it will also run with UID 1001, which might belong to user "Will" and could also have admin rights.
To make user namespace isolation complete, one needs to enable remapping, which maps the UIDs in the container to different UIDs on the host. So, UID 0 in the container would be mapped to a "non-privileged" UID on the host.
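To make the remapping concrete, here is a minimal Python sketch of the arithmetic behind a UID map. It assumes the three-column "inside outside length" format that Linux exposes in /proc/<pid>/uid_map; the range starting at host UID 100000 is a hypothetical example, not a value taken from any particular system, and the helper names are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class IdMap(NamedTuple):
    container_start: int  # first ID inside the namespace
    host_start: int       # first ID on the host
    length: int           # number of consecutive IDs mapped


def parse_id_map(text: str) -> List[IdMap]:
    """Parse lines in the uid_map format: 'inside outside length'."""
    maps = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            maps.append(IdMap(*(int(f) for f in fields)))
    return maps


def to_host_id(container_id: int, maps: List[IdMap]) -> Optional[int]:
    """Return the host ID for a container ID, or None if it is unmapped."""
    for m in maps:
        if m.container_start <= container_id < m.container_start + m.length:
            return m.host_start + (container_id - m.container_start)
    return None


# Hypothetical remapped range: container IDs 0..65535 -> host IDs 100000..165535
remapped = parse_id_map("0 100000 65536")
# Identity mapping, i.e. the behavior without remapping
identity = parse_id_map("0 0 4294967295")

print(to_host_id(0, remapped))     # container root -> 100000
print(to_host_id(1001, remapped))  # container user "john" -> 101001
print(to_host_id(1001, identity))  # without remapping -> 1001
```

With the identity map, "john" inside the container and whoever owns UID 1001 on the host are the same user as far as the kernel is concerned; with the remapped range they are not.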
Is there any support in Kubernetes for this feature to be enabled on the underlying Container Runtime? Will it work out of the box without issues?
Docker makes use of kernel namespaces to provide the isolated workspace called the container. When you run a container, Docker creates a set of namespaces for that container. These namespaces provide a layer of isolation.
User namespace support was officially added in Docker 1.10. It allows the UIDs and GIDs used by a container's processes to be mapped to different UIDs and GIDs on the host system. This is a big improvement in Docker's security.
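As a rough illustration of how remapping is switched on for the Docker daemon, the sketch below renders a daemon configuration fragment with the userns-remap key set to "default". It only prints the JSON; where the file lives (commonly /etc/docker/daemon.json) and whether other keys already exist there are assumptions to check on your own host, and the daemon must be restarted for the setting to apply.

```python
import json

# Assumption: "default" asks Docker to create and use its own
# unprivileged user for the remapped ID range.
daemon_config = {"userns-remap": "default"}

rendered = json.dumps(daemon_config, indent=2)
print(rendered)
```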
Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources and another set of processes sees a different set of resources. Thus Docker uses namespaces to provide this isolation to the containers from the host.
Running Commands as a Different User in a Docker Container
To run a command as a different user inside your container, add the --user flag: docker exec --user guest container-name whoami
So, unlike in Docker, it's not supported in Kubernetes yet, as per this (as alluded to in the comments) and this.
However, if you are looking at isolating your workloads there are other alternatives (it's not the same, but the options are pretty good):
You can use Pod Security Policies and specifically you can use RunAsUser, together with AllowPrivilegeEscalation=false. Pod Security Policies can be tied to RBAC so you can restrict how users run their pods.
In other words, you can force your users to run pods only as 'youruser' and disable the privileged flag in the pod securityContext. You can also disable sudo in your container images.
Furthermore, you can drop Linux capabilities, specifically CAP_SETUID. For even more advanced isolation, you can use a seccomp profile, SELinux, or an AppArmor profile.
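The settings above can be sketched as a pod manifest. The Python below builds one as a dictionary and prints it as JSON, which kubectl accepts alongside YAML. The pod name, image, and UID are hypothetical placeholders; note that this sets the securityContext on a single pod, whereas a Pod Security Policy would enforce such values cluster-wide. Kubernetes names capabilities without the CAP_ prefix, so CAP_SETUID is written as SETUID.

```python
import json


def restricted_pod(name: str, image: str, uid: int) -> dict:
    """Build a Pod manifest whose container runs as a fixed non-root UID."""
    if uid == 0:
        raise ValueError("uid 0 is root; choose a non-zero UID")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "securityContext": {
                        "runAsUser": uid,                   # run as this UID
                        "privileged": False,                # no privileged mode
                        "allowPrivilegeEscalation": False,  # block setuid escalation
                        "capabilities": {"drop": ["SETUID"]},
                    },
                }
            ]
        },
    }


# Hypothetical values for illustration only
manifest = restricted_pod("demo", "example.invalid/demo:latest", 1001)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f, assuming the image actually works when run as the chosen UID.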
Other alternatives to run untrusted workloads (in alpha as of this writing):