Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to debug the "didn't have free ports" error with settings `hostNetwork: true` and `NET_BIND_SERVICE`

I need some help on debugging the error: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. Can someone please help?

I am trying to run a pod on Mac (first) using Docker Desktop flavor of Kubernetes, and the version is 2.1.0.1 (37199). I'd like to try using hostNetwork mode because of its efficiency and the number of ports that need to be opened (in thousands). With only hostNetwork: true set, there is no error but I also don't see the ports being opened on the host, nor the host network interface inside the container. Since I also needs to open port 443, I added the capability of NET_BIND_SERVICE and that is when it started throwing the error.

I've run lsof -i inside the container (ubuntu:18.04) and then sudo lsof -i on my Mac, and I saw no conflict. Then, I've also looked at /var/lib/log/containers/kube-apiserver-docker-desktop_kube-system_kube-apiserver-*.log and I saw no clue. Thanks!

Additional Info: I've run the following inside the container:

# ss -nltp
State  Recv-Q  Send-Q     Local Address:Port      Peer Address:Port
LISTEN 0       5                0.0.0.0:10024          0.0.0.0:*      users:(("pnnsvr",pid=1,fd=28))
LISTEN 0       5                0.0.0.0:2443           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=24))
LISTEN 0       5                0.0.0.0:10000          0.0.0.0:*      users:(("pnnsvr",pid=1,fd=27))
LISTEN 0       50               0.0.0.0:6800           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=14))
LISTEN 0       1                0.0.0.0:6802           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=13))
LISTEN 0       50               0.0.0.0:443            0.0.0.0:*      users:(("pnnsvr",pid=1,fd=15))

Then, I ran netstat on my Mac (the host) and searched for those ports and I can't find a collision. I'm happy to supply the output of netstat (767 lines) if needed.

Here is the yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pnnsvr
  labels:
    app: pnnsvr
    env: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pnnsvr
      env: dev
  template:
    metadata:
      labels:
        app: pnnsvr
        env: dev
    spec:
      hostNetwork: true
      containers:
      - name: pnnsvr
        image: dev-pnnsvr:0.92
        args: ["--root_ip=192.168.15.194"]
        # for using local images
        imagePullPolicy: Never
        ports:
        - name: https
          containerPort: 443
          hostPort: 443
        - name: cport6800tcp
          containerPort: 6800
          hostPort:  6800
          protocol: TCP
        - name: cport10000tcp
          containerPort: 10000
          hostPort: 10000
          protocol: TCP
        - name: cport10000udp
          containerPort: 10000
          hostPort: 10000
          protocol: UDP
        - name: cport10001udp
          containerPort: 10001
          hostPort: 10001
          protocol: UDP
        #test
        - name: cport23456udp
          containerPort: 23456
          hostPort: 23456
          protocol: UDP
        securityContext:
          capabilities:
            add:
              - SYS_NICE
              - NET_BIND_SERVICE
              - CAP_SYS_ADMIN
like image 434
skwokie Avatar asked Aug 22 '19 17:08

skwokie


2 Answers

I've accidentally resolved this and I've done it by bouncing the pod instead of using kubectl apply -f .... Soon after bouncing the pod, the new pod will become a go. My theory is that Kubernetes will bring up a new pod and get it all ready to go first before killing the old pod. Since the old pod still has the ports opened, the new pod will see those ports taken and thus the error: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports is triggered.

like image 158
skwokie Avatar answered Nov 09 '22 09:11

skwokie


I don't have possibility to setup this on docker for mac but it seems you should verify your ports in your Docker VM:

screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty

Please consider if you can change default node port range (--service-node-port-range portRange-Default: 30000-32767) Here you can find great post how to do it in docker for mac

Please remember that using the hostNetwork: true it is not good solution according to the best practices.

As per documentation:

Don’t specify a hostPort for a Pod unless it is absolutely necessary. When you bind a Pod to a hostPort, it limits the number of places the Pod can be scheduled, because each combination must be unique. If you don’t specify the hostIP and protocol explicitly, Kubernetes will use 0.0.0.0 as the default hostIP and TCP as the default protocol. Avoid using hostNetwork, for the same reasons as hostPort.

If you explicitly need to expose a Pod’s port on the node, consider using a NodePort Service before resorting to hostPort.

Configure a Security Context for a Pod or Container https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

To specify security settings for a Container, include the securityContext field in the Container manifest. The securityContext field is a SecurityContext object. Security settings that you specify for a Container apply only to the individual Container, and they override settings made at the Pod level when there is overlap. Container settings do not affect the Pod’s Volumes.

Please note in addition for securityContext for POD:

The runAsGroup field specifies the primary group ID of 3000 for all processes within any containers of the Pod. If this field is omitted, the primary group ID of the containers will be root(0)

please let me know if it helped.

like image 26
Mark Avatar answered Nov 09 '22 09:11

Mark