MinIO tenant fails to connect to minio-operator and keeps crashing with timeouts

I am trying to create a MinIO tenant with the MinIO Operator.

Just to test, I'm using the simplest setup I can come up with: 1 server and 4 drives.

The tenant is named minio-tenant-1 and it runs in its own namespace, minio-tenant-1. I also disabled TLS, Prometheus monitoring, and audit logging, just in case.
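
For context, the Tenant spec is essentially the minimal single-pool layout below (trimmed, credentials omitted, and field names as of Operator v4.x, so yours may differ slightly):

apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: minio-tenant-1
  namespace: minio-tenant-1
spec:
  requestAutoCert: false  # TLS disabled
  configuration:
    name: minio-tenant-1-env-configuration
  pools:
  - name: pool-0
    servers: 1
    volumesPerServer: 4
    volumeClaimTemplate:
      metadata:
        name: data
      spec:
        storageClassName: local-storage
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 6Gi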

The pod keeps timing out with this error:

ERROR Unable to validate passed arguments in MINIO_ARGS:env://zUXN5Ti5QxmT4fPxwQFB:5Kt2M5omOqmN2HUB0zFAfblI7DYY5PF8OZ8kfvdC@operator.minio-operator.svc.cluster.local:4222/webhook/v1/getenv/minio-tenant-1/minio-tenant-1: Get "http://operator.minio-operator.svc.cluster.local:4222/webhook/v1/getenv/minio-tenant-1/minio-tenant-1?key=MINIO_ARGS": dial tcp: lookup operator.minio-operator.svc.cluster.local: i/o timeout

$ k describe pod/minio-tenant-1-pool-0-0 -n minio-tenant-1
Name:             minio-tenant-1-pool-0-0
Namespace:        minio-tenant-1
Priority:         0
Service Account:  default
Node:             master/192.168.0.248
Start Time:       Fri, 24 Mar 2023 18:00:03 +0000
Labels:           controller-revision-hash=minio-tenant-1-pool-0-58bb9b869b
                  statefulset.kubernetes.io/pod-name=minio-tenant-1-pool-0-0
                  v1.min.io/console=minio-tenant-1-console
                  v1.min.io/pool=pool-0
                  v1.min.io/tenant=minio-tenant-1
Annotations:      min.io/revision: 0
Status:           Running
IP:               10.244.0.29
IPs:
  IP:           10.244.0.29
Controlled By:  StatefulSet/minio-tenant-1-pool-0
Containers:
  minio:
    Container ID:  cri-o://47a0fc9343c38439c09b96c65c0938c0b438c4fca5beeffca8917254a1af6e83
    Image:         minio/minio:RELEASE.2023-03-22T06-36-24Z
    Image ID:      docker.io/minio/minio@sha256:02b9d0234025a31b1fb2a52697b62a18a4fbd0db03cbd83dfc09fc48773b718c
    Ports:         9000/TCP, 9090/TCP
    Host Ports:    0/TCP, 0/TCP
    Args:
      server
      --certs-dir
      /tmp/certs
      --console-address
      :9090
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 24 Mar 2023 19:39:23 +0000
      Finished:     Fri, 24 Mar 2023 19:39:27 +0000
    Ready:          False
    Restart Count:  24
    Requests:
      cpu:     4
      memory:  8Gi
    Environment:
      MINIO_ARGS:                    <set to the key 'MINIO_ARGS' in secret 'operator-webhook-secret'>  Optional: false
      MINIO_CONFIG_ENV_FILE:         /tmp/minio-config/config.env
      MINIO_OPERATOR_VERSION:        v4.5.8
      MINIO_PROMETHEUS_JOB_ID:       minio-job
      MINIO_SERVER_URL:              http://minio.minio-tenant-1.svc.cluster.local:80
      MINIO_UPDATE:                  on
      MINIO_UPDATE_MINISIGN_PUBKEY:  RWTx5Zr1tiHQLwG9keckT0c45M3AGeHD6IvimQHpyRywVWGbP1aVSGav
    Mounts:
      /export0 from data0 (rw)
      /export1 from data1 (rw)
      /export2 from data2 (rw)
      /export3 from data3 (rw)
      /tmp/minio-config from configuration (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j468h (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data0:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data0-minio-tenant-1-pool-0-0
    ReadOnly:   false
  data1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data1-minio-tenant-1-pool-0-0
    ReadOnly:   false
  data2:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data2-minio-tenant-1-pool-0-0
    ReadOnly:   false
  data3:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data3-minio-tenant-1-pool-0-0
    ReadOnly:   false
  configuration:
    Type:                Projected (a volume that contains injected data from multiple sources)
    SecretName:          minio-tenant-1-env-configuration
    SecretOptionalName:  <nil>
  kube-api-access-j468h:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  BackOff  2m52s (x456 over 102m)  kubelet  Back-off restarting failed container minio in pod minio-tenant-1-pool-0-0_minio-tenant-1(fbe54e33-bb94-4332-aa4b-ebb05277884e)

The node affinity is currently pinned to master, but I have other nodes I could spread the drives across. My understanding is that this shouldn't matter (though I might be wrong):

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - master

The PVs and PVCs seem to be in order as well:

$ k get pv
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                          STORAGECLASS    REASON   AGE
minio-pv-1   6Gi        RWO            Retain           Bound    minio-tenant-1/data2-minio-tenant-1-pool-0-0   local-storage            123m
minio-pv-2   6Gi        RWO            Retain           Bound    minio-tenant-1/data1-minio-tenant-1-pool-0-0   local-storage            123m
minio-pv-3   6Gi        RWO            Retain           Bound    minio-tenant-1/data0-minio-tenant-1-pool-0-0   local-storage            123m
minio-pv-4   6Gi        RWO            Retain           Bound    minio-tenant-1/data3-minio-tenant-1-pool-0-0   local-storage            123m
$ k get pvc
NAME                            STATUS   VOLUME       CAPACITY   ACCESS MODES   STORAGECLASS    AGE
data0-minio-tenant-1-pool-0-0   Bound    minio-pv-3   6Gi        RWO            local-storage   122m
data1-minio-tenant-1-pool-0-0   Bound    minio-pv-2   6Gi        RWO            local-storage   122m
data2-minio-tenant-1-pool-0-0   Bound    minio-pv-1   6Gi        RWO            local-storage   122m
data3-minio-tenant-1-pool-0-0   Bound    minio-pv-4   6Gi        RWO            local-storage   122m
$
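
Each PV is a plain local volume pinned to the master node, along these lines (the host path shown is illustrative, not the real mount point):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-pv-1
spec:
  capacity:
    storage: 6Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/minio/disk1  # illustrative; the real mount point differs
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - master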

DNS resolution from another pod in the same namespace works fine as well:

$ k exec -it nginx-56b884b9cf-mz49v -n minio-tenant-1 -- sh
# curl -k http://operator.minio-operator.svc.cluster.local:4222
404 page not found
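
For reference, the same check can be run from a throwaway pod (busybox here is just a convenient image that ships nslookup):

$ k run dnstest -n minio-tenant-1 --rm -it --image=busybox -- nslookup operator.minio-operator.svc.cluster.local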

The MinIO Operator seems alive, and I can access the console interface to create the tenant:

$ k -n minio-operator get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/console-56f9795d5c-48lxd          1/1     Running   0          16h
pod/minio-operator-86d96956d6-9z4ll   1/1     Running   0          150m
pod/minio-operator-86d96956d6-t6pdh   1/1     Running   0          151m

NAME               TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/console    ClusterIP   10.106.35.184    <none>        9090/TCP,9443/TCP   16h
service/operator   ClusterIP   10.100.234.145   <none>        4222/TCP,4221/TCP   150m

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/console          1/1     1            1           16h
deployment.apps/minio-operator   2/2     2            2           16h

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/console-56f9795d5c          1         1         1       16h
replicaset.apps/minio-operator-7cd6784f59   0         0         0       16h
replicaset.apps/minio-operator-86d96956d6   2         2         2       15h

The MinIO Operator logs:

$ k -n minio-operator logs -f pod/minio-operator-86d96956d6-t6pdh
I0324 19:54:05.667811       1 event.go:285] Event(v1.ObjectReference{Kind:"Tenant", Namespace:"minio-tenant-1", Name:"minio-tenant-1", UID:"0b1bfd35-1cf9-4328-b95a-d41e6516ad1f", APIVersion:"minio.min.io/v2", ResourceVersion:"348247", FieldPath:""}): type: 'Warning' reason: 'UsersCreatedFailed' Users creation failed: Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:55:01.181291       1 monitoring.go:129] 'minio-tenant-1/minio-tenant-1' Failed to get cluster health: Get "http://minio.minio-tenant-1.svc.cluster.local/minio/health/cluster": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:55:04.199435       1 monitoring.go:129] 'minio-tenant-1/minio-tenant-1' Failed to get cluster health: Get "http://minio.minio-tenant-1.svc.cluster.local/minio/health/cluster": dial tcp 10.99.200.221:80: connect: connection refused
E0324 19:55:05.725779       1 main-controller.go:618] error syncing 'minio-tenant-1/minio-tenant-1': Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:55:05.725797       1 event.go:285] Event(v1.ObjectReference{Kind:"Tenant", Namespace:"minio-tenant-1", Name:"minio-tenant-1", UID:"0b1bfd35-1cf9-4328-b95a-d41e6516ad1f", APIVersion:"minio.min.io/v2", ResourceVersion:"348247", FieldPath:""}): type: 'Warning' reason: 'UsersCreatedFailed' Users creation failed: Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:55:09.428632       1 monitoring.go:129] 'minio-tenant-1/minio-tenant-1' Failed to get cluster health: Get "http://minio.minio-tenant-1.svc.cluster.local/minio/health/cluster": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:55:17.817331       1 monitoring.go:129] 'minio-tenant-1/minio-tenant-1' Failed to get cluster health: Get "http://minio.minio-tenant-1.svc.cluster.local/minio/health/cluster": dial tcp 10.99.200.221:80: connect: connection refused
E0324 19:56:05.828596       1 main-controller.go:618] error syncing 'minio-tenant-1/minio-tenant-1': Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:56:05.828627       1 event.go:285] Event(v1.ObjectReference{Kind:"Tenant", Namespace:"minio-tenant-1", Name:"minio-tenant-1", UID:"0b1bfd35-1cf9-4328-b95a-d41e6516ad1f", APIVersion:"minio.min.io/v2", ResourceVersion:"348247", FieldPath:""}): type: 'Warning' reason: 'UsersCreatedFailed' Users creation failed: Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused
E0324 19:57:05.900952       1 main-controller.go:618] error syncing 'minio-tenant-1/minio-tenant-1': Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused
I0324 19:57:05.901063       1 event.go:285] Event(v1.ObjectReference{Kind:"Tenant", Namespace:"minio-tenant-1", Name:"minio-tenant-1", UID:"0b1bfd35-1cf9-4328-b95a-d41e6516ad1f", APIVersion:"minio.min.io/v2", ResourceVersion:"348247", FieldPath:""}): type: 'Warning' reason: 'UsersCreatedFailed' Users creation failed: Put "http://minio.minio-tenant-1.svc.cluster.local/minio/admin/v3/add-user?accessKey=0k95Hbreiyn4qor1": dial tcp 10.99.200.221:80: connect: connection refused

$ k get netpol -A
No resources found
$ k minio version
v4.5.8

I'm ripping my hair out right now; I'm sure I'm missing something ridiculously stupid. Any help would be greatly appreciated, thank you!

Asked by Trevor Donahue
1 Answer

This issue was solved in version 5.x.x of the MinIO Operator: it no longer uses a webhook to obtain MINIO_ARGS; instead, a sidecar provides those arguments, and it does not fail this way. Please try with the latest Operator version available. Setting this answer as Community wiki for people to improve accordingly.
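
If you installed the Operator with the kubectl-minio (krew) plugin, the upgrade is roughly the following (exact steps are in the MinIO docs and differ for Helm or kustomize installs):

$ kubectl krew upgrade minio
$ kubectl minio init

For a Helm install it would be along the lines of (substitute your own release name):

$ helm repo add minio-operator https://operator.min.io
$ helm upgrade -n minio-operator <release-name> minio-operator/operator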

Answered by Cesar Celis


