I start OpenShift locally using
oc cluster up
Then I create a pod using hello-pod.json with this command:
oc create -f examples/hello-openshift/hello-pod.json
The pod is created but it can't start. OpenShift shows an error:
Reason: Failed Scheduling
Message: 0/1 nodes are available: 1 NodeUnderDiskPressure.
I still have plenty of free space on my hard drive. I don't know where else to look for logs. How do I fix the problem?
In my case an adjustment of node-config.yaml fixed the issue:
1) Search for the generated file node-config.yaml, e.g. under /var/lib/origin/ or your custom config path.
2) Open it in an editor, search for kubeletArguments, and add your desired disk eviction policy:
kubeletArguments:
  eviction-hard:
  - memory.available<100Mi
  - nodefs.available<1%
  - nodefs.inodesFree<1%
  - imagefs.available<1%
A detailed description can be found here: OpenShift Documentation - Default Hard Eviction Thresholds
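Note that the kubelet only applies the new eviction thresholds after a restart. As a minimal sketch (assuming you keep the edited config by starting with the --use-existing-config flag; flags and paths can differ between origin versions), restarting the local cluster looks like:
$ oc cluster down
$ oc cluster up --use-existing-config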
Basically I just had to wipe the Docker filesystem and the Kubernetes configuration in my home user directory and start fresh:
$ oc cluster down
$ sudo systemctl stop docker
$ sudo rm -rf /var/lib/docker
$ rm -rf ~/.kube
$ sudo systemctl start docker
$ oc cluster up
DONE! -- I was able to create pods after this.
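To verify that the node actually recovered, you can re-check the DiskPressure condition (a quick sketch using the same commands shown further down, just filtered with grep):
$ oc login -u system:admin
$ kubectl describe node localhost | grep DiskPressure
The status should be False on a healthy node.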
Here are some other things I tried while diagnosing the same NodeUnderDiskPressure condition; they might help you if the steps above don't solve the problem:
First I retrieved the available nodes with kubectl:
$ oc login -u system:admin
$ kubectl get nodes
NAME STATUS AGE VERSION
localhost Ready 12h v1.7.6+a08f5eeb62
Next I retrieved the description for the localhost
node:
$ kubectl describe node localhost
Name: localhost
Role:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=localhost
Annotations: volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: <none>
CreationTimestamp: Mon, 05 Mar 2018 20:00:20 -0600
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Tue, 06 Mar 2018 08:09:03 -0600 Mon, 05 Mar 2018 20:00:20 -0600 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Tue, 06 Mar 2018 08:09:03 -0600 Mon, 05 Mar 2018 20:00:20 -0600 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure True Tue, 06 Mar 2018 08:09:03 -0600 Mon, 05 Mar 2018 20:00:31 -0600 KubeletHasDiskPressure kubelet has disk pressure
Ready True Tue, 06 Mar 2018 08:09:03 -0600 Mon, 05 Mar 2018 20:00:31 -0600 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.0.14
Hostname: localhost
Capacity:
cpu: 4
memory: 16311024Ki
pods: 40
Allocatable:
cpu: 4
memory: 16208624Ki
pods: 40
System Info:
Machine ID: 6895f77789824d26acef6d0db236319f
System UUID: 248A664C-33F8-11B2-A85C-FC31558EDC86
Boot ID: 1a5cc22b-81f1-4b07-b26f-917a7d17936f
Kernel Version: 4.13.16-100.fc25.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.12.6
Kubelet Version: v1.7.6+a08f5eeb62
Kube-Proxy Version: v1.7.6+a08f5eeb62
ExternalID: localhost
Non-terminated Pods: (0 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
0 (0%) 0 (0%) 0 (0%) 0 (0%)
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
12h 8m 2877 kubelet, localhost Warning EvictionThresholdMet Attempting to reclaim imagefs
11h 3m 136 kubelet, localhost Warning ImageGCFailed (combined from similar events): wanted to free 3113113190 bytes, but freed 0 bytes space with errors in image deletion: [rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete 933861786d39 (must be forced) - image is being used by stopped container 82eca7ad6fd6"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete bcccfe5352d3 (must be forced) - image is being used by stopped container 9c4ad3dc4b80"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete b7b0dbc4f785 (must be forced) - image is being used by stopped container d388fa17ff84"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete 0129e5e73319 (cannot be forced) - image has dependent child images"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete 725dcfab7d63 (must be forced) - image is being used by stopped container 9eb3a771aa6f"}, rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete 8ec432b4cda3 (cannot be forced) - image is being used by running container a3fe6da22775"}]
There are a few things to note:
The DiskPressure condition status is True.
In the warning events I can see EvictionThresholdMet ("Attempting to reclaim imagefs") and ImageGCFailed with details about the images that can't be disposed of. Here is a formatted version of the ImageGCFailed message in my case:
(combined from similar events):wanted to free 3113113190 bytes,
but freed 0 bytes space with errors in image deletion:[
rpc error: code = 2 desc = Error response from daemon:{
"message":"conflict: unable to delete 933861786d39 (must be forced) - image is being used by stopped container 82eca7ad6fd6"
},
rpc error: code = 2 desc = Error response from daemon:{
"message":"conflict: unable to delete bcccfe5352d3 (must be forced) - image is being used by stopped container 9c4ad3dc4b80"
},
rpc error: code = 2 desc = Error response from daemon:{
"message":"conflict: unable to delete b7b0dbc4f785 (must be forced) - image is being used by stopped container d388fa17ff84"
},
rpc error: code = 2 desc = Error response from daemon:{
"message":"conflict: unable to delete 0129e5e73319 (cannot be forced) - image has dependent child images"
},
rpc error: code = 2 desc = Error response from daemon:{
"message":"conflict: unable to delete 725dcfab7d63 (must be forced) - image is being used by stopped container 9eb3a771aa6f"
},
rpc error: code = 2 desc = Error response from daemon:{
"message":"conflict: unable to delete 8ec432b4cda3 (cannot be forced) - image is being used by running container a3fe6da22775"
}
]
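If you want to see which images those IDs refer to, you can look them up by ID (the IDs below are from my events; substitute your own):
$ docker images | grep 933861786d39
$ docker inspect --format '{{.RepoTags}}' 933861786d39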
Based on this information (https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#reclaiming-node-level-resources), I then investigated the available containers and tried to remove them manually:
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a3fe6da22775 openshift/origin:v3.7.1 "/usr/bin/openshift s" 12 hours ago Up 12 hours origin
82eca7ad6fd6 dtf-bpms/nodejs-mongo-persistent-2:4e90f728 "/bin/sh -ic 'npm sta" 3 months ago Exited (137) 3 months ago openshift_s2i-build_nodejs-mongo-persistent-2_dtf-bpms_post-commit_fe89fcfd
9c4ad3dc4b80 dtf-bpms/nodejs-mongo-persistent-2:4e23c7d5 "/bin/sh -ic 'npm tes" 3 months ago Exited (137) 3 months ago openshift_s2i-build_nodejs-mongo-persistent-2_dtf-bpms_post-commit_de141bcd
d388fa17ff84 dtf-bpms/nodejs-mongo-persistent-1:439d35ea "/bin/sh -ic 'npm tes" 3 months ago Exited (137) 3 months ago openshift_s2i-build_nodejs-mongo-persistent-1_dtf-bpms_post-commit_277b19ca
9eb3a771aa6f hello-world "/hello" 3 months ago Exited (0) 3 months ago serene_babbage
Next I manually deleted all stopped containers:
$ docker rm $(docker ps -a -q)
82eca7ad6fd6
9c4ad3dc4b80
d388fa17ff84
9eb3a771aa6f
Error response from daemon: You cannot remove a running container a3fe6da22775a559fe94ab0eb5f52d55d9aca6d1f950f107d13243fa029e071f. Stop the container before attempting removal or use -f
In this case it is fine to keep the OpenShift container running.
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
a3fe6da22775 openshift/origin:v3.7.1 "/usr/bin/openshift s" 12 hours ago Up 12 hours origin
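Since ImageGCFailed complains about images rather than containers, another thing you could try at this point (a hedged extra step, not something I needed in the end) is removing dangling images once the stopped containers that pinned them are gone; docker image prune does not exist on Docker 1.12, so the older form applies:
$ docker rmi $(docker images -f dangling=true -q)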
Next I restarted OpenShift and Docker, tried to create my containers again, and described the localhost node:
$ oc cluster down
$ sudo systemctl restart docker
$ oc cluster up
... (wait for the cluster to start)
$ [CREATE PROJECT AND CONTAINERS]
$ oc login -u system:admin
$ kubectl describe node localhost
... (node description and header information)
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 1h 2 kubelet, localhost Normal NodeHasSufficientMemory Node localhost status is now: NodeHasSufficientMemory
1h 1h 2 kubelet, localhost Normal NodeHasNoDiskPressure Node localhost status is now: NodeHasNoDiskPressure
1h 1h 1 kubelet, localhost Normal NodeAllocatableEnforced Updated Node Allocatable limit across pods
1h 1h 2 kubelet, localhost Normal NodeHasSufficientDisk Node localhost status is now: NodeHasSufficientDisk
1h 1h 1 kubelet, localhost Normal NodeReady Node localhost status is now: NodeReady
1h 1h 1 kubelet, localhost Normal NodeHasDiskPressure Node localhost status is now: NodeHasDiskPressure
1h 1h 1 kubelet, localhost Warning ImageGCFailed wanted to free 2934625894 bytes, but freed 0 bytes space with errors in image deletion: rpc error: code = 2 desc = Error response from daemon: {"message":"conflict: unable to delete 8ec432b4cda3 (cannot be forced) - image is being used by running container 4bcd2196747c"}
You can see that the node still reported NodeHasDiskPressure even after the old, unused containers were cleaned up, and the images still could not be released (see the ImageGCFailed event above). This is where the next step was to delete the old, dirty Docker filesystem and start with a fresh one, as described at the beginning of this answer.
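One final note on diagnosis: DiskPressure does not necessarily mean your root filesystem is full. The kubelet evaluates nodefs and imagefs separately, for both bytes and inodes, so it is worth checking the Docker storage location directly (a sketch; the path may differ on your setup):
$ df -h /var/lib/docker
$ df -i /var/lib/docker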