In our Kubernetes cluster, we are running into sporadic situations where a cluster node runs out of memory and Linux invokes the OOM killer. Looking at the logs, it appears that the Pods scheduled onto the Node are requesting more memory than the Node can allocate.
The issue is that, when the OOM killer is invoked, it prints out a list of processes and their memory usage. However, as all of our Docker containers are Java services, the "process name" just appears as "java", which does not let us track down which particular Pod is causing the issue.
How can I get the history of which Pods were scheduled to run on a particular Node and when?
To see all the Pods on a node, you can use kubectl get events or run docker ps -a on the node itself, as mentioned in the other answers/comments, and correlate the container names with your Pods' containers.
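For example (node-1 is a placeholder node name; adjust it and the namespace filter for your cluster), a rough sketch of both approaches:

# Recent Pod events (Scheduled, Killing, OOMKilling, ...); grep for the node of interest,
# since "Successfully assigned" messages include the target node name.
kubectl get events --all-namespaces --field-selector involvedObject.kind=Pod -o wide | grep node-1

# On the node itself: list all containers, including exited ones.
# With the Docker runtime, Kubernetes names containers k8s_<container>_<pod>_<namespace>_<uid>_<attempt>,
# so the Pod name can be read straight out of the container name.
docker ps -a --format '{{.Names}}\t{{.Status}}'

Note that kubectl get events only covers the event retention window (one hour by default), so it is a snapshot of recent history rather than a full audit trail.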
As with Pods, you can use kubectl describe node and kubectl get node -o yaml to retrieve detailed information about nodes.
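For instance (again using the placeholder node-1), kubectl describe node lists the non-terminated Pods currently scheduled on the node along with their memory requests and limits, which makes overcommitment easy to spot:

# Shows Allocatable resources plus a "Non-terminated Pods" table with per-Pod
# CPU/memory requests and limits and the total allocated percentages.
kubectl describe node node-1

# Full node object, including capacity, allocatable, and conditions such as MemoryPressure.
kubectl get node node-1 -o yaml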
When kubectl describe pod does not show any information about an error, we can turn to another kubectl command: logs. The kubectl logs command prints container logs and can also stream them in real time.
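A few common invocations (the pod and container names below are placeholders):

# Print the logs of a specific container in a Pod.
kubectl logs my-pod -c my-container

# Stream the logs in real time.
kubectl logs -f my-pod -c my-container

# Logs of the previous container instance, useful after an OOMKilled restart.
kubectl logs my-pod -c my-container --previous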
You can now use the kube-state-metrics metric kube_pod_container_status_terminated_reason to detect OOM events:
kube_pod_container_status_terminated_reason{reason="OOMKilled"}
kube_pod_container_status_terminated_reason{container="addon-resizer",endpoint="http-metrics",instance="100.125.128.3:8080",job="kube-state-metrics",namespace="monitoring",pod="kube-state-metrics-569ffcff95-t929d",reason="OOMKilled",service="kube-state-metrics"}
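As a sketch of how you might use this (assuming Prometheus scrapes kube-state-metrics and is reachable at the placeholder URL http://prometheus:9090), you can query the metric over time to see which Pods were OOM-killed and when:

# Instant query: containers whose last termination reason was OOMKilled.
curl -G 'http://prometheus:9090/api/v1/query' \
  --data-urlencode 'query=kube_pod_container_status_terminated_reason{reason="OOMKilled"} == 1'

# Range query over the last 24 hours to see when the terminations were recorded
# (uses GNU date for the timestamps).
curl -G 'http://prometheus:9090/api/v1/query_range' \
  --data-urlencode 'query=kube_pod_container_status_terminated_reason{reason="OOMKilled"}' \
  --data-urlencode "start=$(date -u -d '24 hours ago' +%s)" \
  --data-urlencode "end=$(date -u +%s)" \
  --data-urlencode 'step=300'

The same expression can be used as the basis for an alerting rule so that OOM kills are surfaced as they happen rather than reconstructed from node logs.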