I allocated resources to only one pod, with a 650 MB memory limit (about 30% of the node; together with the other built-in pods, the total memory limit comes to only 69%).
However, while the pod is handling a task, its own usage stays within 650 MB, but the overall usage of the node reaches 94%.
Why does this happen, when the upper limit is supposed to be 69%? Is it because the other built-in pods have no limit set? How can I prevent this, since my pod sometimes errors out when memory usage goes above 100%?
My allocation settings (from kubectl describe nodes):
Memory usage of the Kubernetes node and pods when idle (from kubectl top nodes and kubectl top pods):
Memory usage of the Kubernetes node and pods when running a task (from kubectl top nodes and kubectl top pods):
Further tested behaviour:
1. Prepare a deployment, pods, and a service under the namespace test-ns.
2. Since only kube-system and test-ns have pods, assign a default limit of 1000Mi to each of them (visible in kubectl describe nodes), aiming to stay below 2 GB in total.
3. Assuming the memory used in kube-system and test-ns stays below 2 GB, which is less than 100%, why can the memory usage reach 106%? (See the sketch after the .yaml file below for verifying the per-namespace totals.)
In the .yaml file:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-mem-limit
  namespace: test-ns
spec:
  limits:
  - default:
      memory: 1000Mi
    type: Container
---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-mem-limit
  namespace: kube-system
spec:
  limits:
  - default:
      memory: 1000Mi
    type: Container
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: devops-deployment
  namespace: test-ns
  labels:
    app: devops-pdf
spec:
  selector:
    matchLabels:
      app: devops-pdf
  replicas: 2
  template:
    metadata:
      labels:
        app: devops-pdf
    spec:
      containers:
      - name: devops-pdf
        image: dev.azurecr.io/devops-pdf:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 3000
        resources:
          requests:
            cpu: 600m
            memory: 500Mi
          limits:
            cpu: 600m
            memory: 500Mi
      imagePullSecrets:
      - name: regcred
---
apiVersion: v1
kind: Service
metadata:
  name: devops-pdf
  namespace: test-ns
spec:
  type: LoadBalancer
  ports:
  - port: 8007
  selector:
    app: devops-pdf
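To verify the per-namespace totals behind step 3, one option is to sum the kubectl top output; a minimal sketch, assuming the metrics server is installed and memory is reported with an Mi suffix:
# Sum the reported memory usage (third column, e.g. "123Mi") of all pods in a namespace
kubectl top pods -n kube-system --no-headers | awk '{sum += $3} END {print sum " Mi"}'
kubectl top pods -n test-ns --no-headers | awk '{sum += $3} END {print sum " Mi"}'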
If "Allocatable" is used to a node total memory size, under high memory pressure or pre-reserved memory value is bigger, the "MEMORY%" can be bigger than 100%. For suppressing the confusing, add a option to show node real memory usage based on "Capacity". * Reference: kubernetes#86499
On the other hand, if the memory use was sudden or unexpected, it may indicate a memory leak and you should start debugging immediately. Remember that Kubernetes killing a pod like that is a good thing: it protects the other pods running on the same node.
Kubernetes uses memory requests to determine on which node to schedule the pod. For example, on a node with 8 GB of free RAM, Kubernetes will schedule ten pods with 800 MB memory requests, five pods with 1600 MB requests, one pod with an 8 GB request, and so on.
In Kubernetes, CPU is not assigned in percentages, but in thousandths of a core (also called millicores or millicpu). One CPU is equal to 1000 millicores. If you wish to assign a third of a CPU, you should assign 333m (millicores) to your container. Memory is a bit more straightforward: it is measured in bytes.
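As a small illustration of these units (the pod name and image below are placeholders, not taken from the question):
apiVersion: v1
kind: Pod
metadata:
  name: resource-units-demo    # hypothetical name
spec:
  containers:
  - name: demo
    image: nginx               # placeholder image
    resources:
      requests:
        cpu: 333m              # roughly one third of a CPU core
        memory: 800Mi          # memory is given in bytes; Ki/Mi/Gi suffixes are accepted
      limits:
        cpu: 333m
        memory: 800Mi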
This effect is most likely caused by the 4 pods that run on that node without a memory limit specified, shown as 0 (0%). Of course, 0 doesn't mean they can't use even a single byte of memory, since no program can start without using memory; rather, it means there is no limit and they can use as much as is available. Also, programs running outside of pods (ssh, cron, ...) are included in the total used figure, but are not limited by Kubernetes (by cgroups).
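One way to spot which pods have no memory limit (they show up as <none>); a sketch, using kube-system as the namespace:
# List each kube-system pod together with the memory limits of its containers
kubectl get pods -n kube-system \
  -o custom-columns=NAME:.metadata.name,MEMORY_LIMIT:.spec.containers[*].resources.limits.memory

# The "Allocated resources" table here shows the same 0 (0%) entries per node
kubectl describe node <node-name>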
Now Kubernetes sets up the kernel OOM adjustment values in a tricky way to favour containers that are under their memory request, making it more likely to kill processes in containers that are between their memory request and limit, and most likely to kill processes in containers with no memory limits. However, this only works out fairly in the long run, and sometimes the kernel can kill your favourite process in your favourite container even when it is behaving well (using less than its memory request). See https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#node-oom-behavior
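The OOM scoring follows the pod's QoS class (Guaranteed, Burstable, or BestEffort), which you can inspect directly; the pod name below is a placeholder:
# Guaranteed pods (requests == limits for every container) are the last to be OOM-killed;
# BestEffort pods (no requests or limits at all) are the first.
kubectl get pod <pod-name> -n test-ns -o jsonpath='{.status.qosClass}{"\n"}'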
The pods without a memory limit in this particular case come from the AKS system itself, so setting their memory limit in the pod templates is not an option, as there is a reconciler that will restore it (eventually). To remedy the situation, I suggest that you create a LimitRange object in the kube-system namespace that will assign a memory limit to all pods created without one:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-mem-limit
  namespace: kube-system
spec:
  limits:
  - default:
      memory: 150Mi
    type: Container
(You will need to delete the already existing pods without a memory limit for this to take effect; they will be recreated.)
This will not completely eliminate the problem, as you might still end up with an overcommitted node; however, the memory usage will make sense and the OOM events will be more predictable.
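A sketch of how this might be applied and checked (the manifest file name is a placeholder; deleted kube-system pods are recreated by their controllers):
# Create the LimitRange in kube-system
kubectl apply -f limitrange-kube-system.yaml

# Confirm the default memory limit is registered
kubectl describe limitrange default-mem-limit -n kube-system

# Delete an existing pod that has no memory limit so its replacement picks up the default
kubectl delete pod <pod-name> -n kube-system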