I'm currently running on AWS and use kube-aws/kube-spot-termination-notice-handler to intercept an AWS spot termination notice and gracefully evict the pods.
I'm reading this GKE documentation page and I see:
Preemptible instances terminate after 30 seconds upon receiving a preemption notice.
Going into the Compute Engine documentation, I see that an ACPI G2 Soft Off is sent 30 seconds before the termination happens, but this issue suggests that the kubelet itself doesn't handle it.
So, how does GKE handle preemption? Will the node do a drain/cordon operation or does it just do a hard shutdown?
What is the main reason customers choose Preemptible VMs? To reduce cost. The per-hour price of preemptible VMs incorporates a substantial discount.
Preemptible instances behave the same as regular compute instances, but the capacity is reclaimed when it's needed elsewhere, and the instances are terminated. If your workloads are fault-tolerant and can withstand interruptions, then preemptible instances can reduce your costs.
A Preemptible VM (PVM) is a Google Compute Engine (GCE) virtual machine (VM) instance that can be purchased at a steep discount as long as the customer accepts that the instance can be preempted at any time and will be terminated after at most 24 hours.
If your apps are fault-tolerant and can withstand possible instance preemptions, then preemptible instances can reduce your Compute Engine costs significantly. For example, batch processing jobs can run on preemptible instances.
Yes, you are right: so far there is no built-in way to handle the ACPI G2 Soft Off signal.
Note that while a normal preemptible instance supports shutdown scripts (where you could add some logic to cordon and drain the node), this is not the case when the instances are Kubernetes nodes:
Currently, preemptible VMs do not support shutdown scripts.
You can run some tests yourself; quoting the documentation again:
You can simulate an instance preemption by stopping the instance.
So far, if you stop the instance, even when it is a Kubernetes node, no action is taken to cordon/drain it and gracefully remove it from the cluster.
However, this feature is still in beta and therefore at an early stage of its life, and whether and how to introduce this behavior is currently a matter of discussion.
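If you want something closer to the AWS handler in the meantime, one possible workaround is to watch the Compute Engine metadata server yourself and trigger the drain from there. The following is only a minimal sketch, not something GKE provides: it assumes the process runs on the node (for example in a DaemonSet) with network access to the metadata server, that `kubectl` is on the path with permission to drain nodes, and that you pass in the node name yourself. The function names (`fetch_preempted`, `drain_command`, `watch_and_drain`) are hypothetical, and with only about 30 seconds of notice the drain may well not finish.

```python
import subprocess
import time
import urllib.request

# Compute Engine metadata key that flips from FALSE to TRUE on preemption.
# wait_for_change=true makes the request block until the value changes.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
    "?wait_for_change=true"
)


def fetch_preempted(url=METADATA_URL):
    """Return the current 'preempted' metadata value as text."""
    req = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode().strip()


def drain_command(node_name):
    """Build the kubectl command used to cordon and drain the node."""
    # Assumption: these flags suit your workloads; adjust as needed.
    return ["kubectl", "drain", node_name, "--ignore-daemonsets", "--force"]


def watch_and_drain(node_name, fetch=fetch_preempted, run=subprocess.run):
    """Poll the metadata server and drain the node once it is preempted."""
    while True:
        try:
            value = fetch()
        except OSError:
            # Network error or timeout on the hanging request: retry shortly.
            time.sleep(1)
            continue
        if value == "TRUE":
            run(drain_command(node_name), check=False)
            return True
```

You would call `watch_and_drain("<your-node-name>")` from a long-running process on each preemptible node. The `fetch` and `run` parameters exist only so the loop can be exercised without a real metadata server or cluster.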
Disclaimer: I work for Google Cloud Platform Support.