 

GKE - Auto repairing after adding a new nodepool

My GKE cluster started auto-repairing after I added a new node pool. The node pool size is 1 and the machine type is n1-standard-64. It has been in the repairing state for almost 30 minutes, and no other cluster operation can be performed until the repair is done.

Please help me out if any of you faced a similar issue and resolved it.

Screenshot of the repair info on the cluster page

Manoj vardhan reddy asked Oct 31 '25 09:10



1 Answer

GKE starts auto-repairing a node because it has detected that the node has been in an unhealthy state for longer than a given time threshold. An unhealthy state can mean:

  • A node reports a NotReady status on consecutive checks over the given time threshold (approximately 10 minutes).
  • A node does not report any status at all over the given time threshold (approximately 10 minutes).
  • A node's boot disk is out of disk space for an extended time period (approximately 30 minutes).
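To check the first condition by hand, you can look at the node status that kubectl reports. This is only a minimal sketch: `not_ready_nodes` is a hypothetical helper name, and it assumes the usual `kubectl get nodes` column layout (name first, status second). It only covers the NotReady case, not the missing-status or full-disk cases.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints the name and status of every node whose STATUS column does not
# start with "Ready" (for example NotReady or Unknown).
# Assumption: column 1 is the node name and column 2 is the status.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1, $2 }'
}

# Typical use against a live cluster (not run here):
#   kubectl get nodes --no-headers | not_ready_nodes
```

A node that keeps showing up in this output for around 10 minutes is a likely candidate for auto-repair. `kubectl describe node` on that node shows its conditions in more detail.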

If GKE detects that a node requires repair, the node is drained and re-created. GKE waits one hour for the drain to complete. If the drain doesn't complete, the node is shut down and a new node is created.

You can review the logs of the node being repaired to find the root cause.
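Besides the node logs, the cluster's operation history shows when a repair started and whether it has finished. A rough sketch, assuming your cluster is zonal and that repairs are recorded with the operation type `AUTO_REPAIR_NODES`. `list_repair_ops` is a hypothetical wrapper name, and the `GCLOUD` override exists only so you can preview the command without running it.

```shell
#!/bin/sh
# Hypothetical wrapper: lists auto-repair operations in one zone.
# Assumptions: $1 is the zone your cluster runs in, and GKE records node
# repairs under the operation type AUTO_REPAIR_NODES.
# Set GCLOUD=echo to print the arguments instead of calling gcloud.
list_repair_ops() {
  zone="$1"
  "${GCLOUD:-gcloud}" container operations list \
    --zone "$zone" \
    --filter="operationType=AUTO_REPAIR_NODES"
}

# Typical use (not run here), with compute-zone as a placeholder:
#   list_repair_ops compute-zone
```

Once you have an operation name from that list, `gcloud container operations describe` with the same `--zone` shows its status and start time.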

You can also disable auto-repair for the node pool by running the following command in Cloud Shell, or by following the console instructions in the GKE node auto-repair documentation:

gcloud container node-pools update pool-name --cluster cluster-name \
    --zone compute-zone \
    --no-enable-autorepair

Ismael Clemente Aguirre answered Nov 03 '25 07:11



