About once a week, our App Engine flexible environment Node.js app goes offline and the following line appears in the logs: "Restarting batch of VMs for version 20181008t134234 as part of rolling restart."
We have our app set to automatic scaling with the following settings:
    runtime: nodejs
    env: flex
    beta_settings:
      cloud_sql_instances: tuzag-v2:us-east4:tuzag-db
    automatic_scaling:
      min_num_instances: 1
      max_num_instances: 3
    liveness_check:
      path: "/"
      check_interval_sec: 30
      timeout_sec: 4
      failure_threshold: 2
      success_threshold: 2
    readiness_check:
      path: "/"
      check_interval_sec: 15
      timeout_sec: 4
      failure_threshold: 2
      success_threshold: 2
      app_start_timeout_sec: 300
    resources:
      cpu: 1
      memory_gb: 1
      disk_size_gb: 10
I understand the rolling restarts of GCP/GAE, but I am confused as to why Google isn't spinning up another VM before taking our primary one offline. Do we have to run with a minimum of 2 instances to prevent this from happening? Is there a way I can configure my app.yaml to make sure another instance is spun up before it reboots the only running instance? After the reboot finishes, everything comes back online fine, but there are still 10 minutes of downtime, which isn't acceptable, especially considering we can't control when it reboots.
Method 2: Stop Instance. Go into App Engine, Versions, and then click on STOP. Method 2 does not work. Stop is disabled and there is a tooltip saying that stop cannot be done for versions with manual or dynamic scaling.
A rolling restart, or ripple start, is an operation typically performed on applications that are deployed across multiple JVMs or application servers (for example, in a cluster) to incrementally stop and start the application on each JVM.
Depending on how many websites use the same AppPool, an AppDomain is loaded into the worker process for each application. So restarting one application does not mean the others that use the same worker process will be restarted as well. The AppDomain of the application that has been recycled is unloaded and reloaded into the worker process. This causes the loss of all in-memory data.
However, Global.asax provides application events to which you can add handlers and perform actions such as logging, in order to track application stops and starts caused by AppDomain recycling. The Application_Init event is fired when an application initializes the first time.
It is expected behaviour that App Engine flexible instances are restarted on a weekly basis. Provided that health checks are properly configured and are not the issue, the recommendation is, indeed, to set up a minimum of two instances.
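As a minimal sketch of that recommendation, only the automatic_scaling block of the app.yaml from the question needs to change. The max_num_instances value of 3 is carried over from the question; whether two instances fully remove the downtime depends on the restart taking them down one at a time, which is an assumption here and not something this snippet guarantees.

```yaml
# Fragment of app.yaml: only the scaling block changes, the rest stays as in the question.
automatic_scaling:
  # Keep at least two VMs running so one can keep serving traffic
  # while the other is restarted (assumes restarts are staggered).
  min_num_instances: 2
  max_num_instances: 3
```

Note that running a second instance at all times roughly doubles the baseline instance cost compared with a minimum of 1.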
There is no alternative functionality in App Engine Flex, as far as I am aware, that raises a new instance to avoid downtime as a result of a weekly restart. You could try to run directly on Google Compute Engine instead of App Engine and manage updates and maintenance yourself; perhaps that would suit your purpose better.