
Rolling restarts are causing our App Engine app to go offline. Is there a way to change the config to prevent that from happening?

About once a week our App Engine flexible environment Node.js app goes offline and the following line appears in the logs: "Restarting batch of VMs for version 20181008t134234 as part of rolling restart." We have our app set to automatic scaling with the following settings:

runtime: nodejs
env: flex
beta_settings:
 cloud_sql_instances: tuzag-v2:us-east4:tuzag-db
automatic_scaling:
 min_num_instances: 1
 max_num_instances: 3
liveness_check:
 path: "/"
 check_interval_sec: 30
 timeout_sec: 4
 failure_threshold: 2
 success_threshold: 2
readiness_check:
 path: "/"
 check_interval_sec: 15
 timeout_sec: 4
 failure_threshold: 2
 success_threshold: 2
 app_start_timeout_sec: 300
resources:
 cpu: 1
 memory_gb: 1
 disk_size_gb: 10

I understand the rolling restarts of GCP/GAE, but am confused as to why Google isn't spinning up another VM before taking our primary one offline. Do we have to run with a minimum of 2 instances to prevent this from happening? Is there a way I can configure my app.yaml to make sure another instance is spun up before it reboots the only running instance? After the reboot finishes, everything comes back online fine, but there's still 10 minutes of downtime, which isn't acceptable, especially considering we can't control when it reboots.

Max Matthews asked Oct 29 '18 14:10




1 Answer

It is expected behaviour that App Engine flexible instances are restarted on a weekly basis. Provided that health checks are properly configured and are not the issue, the recommendation is, indeed, to set up a minimum of two instances.
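As a minimal sketch of that recommendation, assuming the rest of the app.yaml from the question stays unchanged, only the scaling block would need to change. The value of max_num_instances is simply carried over from the question, not a recommendation in itself:

```yaml
automatic_scaling:
  min_num_instances: 2  # raised from 1 so a second instance can serve while one restarts
  max_num_instances: 3  # unchanged from the question's config
```

Note that a second always-on instance roughly doubles the baseline instance cost, so this trades cost for availability.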

As far as I am aware, there is no alternative functionality in App Engine Flex that brings up a new instance first to avoid downtime during a weekly restart. You could try running directly on Google Compute Engine instead of App Engine and managing updates and maintenance yourself; perhaps that would suit your purpose better.

alextru answered Oct 16 '22 21:10