
Impossible to do a flawless Azure VIP swap without any SQL errors

Tags:

azure

Even when the site is under a light load of around 5+ users, every VIP swap produces SQL errors such as:

System.Data.Entity.Core.EntityCommandExecutionException: An error occurred while executing the command definition. See the inner exception for details. ---> System.Data.SqlClient.SqlException: A transport-level error has occurred when receiving results from the server. (provider: Session Provider, error: 19 - Physical connection is not usable)

I am using the SqlAzureExecutionStrategy. Other requests during the VIP swap also take around 30 seconds, which makes the site effectively unresponsive, even though the staging environment is already fully warmed up.

  • Is there any way to prevent this?
  • Why don't I read much about this behaviour? Do others simply not care about a few broken requests and the slow 30-second window, or am I missing something essential in my SQL / EF settings?
Dirk Boer asked Oct 19 '22 21:10



2 Answers

You are literally swapping out live code while the site is actively in production, so you are definitely going to get a variety of errors. There will always be a split second where the services do not exist. The same thing happens when you deploy a new SQL schema.

What you need to do is accept that this is going to happen and engineer around it. A strong DevOps team really shines here. What I personally do is staged deployments across different geographies: I have Traffic Manager point all traffic to the west coast web servers while I upgrade the east coast, then point all traffic to the east coast while I upgrade the west coast. Finally, I set traffic management back to normal.

To shift traffic from east to west, a single Traffic Manager name is typically set up in front of the regional endpoints. Behind the scenes it resolves to the correct endpoint, much like a proxy would (although it works at the DNS level rather than forwarding requests itself). Just remove the endpoint you are about to upgrade, which sends new traffic to the endpoint that is still live. When the upgrade is done, add the endpoint back in.
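As a rough illustration of the order of operations described above (drain one region, upgrade it, put it back, then move on), here is a minimal Python sketch. It does not call Traffic Manager or any Azure API. The `endpoints` dict and the `upgrade` callback are hypothetical stand-ins for "is this endpoint enabled in the profile" and "deploy the new build to this region", so treat it as a model of the sequence rather than a deployment script.

```python
from typing import Callable, Dict, List


def rolling_upgrade(endpoints: Dict[str, bool],
                    upgrade: Callable[[str], None]) -> List[str]:
    """Upgrade endpoints one at a time, never draining the last live one.

    `endpoints` maps a (hypothetical) endpoint name to an "enabled" flag.
    `upgrade` is a caller-supplied function that deploys to one endpoint.
    Returns a log of the steps that were taken, in order.
    """
    log: List[str] = []
    for name in list(endpoints):
        # Only drain this endpoint if some other endpoint can take the traffic.
        others_live = [n for n, enabled in endpoints.items()
                       if enabled and n != name]
        if not others_live:
            raise RuntimeError(
                f"refusing to drain {name}: no other live endpoint")

        endpoints[name] = False          # take it out of rotation
        log.append(f"drain {name}")

        # If upgrade() raises, the endpoint stays out of rotation on purpose,
        # so a half-upgraded region does not receive traffic.
        upgrade(name)
        log.append(f"upgrade {name}")

        endpoints[name] = True           # put it back into rotation
        log.append(f"restore {name}")
    return log


if __name__ == "__main__":
    profile = {"east": True, "west": True}
    for step in rolling_upgrade(profile, lambda name: None):
        print(step)
```

The guard against draining the last live endpoint is the important part: with only one region there is nowhere to send traffic, and you are back to the original swap problem.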

You should also ensure that your code handles errors well. For example, if you sync with external systems and a single record is entered into your system but not the external one during an upgrade, can the external system's state be rebuilt from your data, or vice versa?
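To make the "can the state be rebuilt" question concrete, here is a minimal Python sketch of a reconciliation pass, assuming both sides can be viewed as simple id-to-record mappings. The `reconcile` function and the dict-based stores are hypothetical; a real system would read from your database and call the external system's API instead.

```python
from typing import Dict, List


def reconcile(local: Dict[str, dict], external: Dict[str, dict]) -> List[str]:
    """Push every local record that is missing or different externally.

    Treats `local` as the source of truth. Returns the ids that were pushed,
    in sorted order. Running it twice in a row pushes nothing the second
    time, so it is safe to re-run after a failed or interrupted upgrade.
    """
    pushed: List[str] = []
    for record_id, record in sorted(local.items()):
        if external.get(record_id) != record:
            external[record_id] = dict(record)  # copy so the stores stay independent
            pushed.append(record_id)
    return pushed


if __name__ == "__main__":
    ours = {"1": {"name": "a"}, "2": {"name": "b"}}
    theirs = {"1": {"name": "a"}}  # record "2" was lost during the swap
    print(reconcile(ours, theirs))
    print(reconcile(ours, theirs))
```

Making the pass idempotent is the design choice that matters here: you can run it after every deployment without worrying about whether anything actually went wrong.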

David Crook answered Oct 22 '22 23:10



I finally took the step of switching from Cloud Services to Azure Web Apps, and all of these problems have been solved, along with a few others.

It took a few days, but the switch was definitely worth it.

Paying for accidentally leaving a staging server running is also a problem of the past.

Dirk Boer answered Oct 23 '22 00:10
