
How can I gracefully migrate open WebSocket connections after a deployment slot swap?

When a user logs in to my web app, their browser opens a WebSocket to the server so that updates can be pushed down to the browser.

It's an ASP.NET Core web app (self-hosted) running in Azure App Service. I'd like to use Azure's deployment slot swapping feature to push code updates to production with zero downtime.

In the limited testing I've done, it looks like after a slot swap the WebSocket connection stays open to the original slot the browser was connected to. (So if the browser's WebSocket was connected to slot A, and we swapped slots A and B so that new connections go to slot B, the WebSocket would still be open to the app running on slot A.)

At some point the old slot will be taken offline, which will forcibly close any open WebSockets. I would prefer to re-open the WebSocket to the new slot gracefully and as soon as possible after the slot swap, so that if I update the WebSocket-related code, all clients switch to the new code promptly.

A sketch of how this might work:

  1. The slot swap takes place
  2. A notification is sent to the code running on the old slot
  3. Code running on the old slot pushes a WebSocket message telling each client to reconnect
  4. On receiving the message, the browser opens a new WebSocket connection (which will go to the new slot)
  5. When the new connection succeeds, the browser closes the old WebSocket
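The browser side of steps 3 to 5 might look roughly like the TypeScript sketch below. It is only an illustration under stated assumptions: the `"reconnect"` message text, the `SocketLike` interface, and the `HandoverClient` class are hypothetical names, not part of Azure or ASP.NET Core. In a browser, the `connect` factory would create (or wrap) a `WebSocket`.

```typescript
// Minimal socket shape so the sketch can run outside a browser (assumption).
interface SocketLike {
  onopen: (() => void) | null;
  onmessage: ((ev: { data: unknown }) => void) | null;
  close(): void;
}

// Hypothetical control message pushed by the old slot (step 3).
const RECONNECT_MESSAGE = "reconnect";

class HandoverClient {
  private current: SocketLike | null = null;
  private handoverPending = false;
  private readonly connect: () => SocketLike;
  private readonly onData: (data: unknown) => void;

  constructor(connect: () => SocketLike, onData: (data: unknown) => void) {
    this.connect = connect;
    this.onData = onData;
  }

  // Open the initial connection.
  start(): void {
    this.current = this.attach(this.connect());
  }

  // Route control messages to the handover logic and everything else to the app.
  private attach(socket: SocketLike): SocketLike {
    socket.onmessage = (ev) => {
      if (ev.data === RECONNECT_MESSAGE) this.beginHandover(socket);
      else this.onData(ev.data);
    };
    return socket;
  }

  // Step 4: open the replacement first. Step 5: close the old socket once it is open.
  private beginHandover(old: SocketLike): void {
    // Ignore requests from a stale socket or while a handover is already running.
    if (old !== this.current || this.handoverPending) return;
    this.handoverPending = true;
    const next = this.attach(this.connect());
    next.onopen = () => {
      this.current = next;
      this.handoverPending = false;
      old.close();
    };
  }
}
```

Both sockets are live for a short time, so the same update could in principle arrive on each of them; the application may need to tolerate duplicates. The sketch also omits error handling for a replacement connection that never opens.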

Is there a better way to do it?

How can the code running on the old slot know when it has been swapped?

Is handling this gracefully even possible? Or are there always going to be a bunch of race conditions?

Michael Kropat asked Mar 16 '18 12:03



1 Answer

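The WebSockets do stay connected, because ARR (Application Request Routing, which sits in front of the app) can only direct new requests to the "new" application; connections that are already established are not moved. You'd see the same behavior if you were downloading a large file in the middle of a swap, for example.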

The way I've handled this is to have my deployment system (Octopus) do the swap, and then make a request to the "old" application that notifies all of the connected WebSocket clients that they need to disconnect and reconnect. The clients will immediately disconnect, pick a random delay (to prevent thousands of reconnects at once) and then reconnect after that delay.
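A minimal TypeScript sketch of the client-side "disconnect, wait a random delay, reconnect" step follows. The function names, the uniform distribution, and the injectable `random` and `schedule` parameters are assumptions made for illustration; the answer does not say how the delay is chosen or what upper bound is used.

```typescript
// Pick a uniformly random whole-millisecond delay in [0, maxDelayMs).
function pickReconnectDelayMs(
  maxDelayMs: number,
  random: () => number = Math.random,
): number {
  return Math.floor(random() * maxDelayMs);
}

// Disconnect immediately, then reconnect after a random delay so that many
// clients notified at the same moment do not all reconnect at once.
// Returns the chosen delay in milliseconds.
function reconnectAfterRandomDelay(
  disconnect: () => void,
  reconnect: () => void,
  maxDelayMs: number,
  random: () => number = Math.random,
  schedule: (fn: () => void, ms: number) => void = (fn, ms) => {
    setTimeout(fn, ms);
  },
): number {
  disconnect();
  const delayMs = pickReconnectDelayMs(maxDelayMs, random);
  schedule(reconnect, delayMs);
  return delayMs;
}
```

A client might call `reconnectAfterRandomDelay(() => socket.close(), openSocket, 30_000)` when the notification arrives, where `socket` and `openSocket` are the application's own (hypothetical) connection handle and connect function. The upper bound trades reconnect load on the new slot against how long clients go without updates.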

Jeremy answered Oct 23 '22 01:10
