We are exposing a stateless OWIN Web API hosted on all nodes in our Service Fabric cluster (instance count -1) on Azure. The Web API is meant for public consumption and should be highly available, even during upgrades to the internal services and to the Web API itself. An Azure load balancer (LB) sits in front of the cluster and probes each node on port 80 with a TCP probe every 5 seconds to determine which nodes can receive HTTP traffic.
We are experiencing issues when upgrading the Web API: the LB keeps directing traffic to a node that is upgrading but that the probe has not yet marked as offline. Service Fabric does not coordinate the upgrade process with the LB, so there is no chance (and no API on the Azure LB) to take the node out of rotation while it is upgrading.
We are wondering how people achieve highly available HTTP services on Service Fabric on Azure. I'm hoping someone can comment on their general approach.
How about using HTTP probing in the Azure LB and adding a health check endpoint such as http://node:80/_health to the Web API? That way you can control whether a node should handle traffic.
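To make the idea concrete, here is a minimal sketch of such a health endpoint. It is not OWIN code; it uses Python's standard library purely to illustrate the pattern, and the names `HealthState`, `start_draining`, and `start_server` are hypothetical. It assumes the LB's HTTP probe treats a 200 response as healthy and any other status as unhealthy, which you should confirm against the Azure Load Balancer probe documentation.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthState:
    """Tracks whether this node should receive traffic (hypothetical helper)."""

    def __init__(self):
        self._draining = threading.Event()

    def start_draining(self):
        # Call this when the service is told to shut down or upgrade.
        self._draining.set()

    def is_healthy(self):
        return not self._draining.is_set()


def make_handler(state):
    """Build a request handler class bound to the given HealthState."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/_health":
                # 200 while serving normally, 503 once draining has begun.
                status = 200 if state.is_healthy() else 503
                body = b"OK" if status == 200 else b"DRAINING"
            else:
                status, body = 404, b"Not Found"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Silence per-request logging; probes arrive every few seconds.
            pass

    return Handler


def start_server(state, port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The point of the design is ordering: when the service receives its shutdown signal, it would first call `start_draining()`, keep answering real requests long enough for the probe to fail and the LB to stop routing to the node (with a 5-second probe interval that probably means at least a few intervals, depending on the configured unhealthy threshold), and only then close the listener.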