Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the conceptual difference between Service Discovery tools and Load Balancers that check node health?

Recently several service discovery tools have become popular/"mainstream", and I’m wondering under what primary use cases one should employ them instead of traditional load balancers.

With LBs, you cluster a bunch of nodes behind the balancer, and then clients make requests to the balancer, who then (typically) round robins those requests to all the nodes in the cluster.

With service discovery (Consul, ZK, etc.), you let a centralized “consensus” service determine what nodes for particular service are healthy, and your app connects to the nodes that the service deems as being healthy. So while service discovery and load balancing are two separate concepts, service discovery gives you load balancing as a convenient side effect.

But, if the load balancer (say HAProxy or nginx) has monitoring and health checks built into it, then you pretty much get service discovery as a side effect of load balancing! Meaning, if my LB knows not to forward a request to an unhealthy node in its cluster, then that’s functionally equivalent to a consensus server telling my app not to connect to an unhealty node.

So to me, service discovery tools feel like the “6-in-one,half-dozen-in-the-other” equivalent to load balancers. Am I missing something here? If someone had an application architecture entirely predicated on load balanced microservices, what is the benefit (or not) to switching over to a service discovery-based model?

like image 311
smeeb Avatar asked Sep 01 '15 14:09

smeeb


2 Answers

Load balancers typically need the endpoints of the resources it balances the traffic load. With the growth of microservices and container based applications, runtime created dynamic containers (docker containers) are ephemeral and doesnt have static end points. These container endpoints are ephemeral and they change as they are evicted and created for scaling or other reasons. Service discovery tools like Consul are used to store the endpoints info of dynamically created containers (docker containers). Tools like consul-registrator running on container hosts registers container end points in service discovery tools like consul. Tools like Consul-template will listen for changes to containers end points in consul and update the load balancer (nginx) for sending the traffic to. Thus both Service Discovery Tools like Consul and Load Balancing tools like Nginx co-exist to provide runtime service discovery and load balancing capability respectively.

Follow up: what are the benefits of ephemeral nodes (ones that come and go, live and die) vs. "permanent" nodes like traditional VMs?

[DDG]: Things that come quickly to my mind: Ephemeral nodes like docker containers are suited for stateless services like APIs etc. (There is traction for persistent containers using external volumes - volume drivers etc)

  1. Speed: Spinning up or destroying ephemeral containers (docker containers from image) takes less than 500 milliseconds as opposed to minutes in standing up traditional VMs

  2. Elastic Infrastructure: In the age of cloud we want to scale out and in according to users demand which implies there will be be containers of ephemeral in nature (cant hold on to IPs etc). Think of a markerting campaign for a week for which we expect 200% increase in traffic TPS, quickly scale with containers and then post campaign, destroy it.

  3. Resource Utilization: Data Center or Cloud is now one big computer (compute cluster) and containers pack the compute cluster for max resource utilization and during weak demand destroy the infrastructure for lower bill/resource usage.

Much of this is possible because of lose coupling with ephemeral containers and runtime discovery using service discovery tool like consul. Traditional VMs and tight binding of IPs can stifle this capability.

like image 70
DDG Avatar answered Sep 21 '22 20:09

DDG


Note that the two are not necessarily mutually exclusive. It is possible, for example, that you might still direct clients to a load balancer (which might perform other roles such as throttling) but have the load balancer use a service registry to locate instances.

Also worth pointing out that service discovery enables client-side load balancing i.e. the client can invoke the service directly without the extra hop through the load balancer. My understanding is that this was one of the reasons that Netflix developed Eureka, to avoid inter-service calls having to go out and back through the external ELB for which they would have had to pay. Client-side load balancing also provides a means for the client to influence the load-balancing decision based on its own perspective of service availability.

like image 33
David Currie Avatar answered Sep 17 '22 20:09

David Currie