I am reading about the Spring Cloud and Netflix APIs. In many places, I come across the terms Fault Tolerance and Fault Resilience.
Please explain the difference.
These distinctions are important, because a fault-tolerant service can be regarded as suffering no downtime even if the machine it is running on crashes, whereas the potential data fault in a fault-resilient service counts toward downtime.
Microservice-based applications are resilient when they can continue operating if there is a failure or error in some part of the system. Fault tolerance helps applications fail fast and recover smoothly by guiding how and when certain requests occur and by providing fallback strategies to handle common errors.
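To make "fail fast" and "fallback" concrete, here is a minimal sketch in plain Java (9 or later). It does not use the Spring Cloud or Netflix APIs; the class name `FallbackDemo`, the method `callWithFallback`, and the sample strings are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of any library.
class FallbackDemo {

    // Runs the call asynchronously. If it has not finished within
    // timeoutMillis, or if it throws, the fallback value is returned.
    static String callWithFallback(Supplier<String> remoteCall,
                                   long timeoutMillis,
                                   String fallback) {
        return CompletableFuture.supplyAsync(remoteCall)
                // fail fast: stop waiting once the timeout elapses
                .completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS)
                // fallback strategy: any exception becomes the fallback value
                .exceptionally(ex -> fallback)
                .join();
    }

    public static void main(String[] args) {
        String ok = callWithFallback(
                () -> "live recommendations", 500, "cached recommendations");
        String failed = callWithFallback(
                () -> { throw new IllegalStateException("service down"); },
                500, "cached recommendations");
        System.out.println(ok);
        System.out.println(failed);
    }
}
```

The caller always gets an answer within roughly the timeout, so a slow or broken dependency cannot stall it. Note that the timed-out call keeps running in the background here; a real library would usually also cancel or isolate it.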
While high availability and fault tolerance are exclusively technology-centric, disaster recovery encompasses much more than just software/hardware elements. HA and FT focus on addressing the isolated failures in an IT system.
Fault tolerance refers to the ability of a system (computer, network, cloud cluster, etc.) to continue operating without interruption when one or more of its components fail.
Fault tolerance: The user sees no impact other than some delay while failover occurs.
Fault resilience: A failure is observed in some services, but the rest of the system continues to function normally.
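The two definitions above can be contrasted in a small plain-Java sketch. Everything here is hypothetical (the class `ToleranceVsResilience`, the methods `failover` and `renderPage`, the sample data); it only illustrates the distinction as this answer draws it, not how any particular framework implements it.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration, not a library API.
class ToleranceVsResilience {

    // Fault tolerance: redundant replicas are tried in turn, so the caller
    // still gets the full answer, only later than usual.
    static String failover(List<Supplier<String>> replicas) {
        RuntimeException last = null;
        for (Supplier<String> replica : replicas) {
            try {
                return replica.get();
            } catch (RuntimeException e) {
                last = e; // this replica failed; fail over to the next one
            }
        }
        throw new IllegalStateException("all replicas failed", last);
    }

    // Fault resilience: the reviews failure is visible to the user,
    // but the rest of the page is still produced.
    static String renderPage(Supplier<String> catalog, Supplier<String> reviews) {
        String reviewsPart;
        try {
            reviewsPart = reviews.get();
        } catch (RuntimeException e) {
            reviewsPart = "reviews unavailable";
        }
        return catalog.get() + " | " + reviewsPart;
    }

    public static void main(String[] args) {
        List<Supplier<String>> replicas = List.of(
                () -> { throw new IllegalStateException("primary crashed"); },
                () -> "profile from replica");
        System.out.println(failover(replicas));
        System.out.println(renderPage(
                () -> "catalog",
                () -> { throw new IllegalStateException("reviews down"); }));
    }
}
```

In the first call the crash of the primary is hidden from the caller; in the second the missing reviews are visible, but the catalog still renders.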
Fault tolerance means the ability of an architecture to survive (tolerate) a misbehaving environment by taking corrective action, e.g., surviving a server crash or preventing a misbehaving API from bringing down the whole system. Fault resilience is probably the capacity to recover quickly from these types of scenarios.
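One common way to keep a misbehaving API from bringing down the whole system is a circuit breaker. Below is a deliberately simplified sketch in plain Java, assuming a fixed failure threshold and cool-down period. The class `SimpleCircuitBreaker` and its parameters are invented for this example and are not the API of Hystrix or any other library.

```java
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker for illustration only.
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long coolDownMillis;
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long coolDownMillis) {
        this.failureThreshold = failureThreshold;
        this.coolDownMillis = coolDownMillis;
    }

    // synchronized keeps the sketch simple; it also serializes all calls.
    synchronized <T> T call(Supplier<T> action, T fallback) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < coolDownMillis) {
                return fallback; // circuit open: do not touch the dependency
            }
            open = false; // cool-down elapsed: let one trial call through
        }
        try {
            T result = action.get();
            consecutiveFailures = 0; // success resets the failure count
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true; // too many failures in a row: open the circuit
                openedAt = System.currentTimeMillis();
            }
            return fallback;
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 60_000);
        for (int i = 0; i < 2; i++) {
            breaker.call(() -> { throw new IllegalStateException("API down"); },
                    "fallback");
        }
        // The circuit is now open, so this call is skipped entirely.
        System.out.println(breaker.call(() -> "live answer", "fallback"));
    }
}
```

While the circuit is open the failing dependency is not called at all, which is the "tolerate" part; the trial call after the cool-down is what lets the system recover once the dependency is healthy again.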
After further reading of the Netflix blogs and wikis, it seemed that the terms Fault Resilience and Fault Tolerance were used interchangeably.