I like the LWN article "Crash-only software" and I would like to learn more about crash-safe and fault-tolerant programming.
It is surprisingly hard to ensure that persistent state stays consistent when faults occur. I am not even talking about distributed operations; this is hard on a single node, too. Even the plain Berkeley DB (BDB Data Store or BDB Concurrent Data Store) might be left with a destroyed database if the system crashes. Not only can high-level application constraints be broken, the database might not even open correctly after the crash.
What are good resources about crash-safe and fault-tolerant designs, approaches, and programming?
If the resources focus on C++ and POSIX environments, I would appreciate that.
The key benefit of fault tolerance is to minimize or avoid the risk of systems becoming unavailable due to a component error.
To make a system fault tolerant, we need to identify the potential failures it might encounter and design counteractions for them. The frequency of each failure and its impact on the system need to be estimated to decide which failures the system should tolerate.
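As one concrete counteraction for the failure "the machine crashes in the middle of a write", a common POSIX idiom is to write the new state to a temporary file, `fsync` it, and then `rename` it over the old file. The sketch below is only an illustration under some assumptions: `atomic_replace` and `write_all` are hypothetical helper names, the temporary file lives in the same directory as the target, and the filesystem honours `fsync` on both files and directories.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Write the whole buffer, retrying on short writes and EINTR.
static bool write_all(int fd, const char* data, size_t len) {
    while (len > 0) {
        ssize_t n = ::write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        data += n;
        len -= static_cast<size_t>(n);
    }
    return true;
}

// Hypothetical helper (not from any library): replace `path` with `contents`
// so that after a crash a reader sees either the old or the new version,
// never a half-written file.
bool atomic_replace(const std::string& path, const std::string& contents) {
    // The temp file must be on the same filesystem for rename() to be atomic.
    std::string tmp = path + ".tmp.XXXXXX";
    int fd = ::mkstemp(&tmp[0]);
    if (fd < 0) return false;

    // Data must reach stable storage before the rename makes it visible.
    bool ok = write_all(fd, contents.data(), contents.size()) && ::fsync(fd) == 0;
    if (::close(fd) != 0) ok = false;
    if (ok && std::rename(tmp.c_str(), path.c_str()) != 0) ok = false;
    if (!ok) {
        ::unlink(tmp.c_str());
        return false;
    }

    // fsync the containing directory so the rename itself survives a crash.
    std::string::size_type slash = path.find_last_of('/');
    std::string dir = (slash == std::string::npos) ? "."
                    : (slash == 0)                 ? "/"
                                                   : path.substr(0, slash);
    int dfd = ::open(dir.c_str(), O_RDONLY | O_DIRECTORY);
    if (dfd < 0) return false;
    bool dir_ok = ::fsync(dfd) == 0;
    ::close(dfd);
    return dir_ok;
}
```

A call such as `atomic_replace("state.txt", "v2")` would then either leave the previous `state.txt` untouched or replace it completely. This only protects a single file; keeping several files or application-level invariants consistent still needs a journal or a transactional store.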
Erlang is a functional programming language with its own runtime environment. It was designed with integrated support for concurrency, distribution and fault tolerance.
Fault tolerance refers to the ability of a system (computer, network, cloud cluster, etc.) to continue operating without interruption when one or more of its components fail.
Akka is a framework for Java and Scala that is written with let-it-crash in mind. See this article and this presentation for an introduction to Actors and let-it-crash. The let-it-crash approach is also called fail-fast or worker/supervisor style.
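To give a feel for the worker/supervisor style in a C++ and POSIX setting, here is a minimal sketch that uses plain processes instead of actors. It is not how Akka or Erlang implement supervision; `supervise` is a hypothetical helper, and passing the attempt number to the worker is just a device to keep the example small. The worker contains no recovery code at all: if anything goes wrong it simply dies, and the supervisor restarts it a bounded number of times.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `worker` in a child process and restart it
// whenever it crashes or exits with a non-zero status.
// Returns the number of restarts that were needed, or -1 if the worker
// still failed after `max_restarts` restarts (or fork/waitpid failed).
int supervise(const std::function<int(int)>& worker, int max_restarts) {
    for (int attempt = 0; attempt <= max_restarts; ++attempt) {
        pid_t pid = ::fork();
        if (pid < 0) return -1;
        if (pid == 0) {
            // Child: do the work and leave. _exit avoids running the
            // parent's atexit handlers or flushing its stdio buffers twice.
            ::_exit(worker(attempt));
        }

        // Parent: wait for the child, retrying if a signal interrupts us.
        int status = 0;
        while (::waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) return -1;
        }

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            return attempt;  // clean exit, nothing more to do
        }
        std::fprintf(stderr, "worker failed on attempt %d, restarting\n", attempt);
    }
    return -1;  // gave up; a real supervisor would escalate here
}
```

For example, `supervise([](int attempt) { if (attempt < 2) std::abort(); return 0; }, 5)` restarts the worker twice before it succeeds. Because each worker is a separate process, a crash cannot corrupt the supervisor's memory, which is the same isolation property that actors give you in Erlang or Akka.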
Two good presentations on Erlang are Systems that Never Stop (and Erlang) and Message Passing Concurrency in Erlang.
Theron is an actor library for C++, and I think there is also something in Boost.
Erlang can also call C or C++ code; see this for a discussion. Java / Scala / Akka can also call C++ code.
(If you like C++, I suggest you have a look at Scala. It is a very nice language and a better fit than Java if you come from C++.)
Jonas Bonér's presentation Scalability, Availability & Stability Patterns is also a good presentation on the topic.
The Actor model in languages such as Erlang and Scala follows the let-it-crash model. See this article.