
Fail fast or fail safe?

Tags: architecture

I'm working on a small project: a console program intended to be run in the background by a larger product.

The program is supposed to talk to the main product (IP21) on one side and act as a server handling several clients on the other.

I've started working on the architecture and came up with a design based on a reactor that handles events such as incoming connections or events generated by the main product. Client handling is done in separate threads, one per client.
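To make the design concrete, here is a minimal sketch of what I mean, written in Python purely for illustration. It assumes events arrive on an in-memory queue; the names `Reactor`, `serve_client`, `demo` and the event kinds `"connection"`, `"product"` and `"stop"` are hypothetical, not part of the real program or of IP21.

```python
import queue
import threading


class Reactor:
    """Single-threaded event loop: pulls events off a queue and dispatches them."""

    def __init__(self):
        self.events = queue.Queue()
        self.handlers = {}

    def register(self, kind, handler):
        self.handlers[kind] = handler

    def post(self, kind, payload=None):
        self.events.put((kind, payload))

    def run(self):
        while True:
            kind, payload = self.events.get()
            if kind == "stop":  # hypothetical shutdown event
                return
            self.handlers[kind](payload)


def serve_client(client_id, log):
    # Placeholder for the real per-client conversation.
    log.append(f"served {client_id}")


def demo():
    log = []
    workers = []
    reactor = Reactor()

    def on_connection(client_id):
        # One thread per client, as described above.
        worker = threading.Thread(target=serve_client, args=(client_id, log))
        worker.start()
        workers.append(worker)

    def on_product_event(data):
        log.append(f"product event {data}")

    reactor.register("connection", on_connection)
    reactor.register("product", on_product_event)

    reactor.post("connection", "client-1")
    reactor.post("product", "tag-changed")
    reactor.post("stop")
    reactor.run()  # the reactor runs in the main thread here

    for worker in workers:
        worker.join()
    return sorted(log)
```

In this sketch the reactor lives in the main thread, which is the layout I am arguing for; my colleague's variant would start `reactor.run` in its own thread instead.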

But a colleague of mine and I disagree on this architecture. He says I should move the reactor, and the few other components currently running in the main thread, into a separate thread, so that the main thread stays as simple as possible. The idea is that the program then doesn't crash if this part does. He says it's better to have a completely non-functional program than a violent crash.

I say it's better to fail fast. If this (critical) part of the program crashes, there is no reason to try to keep the program alive. Moreover, I believe it can cause trouble for the user: he will notice something is wrong, but if he looks at the task list (our product has a kind of task manager that lists the tasks supposed to be running and makes it easy to spot a crashed one) he won't see that the program has crashed!

I hope you can help us by giving some arguments to one side or the other ;)

Edit: thanks for your answers, but what we disagree on is whether it is useful to put the reactor and a few other components in a separate thread to guard against serious programming-related problems (a segfault, a deadlock, <insert critical problem here>). I think it would be both dangerous and pointless to keep the program running without this thread.
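The two positions can be sketched roughly as follows, again in Python and only as an illustration. The function names (`run_reactor`, `exit_code`) and the simulated failure are assumptions made for the example; the real program is not structured exactly like this.

```python
import threading


def run_reactor(state):
    # Stand-in for the reactor thread; the bug is simulated.
    try:
        raise RuntimeError("simulated bug in the reactor")
    except Exception as exc:
        # Record the failure so the main thread can decide what to do.
        state["error"] = exc


def exit_code(fail_fast):
    """Return the exit code the process would end with under each policy."""
    state = {"error": None}
    reactor_thread = threading.Thread(target=run_reactor, args=(state,))
    reactor_thread.start()
    reactor_thread.join()

    if state["error"] is None:
        return 0
    if fail_fast:
        # Fail fast: report the crash with a non-zero exit code, so a
        # task list watching the process sees that it died.
        return 1
    # Fail safe: swallow the error; the process looks healthy from the
    # outside but no longer does any useful work.
    return 0
```

With `exit_code(True)` the failure becomes visible to whatever supervises the process, while `exit_code(False)` hides it, which is exactly my concern with the task list. Note that this only covers failures that surface as exceptions: a segfault normally takes down the whole process whichever thread hits it, and a deadlocked thread reports nothing at all.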

asked Aug 05 '10 10:08 by f4.


1 Answer

Use Proactor pattern :)

Fail safe. But it depends on the tasks (critical or not) and on user loyalty. Stability matters: you lose either one user or all of them.

answered Oct 27 '22 01:10 by garik