Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Erlang: Best way for a singleton gen_server in erlang cluster?

Setting:

I want to start a unique global registered gen_server process in an erlang cluster. If the process is stopped or the node running it goes down, the process is to be started on one of the other nodes.

The process is part of a supervisor. The problem is that starting the supervisor on a second node fails because the gen_server is already running and registerd globally from the first node.

Question(s):

  • Is it ok to check if the process is already globally registered inside gen_server's start_link function and in this case return {ok, Pid} of the already running process instead of launching a new gen_server instance?
  • Is it correct, that this way the one process would be part of multiple supervisors and if the one process goes down all supervisors on all other nodes would try to restart the process. The first supervisor would create a new gen_server process and the other supervisors would all link to that one process again.
  • Should I use some sort of global:trans() inside the gen_server's start_link function?

Example Code:


start_link() ->
    global:trans({?MODULE, ?MODULE}, fun() ->
        case gen_server:start_link({global, ?MODULE}, ?MODULE, [], []) of
            {ok, Pid} -> 
                {ok, Pid};
            {error, {already_started, Pid}} ->  
                link(Pid), 
                {ok, Pid};
            Else -> Else
        end     
    end).

like image 436
Rumpelstilz Avatar asked Dec 16 '10 09:12

Rumpelstilz


2 Answers

If you return {ok, Pid} of something you don't link to it will confuse a supervisor that relies on the return value. If you're not going to have a supervisor use this as a start_link function you can get away with it.

Your approach seems like it should work as each node will try to start a new instance if the global one dies. You may find that you need to increase the MaxR value in your supervisor setup as you'll get process messages every time the member of the cluster changes.

One way I've created global singletons in the past is to run the process on all the nodes, but have one of them (the one that wins the global registration race) be the master. The other processes monitor the master and when the master exits, try to become the master. (And again, if they don't win the registration race then they monitor the pid of the one that did). If you do this, you have to handle the global name registration yourself (i.e. don't use the gen_server:start({global, ... functionality) because you want the process to start whether or not it wins the registration, it will simply behave differently in each case.

The process itself must be more complicated (it has to run in both master and non-master modes), but it stabilizes quickly and doesn't produce a lot of log spam with supervisor start attempts.

My method usually requires a few rounds of revision to shake out the corner cases, but is to my mind less hassle than writing an OTP Distributed Application. This method has another advantage over distributed applications in that you don't have to statically configure the list of nodes involved in your cluster - any node can be a candidate for running the master copy of the process. Your approach has this same property.

like image 182
archaelus Avatar answered Nov 15 '22 19:11

archaelus


How about turning the gen_server into an application and using distributed applications?

like image 37
puzza007 Avatar answered Nov 15 '22 19:11

puzza007