Erlang course concurrency exercise: Can my answer be improved?

Question

I am doing this exercise from the erlang.org course:

2) Write a function which starts N processes in a ring, and sends a message M times around all the processes in the ring. After the messages have been sent the processes should terminate gracefully.

Here's what I've come up with:

-module(ring).
-export([start/2, node/2]).

node(NodeNumber, NumberOfNodes) ->
  NextNodeNumber = (NodeNumber + 1) rem NumberOfNodes,
  NextNodeName = node_name(NextNodeNumber),
  receive
    CircuitNumber ->
      io:format("Node ~p Circuit ~p~n", [NodeNumber, CircuitNumber]),
      LastNode = NodeNumber =:= NumberOfNodes - 1,
      NextCircuitNumber = case LastNode of
                           true ->
                             CircuitNumber - 1;
                           false ->
                             CircuitNumber
                         end,
      if
        NextCircuitNumber > 0 ->
          NextNodeName ! NextCircuitNumber;
        true ->
          ok
      end,
      if
        CircuitNumber > 1 ->
          node(NodeNumber, NumberOfNodes);
        true ->
          ok
      end
  end.

start(NumberOfNodes, NumberOfCircuits) ->
  lists:foreach(fun(NodeNumber) ->
                    register(node_name(NodeNumber),
                             spawn(ring, node, [NodeNumber, NumberOfNodes]))
                end,
                lists:seq(0, NumberOfNodes - 1)),
  node_name(0) ! NumberOfCircuits,
  ok.

node_name(NodeNumber) ->
  list_to_atom(lists:flatten(io_lib:format("node~w", [NodeNumber]))).

Here is its output:

17> ring:start(3, 2).
Node 0 Circuit 2
ok
Node 1 Circuit 2
Node 2 Circuit 2
Node 0 Circuit 1
Node 1 Circuit 1
Node 2 Circuit 1

If I actually knew Erlang, would could I do differently to improve this code? And specifically:

Is there any alternative to specifying a do-nothing "true" clause in the last two if statements?
Am I indeed terminating gracefully? Is any special action required when ending a process which was registered?

irritate · Accepted Answer

Welcome to Erlang! I hope you enjoy it as much as I do.

Is there any alternative to specifying a do-nothing "true" clause in the last two if statements?

You can just leave these off. I ran your code with this:

if NextCircuitNumber > 0 ->
  NextNodeName ! NextCircuitNumber
end,
if CircuitNumber > 1 ->
  node(NodeNumber, NumberOfNodes)
end

and it worked for me.

Am I indeed terminating gracefully? Is any special action required when ending a process which was registered?

Yes you are. You can verify this by running the i(). command. This will show you the list of processes, and if your registered processes weren't terminating, you would see alot of your registered processes left over like node0, node1, etc. You also would not be able to run your program a second time, because it would error trying to register an already registered name.

As far as other things you could do to improve the code, there is not much because your code is basically fine. One thing I might do is leave off the NextNodeName variable. You can just send a message directly to node_name(NextNodeNumber) and that works.

Also, you could probably do a bit more pattern matching to improve things. For example, one change I made while playing with your code was to spawn the processes by passing in the number of the last node (NumberOfNodes - 1), rather than passing the NumberOfNodes. Then, I could pattern match in my node/2 function header like this

node(LastNode, LastNode) ->
    % Do things specific to the last node, like passing message back to node0
    % and decrementing the CircuitNumber
node(NodeNumber, LastNode) ->
    % Do things for every other node.

That allowed me to clean up some of the case and if logic in your node function and make it all a little tidier.

Hope that helps, and good luck.

I GIVE CRAP ANSWERS · Answer

Lets walk through the code:

-module(ring).
-export([start/2, node/2]).

The name node is one I avoid because a node() in Erlang has the connotation of an Erlang VM running on some machine - usually several nodes run on several machines. I'd rather call it ring_proc or something such.

node(NodeNumber, NumberOfNodes) ->
   NextNodeNumber = (NodeNumber + 1) rem NumberOfNodes,
   NextNodeName = node_name(NextNodeNumber),

This is what we are trying to spawn, and we get a number to the next node and the name of the next node. Lets look at node_name/1 as an interlude:

node_name(NodeNumber) ->
   list_to_atom(lists:flatten(io_lib:format("node~w", [NodeNumber]))).

This function is a bad idea. You will be needing a local name which needs to be an atom, so you created a function that can create arbitrary such names. The warning here is that the atom table is not garbage collected and limited, so we should avoid it if possible. The trick to solve this problem is to pass the pids instead and build the ring in reverse. The final process will then tie the knot of the ring:

mk_ring(N) ->
  Pid = spawn(fun() -> ring(none) end),
  mk_ring(N, Pid, Pid).

mk_ring(0, NextPid, Initiator) ->
   Initiator ! {set_next, NextPid},
   Initiator;
mk_ring(N, NextPid, Initiator) ->
   Pid = spawn(fun() -> ring(NextPid) end),
   mk_ring(N-1, Pid, Initiator).

And then we can rewrite your start function:

start(NumberOfNodes, NumberOfCircuits) ->
  RingStart = mk_ring(NumberOfNodes)
  RingStart ! {operate, NumberOfCircuits, self()},
  receive
    done ->
        RingStart ! stop
  end,
  ok.

The Ring code is then something along the lines of:

ring(NextPid) ->
  receive
    {set_next, Pid} ->
        ring(Pid);
    {operate, N, Who} ->
        ring_ping(N, NextPid),
        Who ! done,
        ring(NextPid);
    ping ->
        NextPid ! ping,
        ring(NextPid);
    stop ->
        NextPid ! stop,
        ok
  end.

And to fire something around the ring N times:

ring_ping(0, _Next) -> ok;
ring_ping(N, Next) ->
  Next ! ping
  receive
    ping ->
      ring_ping(N-1, Next)
  end.

(None of this code has been tested by the way, so it may very well be quite wrong).

As for the rest of your code:

receive
  CircuitNumber ->
    io:format("Node ~p Circuit ~p~n", [NodeNumber, CircuitNumber]),

I'd tag the CircuitNumber with some atom: {run, CN}.

  LastNode = NodeNumber =:= NumberOfNodes - 1,
  NextCircuitNumber = case LastNode of
                       true ->
                         CircuitNumber - 1;
                       false ->
                         CircuitNumber
                     end,

This can be done with an if:

  NextCN = if NodeNumber =:= NumberOfNodes - 1 -> CN -1;
              NodeNumber =/= NumberOfNodes - 1 -> CN
           end,

The next part here:

  if
    NextCircuitNumber > 0 ->
      NextNodeName ! NextCircuitNumber;
    true ->
      ok
  end,
  if
    CircuitNumber > 1 ->
      node(NodeNumber, NumberOfNodes);
    true ->
      ok
  end

does need the true case, unless you never hit it. The process will crash if nothing matches in the if. It is often possible to rewire the code as to not rely that much on counting constructions, like the above code of mine hints.

Several trouble can be avoided with this code. One problem with the current code is that if something crashes in the ring, it gets broken. We can use spawn_link rather than spawn to link the ring together, so such errors will destroy the whole ring. Furthermore our ring_ping function will crash if it is sent a message while the ring is operating. This can be alleviated, the simplest way is probably to alter the state of the ring process such that it knows it is currently operating and fold ring_ping into ring. Finally, we should probably also link the initial spawn so we don't end up with a large ring that are live but no-one has a reference to. Perhaps we could register the initial process so it is easy to grab hold of the ring later.

The start function is also bad in two ways. First, we should use make_ref() to tag a unique message along and receive the tag, so another process can't be sinister and just send done to the start-process while the ring works. We should probably also add a monitor on the ring, while it is working. Otherwise we will never be informed, should be ring crash while we are waiting for the done message (with tag). OTP does both in its synchronous calls by the way.

Finally, finally: No, you don't have to clean up a registration.

David Weldon · Answer

My colleagues have made some excellent points. I'd also like to mention that the initial intent of the problem is being avoided by registering the processes instead of actually creating a ring. Here is one possible solution:

-module(ring).
-export([start/3]).
-record(message, {data, rounds, total_nodes, first_node}).

start(TotalNodes, Rounds, Data) ->
    FirstNode = spawn_link(fun() -> loop(1, 0) end),
    Message = #message{data=Data, rounds=Rounds, total_nodes=TotalNodes,
                       first_node=FirstNode},
    FirstNode ! Message, ok.

loop(Id, NextNode) when not is_pid(NextNode) ->
    receive
        M=#message{total_nodes=Total, first_node=First} when Id =:= Total ->
            First ! M,
            loop(Id, First);
        M=#message{} ->
            Next = spawn_link(fun() -> loop(Id+1, 0) end),
            Next ! M,
            loop(Id, Next)
    end;
loop(Id, NextNode) ->
    receive
        M=#message{rounds=0} ->
            io:format("node: ~w, stopping~n", [Id]),
            NextNode ! M;
        M=#message{data=D, rounds=R, total_nodes=Total} ->
            io:format("node: ~w, message: ~p~n", [Id, D]),
            if Id =:= Total -> NextNode ! M#message{rounds=R-1};
               Id =/= Total -> NextNode ! M
            end,
            loop(Id, NextNode)
    end.

This solution uses records. If you are unfamiliar with them, read all about them here.

Each node is defined by a loop/2 function. The first clause of loop/2 deals with creating the ring (the build phase), and the second clause deals with printing the messages (the data phase). Notice that all clauses end in a call to loop/2 except the rounds=0 clause, which indicates that the node is done with its task and should die. This is what is meant by graceful termination. Also note the hack used to tell the node that it's in the build phase - NextNode isn't a pid but rather an integer.

Erlang course concurrency exercise: Can my answer be improved?

Tags:

concurrency

erlang

Wayne Conrad

3 Answers

irritate

I GIVE CRAP ANSWERS

David Weldon

Recent Activity

Donate For Us

Erlang course concurrency exercise: Can my answer be improved?

Tags:

concurrency

erlang

Wayne Conrad

3 Answers

irritate

I GIVE CRAP ANSWERS

David Weldon

Related questions

Recent Activity

Donate For Us