
How to maintain state in Erlang?

Tags:

erlang

I have seen people use dict, orddict, and records to maintain state in many blogs that I have read. It seems like a vital concept.

Generally I understand the meaning of maintaining state and recursion, but when it comes to Erlang I am a little vague about how it is handled.

Any help?

HIRA THAKUR asked Nov 03 '14 11:11


3 Answers

State is the present arrangement of data. It is sometimes hard to remember this for two reasons:

  • State means both the data in the program and the program's current point of execution and "mode".
  • We build this up to be some magical thing unnecessarily.

Consider this:

"What is the process's state?" is asking about the present value of variables.

"What state is the process in?" usually refers to the mode, options, flags or present location of execution.

If you are a Turing machine then these are the same question; we have separated the ideas to give us handy abstractions to build on (like everything else in programming).

Let's think about state variables for a moment...

In many older languages you can alter state variables from whatever context you like, whether the modification of state is appropriate or not, because you manage this directly. In more modern languages this is a bit more restricted by imposing type declarations, scoping rules and public/private context to variables. This is really a rules arms-race, each language finding more ways to limit when assignment is permitted. If scheduling is the Prince of Frustration in concurrent programming, assignment is the Devil Himself. Hence the various cages built to manage him.

Erlang restricts assignment in a different way. The basic rule is that a variable can be assigned only once per entry to a function, that functions are themselves the sole definition of procedural scope, and that all state is purely encapsulated by the executing process. (Think about the statement on scope to understand why many people feel that Erlang macros are a bad thing.)

These rules on assignment (use of state variables) encourage you to think of state as discrete slices of time. Every entry to a function starts with a clean slate, whether the function is recursive or not. This is a fundamentally different situation than the ongoing chaos of in-place modifications made from anywhere to anywhere in most other languages. In Erlang you never ask "what is the value of X right now?" because it can only ever be what it was initially assigned to be in the context of the current run of the current function. This significantly limits the chaos of state changes within functions and processes.
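A minimal sketch of what single assignment looks like in practice. The module and function names here (single_assignment, count_down/1, rebind/1) are hypothetical and only serve the illustration:

```erlang
-module(single_assignment).
-export([count_down/1, rebind/1]).

%% Each call binds its own N exactly once; nothing is modified in place.
count_down(0) ->
    done;
count_down(N) when N > 0 ->
    Next = N - 1,        % a fresh variable, bound once
    count_down(Next).    % the "changed" value only exists in the next call

%% Once X is bound, `X = Y` is a match (an assertion), not a re-assignment.
rebind(Y) ->
    X = 1,
    try X = Y of
        _ -> same                               % Y was 1, so the match succeeded
    catch
        error:{badmatch, Value} -> {badmatch, Value}
    end.
```

Calling single_assignment:rebind(1) returns same, while single_assignment:rebind(2) returns {badmatch, 2}: the second "assignment" to X is only checked against the value X already has.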

The details of those state variables and how they are assigned are incidental to Erlang. You already know about lists, tuples, ETS, DETS, Mnesia, db connections, etc. Whatever. The core idea to understand about Erlang's style is how assignment is managed, not the incidental details of this or that particular data type.

What about "modes" and execution state?

If we write something like:

has_cheeseburger(BurgerName) ->
  receive
    {From, ask, burger_name} ->
        From ! {ok, BurgerName},
        has_cheeseburger(BurgerName);
    {From, new_burger, _SomeBurger} ->
        From ! {error, already_have_a_burger},
        has_cheeseburger(BurgerName);
    {From, eat_burger} ->
        From ! {ok, {ate, BurgerName}},
        lacks_cheeseburger()
  end.

lacks_cheeseburger() ->
  receive
    {From, ask, burger_name} ->
        From ! {error, no_burger},
        lacks_cheeseburger();
    {From, new_burger, BurgerName} ->
        From ! {ok, thanks},
        has_cheeseburger(BurgerName);
    {From, eat_burger} ->
        From ! {error, no_burger},
        lacks_cheeseburger()
  end.

What are we looking at? A loop. Conceptually it's just one loop. Quite often a programmer would choose to write just one loop in code, add an argument like IsHoldingBurger to the loop, and check it after each message in the receive clause to determine what action to take.
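For comparison, the single-loop alternative with a flag might look roughly like the sketch below. The module name burger_flag, the start/0 helper and the use of undefined as a placeholder name are assumptions made for illustration; the messages and replies mirror the two-function version.

```erlang
-module(burger_flag).
-export([start/0]).

%% Start the process without a burger (hypothetical helper).
start() ->
    spawn(fun() -> loop(false, undefined) end).

%% One loop; the mode is carried in IsHoldingBurger and tested per message.
loop(IsHoldingBurger, BurgerName) ->
    receive
        {From, ask, burger_name} ->
            case IsHoldingBurger of
                true  -> From ! {ok, BurgerName};
                false -> From ! {error, no_burger}
            end,
            loop(IsHoldingBurger, BurgerName);
        {From, new_burger, NewName} ->
            case IsHoldingBurger of
                true ->
                    From ! {error, already_have_a_burger},
                    loop(true, BurgerName);
                false ->
                    From ! {ok, thanks},
                    loop(true, NewName)
            end;
        {From, eat_burger} ->
            case IsHoldingBurger of
                true ->
                    From ! {ok, {ate, BurgerName}},
                    loop(false, undefined);
                false ->
                    From ! {error, no_burger},
                    loop(false, undefined)
            end
    end.
```

Every clause has to repeat the same case test on the flag, which is the verbosity the two-function version avoids.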

Above, though, the idea of two operating modes is both more explicit (it's baked into the structure, not arbitrary procedural tests) and less verbose. We have separated the context of execution by writing basically the same loop twice, once for each condition we might be in, either having a burger or lacking one. This is at the heart of how Erlang deals with a concept called "finite state machines" and it's really useful. OTP includes a tool built around this idea in the gen_fsm module. You can write your own FSMs by hand as I did above or use gen_fsm -- either way, when you identify that you have a situation like this, writing code in this style makes reasoning much easier. (For anything but the most trivial FSM you will really appreciate gen_fsm.)
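As a rough sketch of the OTP route, here is the same two-mode burger process written against gen_statem, which recent OTP releases provide as the successor to gen_fsm (so this is a swapped-in module, not the gen_fsm API named above). The module name burger_statem and the client functions ask/1, new_burger/2 and eat_burger/1 are hypothetical; it assumes an OTP version where terminate/3 and code_change/4 are optional callbacks.

```erlang
-module(burger_statem).
-behaviour(gen_statem).

%% Client API (hypothetical names)
-export([start_link/0, ask/1, new_burger/2, eat_burger/1]).
%% gen_statem callbacks; one function per operating mode
-export([callback_mode/0, init/1, lacks_cheeseburger/3, has_cheeseburger/3]).

start_link()          -> gen_statem:start_link(?MODULE, [], []).
ask(Pid)              -> gen_statem:call(Pid, burger_name).
new_burger(Pid, Name) -> gen_statem:call(Pid, {new_burger, Name}).
eat_burger(Pid)       -> gen_statem:call(Pid, eat_burger).

callback_mode() -> state_functions.

%% Start in the lacks_cheeseburger mode with no burger name as data.
init([]) -> {ok, lacks_cheeseburger, undefined}.

lacks_cheeseburger({call, From}, burger_name, _Data) ->
    {keep_state_and_data, [{reply, From, {error, no_burger}}]};
lacks_cheeseburger({call, From}, {new_burger, Name}, _Data) ->
    {next_state, has_cheeseburger, Name, [{reply, From, {ok, thanks}}]};
lacks_cheeseburger({call, From}, eat_burger, _Data) ->
    {keep_state_and_data, [{reply, From, {error, no_burger}}]}.

has_cheeseburger({call, From}, burger_name, Name) ->
    {keep_state_and_data, [{reply, From, {ok, Name}}]};
has_cheeseburger({call, From}, {new_burger, _Other}, _Name) ->
    {keep_state_and_data, [{reply, From, {error, already_have_a_burger}}]};
has_cheeseburger({call, From}, eat_burger, Name) ->
    {next_state, lacks_cheeseburger, undefined,
     [{reply, From, {ok, {ate, Name}}}]}.
```

The shape is the same as the hand-written version: each mode is its own function, and switching modes is expressed by the next_state return value instead of a direct call to the other loop.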

Conclusion

That's it for state handling in Erlang. The chaos of untamed assignment is rendered impotent by the basic rules of single-assignment and absolute data encapsulation within each process (this implies that you shouldn't write gigantic processes, by the way). The supremely useful concept of a limited set of operating modes is abstracted by the OTP module gen_fsm or can be rather easily written by hand.

Since Erlang does such a good job limiting the chaos of state within a single process and makes the nightmare of concurrent scheduling among processes entirely invisible, that only leaves one complexity monster: the chaos of interactions among loosely coupled actors. In the mind of an Erlanger this is where the complexity belongs. The hard stuff should generally wind up manifesting there, in the no-man's-land of messages, not within functions or processes themselves. Your functions should be tiny, your needs for procedural checking relatively rare (compared to C or Python), your need for mode flags and switches almost nonexistent.

Edit

To reiterate Pascal's answer, in a super limited way:

loop(State) ->
  receive
    {async, Message} ->
        NewState = do_something_with(Message),
        loop(NewState);
    {sync, From, Message} ->
        NewState = do_something_with(Message),
        Response = process_some_response_on(NewState),
        From ! {ok, Response},
        loop(NewState);
    shutdown ->
        exit(shutdown);
    Any ->
        io:format("~p: Received: ~tp~n", [self(), Any]),
        loop(State)
  end.

Re-read tkowal's response for the most minimal version of this. Re-read Pascal's for an expansion of the same idea to include servicing messages. Re-read the above for a slightly different style of the same pattern of state handling, with the addition of outputting unexpected messages. Finally, re-read the two-state loop I wrote above and you'll see it's actually just another expansion on this same idea.

Remember, you can't re-assign a variable within the same iteration of a function but the next call can have different state. That is the extent of state handling in Erlang.

These are all variations on the same thing. I think you're expecting there to be something more, a more expansive mechanism or something. There is not. Restricting assignment eliminates all the stuff you're probably used to seeing in other languages. In Python you do somelist.append(NewElement) and the list you had has now changed. In Erlang you do NewList = lists:append(SomeList, [NewElement]) and SomeList is still exactly the same as it used to be, and a new list has been returned that includes the new element. Whether this actually involves copying in the background is not your problem. You don't handle those details, so don't think about them. This is how Erlang is designed, and that leaves single assignment and making fresh function calls to enter a fresh slice of time where the slate has been wiped clean again.
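A small sketch of that difference, using a hypothetical module immutable_list. Note that lists:append/2 takes two lists, so the new element is wrapped in a list; prepending with [Element | List] is the more common idiom.

```erlang
-module(immutable_list).
-export([demo/0]).

demo() ->
    SomeList  = [a, b],
    Appended  = lists:append(SomeList, [c]),  % same as SomeList ++ [c]
    Prepended = [z | SomeList],               % cheap: reuses SomeList as the tail
    %% SomeList is untouched by either operation.
    {SomeList, Appended, Prepended}.
```

immutable_list:demo() returns {[a,b],[a,b,c],[z,a,b]}: both new lists exist alongside the original, which never changed.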

zxq9 answered Sep 28 '22 17:09


The easiest way to maintain state is to use the gen_server behaviour. You can read more in Learn You Some Erlang and in the docs.

A gen_server is a process that:

  • is initialised with a given state,
  • can define synchronous and asynchronous callbacks (synchronous for querying the data in "request-response" style, asynchronous for changing the state in "fire and forget" style)

It also comes with a couple of nice OTP mechanisms:

  • it can be supervised
  • it gives you basic logging
  • its code can be upgraded while the server is running without losing the state
  • and so on...

Conceptually gen_server is an endless loop, that looks like this:

loop(State) ->
    NewState = handle_requests(State),
    loop(NewState).

where handle_requests receives messages. This way all requests are serialised, so there are no race conditions on the state. Of course the real thing is a little more complicated, in order to give you all the goodies I described.

You can choose what data structure you want to use for State. It is common to use records, because they have named fields, but since Erlang/OTP 17 maps can come in handy too. It depends on what you want to store.
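A minimal sketch of such a server, keeping its state in a map. The module name item_server and the functions add/2 and items/1 are hypothetical, and it assumes an OTP version where handle_info/2 and terminate/2 are optional callbacks:

```erlang
-module(item_server).
-behaviour(gen_server).

%% Client API (hypothetical names)
-export([start_link/0, add/2, items/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link()   -> gen_server:start_link(?MODULE, [], []).
add(Pid, Item) -> gen_server:cast(Pid, {add, Item}).   % asynchronous, fire and forget
items(Pid)     -> gen_server:call(Pid, items).         % synchronous, request-response

%% The initial state is a map with a single key.
init([]) ->
    {ok, #{items => []}}.

%% Query: reply with the data and pass the state on unchanged.
handle_call(items, _From, State = #{items := Items}) ->
    {reply, Items, State}.

%% Update: return a new map; the old one is never modified.
handle_cast({add, Item}, State = #{items := Items}) ->
    {noreply, State#{items := [Item | Items]}}.
```

The third element of each return tuple plays the role of NewState in the conceptual loop: gen_server hands it back to the next callback invocation.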

tkowal answered Sep 28 '22 18:09


Variables are not mutable, so when you want the state to evolve, you create a new variable and then call the same function again with this new state as its parameter.

This structure is meant for long-running processes such as servers. There is no base case as in the usual factorial example; generally there is a specific message to stop the server smoothly.

loop(State) ->
    receive
        {add, Item} ->
            NewState = [Item | State],      % create a new variable
            loop(NewState);                 % call loop again with the new variable
        {remove, Item} ->
            NewState = lists:filter(fun(X) -> X /= Item end, State),
            loop(NewState);
        {items, Pid} ->
            Pid ! {items, State},           % report the current state to the caller
            loop(State);
        stop ->
            stopped;                        % the stop condition: no recursive call
        _ ->
            loop(State)                     % ignoring other messages may be useful in a never-ending loop
    end.
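To try a loop like this from the shell, it can be wrapped in a module and spawned. The sketch below is one way to do that; the module name item_loop and the start/0 and demo/0 helpers are assumptions added for illustration, and the loop body is a condensed copy of the one above.

```erlang
-module(item_loop).
-export([start/0, demo/0]).

%% Spawn the loop with an empty list as the initial state.
start() ->
    spawn(fun() -> loop([]) end).

loop(State) ->
    receive
        {add, Item}    -> loop([Item | State]);
        {remove, Item} -> loop(lists:filter(fun(X) -> X /= Item end, State));
        {items, Pid}   -> Pid ! {items, State}, loop(State);
        stop           -> stopped;
        _              -> loop(State)
    end.

%% Send a few messages and read the state back.
demo() ->
    Pid = start(),
    Pid ! {add, apple},
    Pid ! {add, pear},
    Pid ! {remove, apple},
    Pid ! {items, self()},
    Items = receive {items, L} -> L after 1000 -> timeout end,
    Pid ! stop,
    Items.
```

item_loop:demo() returns [pear]. The messages are handled in the order they were sent, so the state seen by the items request already reflects both additions and the removal.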
Pascal answered Sep 28 '22 16:09