
How do you structure a stateful module in Haskell?

Tags: haskell

I'm looking to write a generic module that allows Haskell programs to interact with Cassandra. The module will need to maintain its own state. For example, it will have a connection pool and a list of callbacks to be invoked when a new record is saved. How should I structure the code so that this module can maintain its state? Here are some of the approaches I've been considering. Am I on the right track? (I'm new to Haskell and still learning the best ways to think functionally.)

Option 1:

The module runs in a (StateT s IO) monad, where s is the global state for the entire program using the Cassandra module. Of course, since the Cassandra module could be used by multiple programs, the details of what's in s should be invisible to the Cassandra module. The module would have to export a type class that allowed it to extract the CassandraState from s and push a new CassandraState back into s. Then, any program using the module would have to make its main state a member of this type class.
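A minimal sketch of what Option 1 could look like, using StateT from the transformers package. Every name here (HasCassandraState, getCassandra, putCassandra, AppState, the fields of CassandraState, demo) is made up for illustration, and the "connection pool" is just a counter standing in for the real thing.

```haskell
import Control.Monad.Trans.State.Strict (StateT, gets, modify, runStateT)

-- Hypothetical stand-in for the module's private state.
data CassandraState = CassandraState
  { openConnections :: Int       -- placeholder for a real connection pool
  , callbackNames   :: [String]  -- placeholder for real callbacks
  } deriving (Show, Eq)

-- The class the module would export: any program state that can
-- hand out a CassandraState and take an updated one back.
class HasCassandraState s where
  getCassandra :: s -> CassandraState
  putCassandra :: CassandraState -> s -> s

-- A module action, polymorphic in the caller's state type.
connect :: HasCassandraState s => StateT s IO Int
connect = do
  cs <- gets getCassandra
  let n = openConnections cs + 1
  modify (putCassandra cs { openConnections = n })
  return n

-- An application state, as a caller of the module might define it.
data AppState = AppState
  { appName   :: String
  , cassandra :: CassandraState
  } deriving (Show, Eq)

instance HasCassandraState AppState where
  getCassandra = cassandra
  putCassandra cs app = app { cassandra = cs }

-- Caller side: run a module action against the whole program state.
demo :: IO (Int, AppState)
demo = runStateT connect (AppState "demo" (CassandraState 0 []))
```

The module never sees AppState directly, only the two class methods, which is the "invisible details" property described above.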

Option 2:

The module runs in a (StateT CassandraState IO) monad. Every time someone calls an action in the module, they would have to extract the CassandraState from wherever they have it stashed, invoke the action with runStateT, and then stash the resulting state again (wherever).
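A rough sketch of Option 2 from the caller's point of view, assuming StateT from the transformers package. The record fields, the Cassandra type synonym, and the bodies of connect and create are placeholders invented for this example, not a real Cassandra API.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State.Strict (StateT, gets, modify, runStateT)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical module state: a counter instead of a real pool,
-- plus the callbacks to invoke when a record is saved.
data CassandraState = CassandraState
  { openConnections :: Int
  , callbacks       :: [String -> IO ()]
  }

type Cassandra = StateT CassandraState IO

-- Modifies the state.
connect :: Cassandra Int
connect = do
  modify (\s -> s { openConnections = openConnections s + 1 })
  gets openConnections

-- Only reads the state, yet has the same kind of type as connect.
create :: String -> Cassandra ()
create key = do
  cbs <- gets callbacks
  liftIO (mapM_ ($ key) cbs)

-- Caller side: unpack the state, run an action, stash the new state.
demo :: IO (Int, [String])
demo = do
  seen <- newIORef []
  let s0 = CassandraState 0 [\k -> modifyIORef seen (k :)]
  (_, s1) <- runStateT connect s0
  (_, s2) <- runStateT (create "row-1") s1
  keys <- readIORef seen
  return (openConnections s2, keys)
```

Note that the caller has to thread s0, s1, s2 by hand after every call, whether or not the action changed anything.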

Option 3:

Don't put the Cassandra module's functions in a StateT monad at all. Instead, have the caller explicitly pass in CassandraState values when needed. The problem with Option 2 is that not all of the functions in the module will modify the state. For example, obtaining a connection will modify the state and will require the caller to stash the resulting state. Saving a new record, on the other hand, needs to read the state (to get the callbacks) but doesn't need to change it. Option 2 doesn't give the caller any hint that connect changes the state while create doesn't.

But, if I move away from using the StateT monad and just have functions that take in states as parameters and return either simple values or tuples of simple values and new states, then it's really obvious to the caller when the state needs to be saved off. (Under the covers in my module, I'd take the incoming states and build them into a (StateT CassandraState IO) monad, but the details of this would be hidden from the caller. So, to the caller, the interface is very explicit, but under the covers, it's just Option 2.)
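To make Option 3 concrete, here is a small sketch in which the exported functions take and return plain values while StateT (from the transformers package) is used only internally. The names connectS and createS, the record fields, and the idea that create returns the number of callbacks it ran are all assumptions made for this example.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.State.Strict
  (StateT, evalStateT, gets, modify, runStateT)

-- Hypothetical module state.
data CassandraState = CassandraState
  { openConnections :: Int
  , callbacks       :: [String -> IO ()]
  }

-- Internal actions, not exported.
connectS :: StateT CassandraState IO Int
connectS = do
  modify (\s -> s { openConnections = openConnections s + 1 })
  gets openConnections

createS :: String -> StateT CassandraState IO Int
createS key = do
  cbs <- gets callbacks
  liftIO (mapM_ ($ key) cbs)
  return (length cbs)

-- Exported: the type shows that a new state comes back.
connect :: CassandraState -> IO (Int, CassandraState)
connect = runStateT connectS

-- Exported: the type shows that the state is only read.
create :: String -> CassandraState -> IO Int
create key = evalStateT (createS key)

demo :: IO (Int, Int, Int)
demo = do
  let s0 = CassandraState 0 [\_ -> return (), \_ -> return ()]
  (connId, s1) <- connect s0
  ran <- create "row-1" s1
  return (connId, openConnections s1, ran)
```

Here the difference between connect and create is visible in their signatures, which is the hint Option 2 does not give.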

Option 4:

Something else?

This problem must come up quite often when building reusable modules. Is there some sort of standard way to solve it?

(By the way, if someone knows a better way to interact with Cassandra from Haskell than using Thrift, please let me know! Maybe I don't have to write this at all. :-)

Clint Miller, asked Jan 24 '11 18:01


2 Answers

One approach, following the HDBC model, is to have an explicit CassandraConnection data type that holds an MVar with some mutable state. Since all your actions are presumably in IO anyway, they can just take the CassandraConnection as an argument. The user can then pack that connection into a state or reader monad, thread it explicitly, or do whatever they want.
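A minimal sketch of the connection-handle idea, using MVar from base. The ConnState record, newConnection, onSave, and the function bodies are assumptions made for illustration; a real module would hold sockets or a pool rather than a counter.

```haskell
import Control.Concurrent.MVar
  (MVar, modifyMVar, modifyMVar_, newMVar, readMVar)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical mutable state hidden behind the handle.
data ConnState = ConnState
  { openConnections :: Int
  , callbacks       :: [String -> IO ()]
  }

-- Opaque handle: export the type but not the constructor.
newtype CassandraConnection = CassandraConnection (MVar ConnState)

newConnection :: IO CassandraConnection
newConnection = CassandraConnection <$> newMVar (ConnState 0 [])

-- Updates the state inside the MVar; the caller never sees it.
connect :: CassandraConnection -> IO Int
connect (CassandraConnection var) =
  modifyMVar var $ \s ->
    let n = openConnections s + 1
    in return (s { openConnections = n }, n)

-- Registers a callback to run when a record is saved.
onSave :: CassandraConnection -> (String -> IO ()) -> IO ()
onSave (CassandraConnection var) cb =
  modifyMVar_ var $ \s -> return s { callbacks = cb : callbacks s }

-- Only reads the state, then runs the callbacks.
create :: CassandraConnection -> String -> IO ()
create (CassandraConnection var) key = do
  s <- readMVar var
  mapM_ ($ key) (callbacks s)

demo :: IO (Int, [String])
demo = do
  conn <- newConnection
  seen <- newIORef []
  onSave conn (\k -> modifyIORef seen (k :))
  n <- connect conn
  create conn "row-1"
  keys <- readIORef seen
  return (n, keys)
```

Every action has a plain IO type, so the caller can store conn in a ReaderT environment, a record, or a local variable without the module caring.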

Internally you can use a monad or not; that's really your call. However, I favor APIs that, when possible, don't force users into any particular monad.

So this is a sort of version of option 3. But the user shouldn't really care whether or not they're changing the connection state -- at that level you can really hide the details from them.

sclv, answered Nov 15 '22 16:11


I'd go with Option 2. Users of your module shouldn't use runStateT directly; instead, you should provide an opaque Cassandra type with an instance of the Monad typeclass and some runCassandra :: Cassandra a -> IO a operation to "escape" Cassandra. The operations exported by your module should all run in the Cassandra monad (e.g. doSomethingInterestingInCassandra :: Int -> Bool -> Cassandra Char), and their definitions can access the wrapped CassandraState.

If your users need some additional state for their application, they can always wrap a monad transformer around Cassandra, e.g. StateT MyState Cassandra.
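A short sketch of this design, assuming the Cassandra monad is a newtype over StateT CassandraState IO (from the transformers package) with instances derived via GeneralizedNewtypeDeriving. The state fields, the body of connect, the initial state used by runCassandra, and connectAndLog are placeholders chosen for the example.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict
  (StateT, evalStateT, execStateT, gets, modify)

-- Hypothetical module state; a counter stands in for a real pool.
data CassandraState = CassandraState { openConnections :: Int }

-- Opaque monad: export the type, not the constructor.
newtype Cassandra a = Cassandra (StateT CassandraState IO a)
  deriving (Functor, Applicative, Monad)

-- The only way out of the Cassandra monad.
runCassandra :: Cassandra a -> IO a
runCassandra (Cassandra m) = evalStateT m (CassandraState 0)

connect :: Cassandra Int
connect = Cassandra $ do
  modify (\s -> s { openConnections = openConnections s + 1 })
  gets openConnections

-- Caller side: extra application state layered on top of Cassandra.
newtype MyState = MyState { connectionLog :: [Int] }

type App = StateT MyState Cassandra

connectAndLog :: App ()
connectAndLog = do
  n <- lift connect
  modify (\s -> s { connectionLog = n : connectionLog s })

demo :: IO [Int]
demo =
  runCassandra
    (connectionLog <$> execStateT (connectAndLog >> connectAndLog) (MyState []))
```

The caller's modify touches only MyState, while lift connect reaches the module's state, so the two layers stay separate. If module actions need IO internally, they can use it inside the Cassandra constructor without exposing it.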

Cactus, answered Nov 15 '22 16:11