Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to do complex IO processing and implicit cache in Haskell?

In bigger applications there are very often multiple layers of IO caching (Hibernate L1 and L2, Spring cache etc.) which usually are abstracted so that caller needs not to be aware that particular implementation does IO. With some caveats (scope, transactions), it allows for simpler interfaces between components.

For example, if component A needs to query database, it needs not to know whether result is already cached. It might have been retrieved by B or C which A knows nothing about, however they would usually participate in some session or transaction - often implicitly.

Frameworks tend to make this call indistinguishable from simple object method call using techniques like AOP.

Is it possible for Haskell applications to benefit like this? How would client's interface look like?

like image 829
Rumca Avatar asked Aug 21 '13 19:08

Rumca


2 Answers

In Haskell there are many ways to compose computations from components that represent their separate responsibilities. This can be done at the data level with data types and functions (http://www.haskellforall.com/2012/05/scrap-your-type-classes.html) or using type classes. In Haskell you can view every data type, type, function, signature, class, etc as an interface; as long as you have something else of the same type, you can replace a component with something that's compatible.

When we want to reason about computations in Haskell we frequently use the abstraction of a Monad. A Monad is an interface for constructing computations. A base computation can be constructed with return and these can be composed together with functions that produce other computations with >>=. When we want to add multiple responsibilities to computations represented by monads, we make monad transformers. In the code below, there are four different monad transformers that capture different aspects of a layered system:

DatabaseT s adds a database with a schema of type s. It handles data Operations by storing data in or retrieving it from the database. CacheT s intercepts data Operations for a schema s and retrieves data from memory, if it is available. OpperationLoggerT logs the Operations to standard output ResultLoggerT logs the results of Operations to standard output

These four components communicate together using a type class (interface) called MonadOperation s, which requires that components that implement it provide a way to perform an Operation and return its result.

This same type class described what is required to use the MonadOperation s system. It requires that someone using the interface provide implementations of type classes that the database and cache will rely on. There are also two data types that are part of this interface, Operation and CRUD. Notice that the interface doesn't need to know anything about the domain objects or database schema, nor does it need to know about the different monad transformers that will implement it. The monad transformers don't know anything about the schema or domain objects, and the domain objects and example code don't know anything about the monad transformers that build the system.

The only thing the example code knows is that it will have access to a MonadOperation s due to its type example :: (MonadOperation TableName m) => m ().

The program main runs the example twice in two different contexts. The first time, the program talks to the database, with its Operations and responses being logged to standard out.

Running example program once with an empty database
Operation Articles (Create (Article {title = "My first article", author = "Cirdec", contents = "Lorem ipsum dolor sit amet."}))
    ArticleId 0
Operation Articles (Read (ArticleId 0))
    Just (Article {title = "My first article", author = "Cirdec", contents = "Lorem ipsum dolor sit amet."})
Operation Articles (Read (ArticleId 0))
    Just (Article {title = "My first article", author = "Cirdec", contents = "Lorem ipsum dolor sit amet."})

The second run logs the responses the program receives, passes Operations through the cache, and logs the requests before they reach the database. Due to the new caching, which is transparent to the program, the requests to read the article never happen, but the program still receives a response:

Running example program once with an empty cache and an empty database
Operation Articles (Create (Article {title = "My first article", author = "Cirdec", contents = "Lorem ipsum dolor sit amet."}))
    ArticleId 0
    Just (Article {title = "My first article", author = "Cirdec", contents = "Lorem ipsum dolor sit amet."})
    Just (Article {title = "My first article", author = "Cirdec", contents = "Lorem ipsum dolor sit amet."})

Here's the entire source code. You should think of it as four independent pieces of code: A program written for our domain, starting at example. An application that is the complete assembly of the program, the domain of discourse, and the various tools that build it, starting at main. The next two sections, ending with the schema TableName, describe a domain of blog posts; their only purpose is to illustrate how the other components go together, not to serve as an example for how to design data structures in Haskell. The next section describes a small interface by which components could communicate about data; it's not necessarily a good interface. Finally, the remainder of the source code implements the loggers, database, and caches that are composed together to form the application. In order to decouple the tools and interface from the domain, there are some somewhat hideous tricks with typeable and dynamics in here, this isn't meant to demonstrate a good way to handle casting and generics either.

{-# LANGUAGE StandaloneDeriving, GADTs, DeriveDataTypeable, FlexibleInstances, FlexibleContexts, GeneralizedNewtypeDeriving, MultiParamTypeClasses, ScopedTypeVariables,  KindSignatures, FunctionalDependencies, UndecidableInstances #-}

module Main (
    main
) where

import Data.Typeable
import qualified Data.Map as Map
import Control.Monad.State
import Control.Monad.State.Class
import Control.Monad.Trans
import Data.Dynamic

-- Example

example :: (MonadOperation TableName m) => m ()
example =
    do
        id <- perform $ Operation Articles $ Create $ Article {
            title = "My first article",
            author = "Cirdec",
            contents = "Lorem ipsum dolor sit amet."
        }
        perform $ Operation Articles $ Read id
        perform $ Operation Articles $ Read id
        cid <- perform $ Operation Comments $ Create $ Comment {
            article = id,
            user = "Cirdec",
            comment = "Commenting on my own article!"
        }

        perform $ Operation Equality $ Create False
        perform $ Operation Equality $ Create True
        perform $ Operation Inequality $ Create True
        perform $ Operation Inequality $ Create False

        perform $ Operation Articles $ List
        perform $ Operation Comments $ List
        perform $ Operation Equality $ List
        perform $ Operation Inequality $ List
        return ()

-- Run the example twice, changing the cache transparently to the code

main :: IO ()
main = do
    putStrLn "Running example program once with an empty database"
    runDatabaseT (runOpperationLoggerT (runResultLoggerT example)) Types { types = Map.empty }
    putStrLn "\nRunning example program once with an empty cache and an empty database"
    runDatabaseT (runOpperationLoggerT (runCacheT (runResultLoggerT example) Types { types = Map.empty })) Types { types = Map.empty }        
    return ()

-- Domain objects

data Article = Article {
    title :: String,
    author :: String,
    contents :: String

}
deriving instance Eq Article
deriving instance Ord Article
deriving instance Show Article
deriving instance Typeable Article

newtype ArticleId = ArticleId Int

deriving instance Eq ArticleId
deriving instance Ord ArticleId
deriving instance Show ArticleId
deriving instance Typeable ArticleId
deriving instance Enum ArticleId

data Comment = Comment {
    article :: ArticleId,
    user :: String,
    comment :: String
}

deriving instance Eq Comment
deriving instance Ord Comment
deriving instance Show Comment
deriving instance Typeable Comment

newtype CommentId = CommentId Int

deriving instance Eq CommentId
deriving instance Ord CommentId
deriving instance Show CommentId
deriving instance Typeable CommentId
deriving instance Enum CommentId

-- Database Schema

data TableName k v where
    Articles :: TableName ArticleId Article
    Comments :: TableName CommentId Comment
    Equality :: TableName Bool Bool
    Inequality :: TableName Bool Bool

deriving instance Eq (TableName k v)
deriving instance Ord (TableName k v)
deriving instance Show (TableName k v)
deriving instance Typeable2 TableName

-- Data interface (Persistance library types)

data CRUD k v r where
    Create :: v -> CRUD k v k
    Read :: k -> CRUD k v (Maybe v)
    List :: CRUD k v [(k,v)]
    Update :: k -> v -> CRUD k v (Maybe ())
    Delete :: k -> CRUD k v (Maybe ())

deriving instance (Eq k, Eq v) => Eq (CRUD k v r)
deriving instance (Ord k, Ord v) => Ord (CRUD k v r)
deriving instance (Show k, Show v) => Show (CRUD k v r)

data Operation s t k v r where
    Operation :: t ~ s k v => t -> CRUD k v r -> Operation s t k v r

deriving instance (Eq (s k v), Eq k, Eq v) => Eq (Operation s t k v r)
deriving instance (Ord (s k v), Ord k, Ord v) => Ord (Operation s t k v r)
deriving instance (Show (s k v), Show k, Show v) => Show (Operation s t k v r)

class (Monad m) => MonadOperation s m | m -> s where
    perform :: (Typeable2 s, Typeable k, Typeable v, t ~ s k v, Show t, Ord v, Ord k, Enum k, Show k, Show v, Show r) => Operation s t k v r -> m r

-- Database implementation

data Tables t k v = Tables {
    tables :: Map.Map String (Map.Map k v)
}

deriving instance Typeable3 Tables

emptyTablesFor :: Operation s t k v r -> Tables t k v
emptyTablesFor _ = Tables {tables = Map.empty} 

data Types = Types {
    types :: Map.Map TypeRep Dynamic
}

-- Database emulator

mapOperation :: (Enum k, Ord k, MonadState (Map.Map k v) m) => (CRUD k v r) -> m r
mapOperation (Create value) = do
    current <- get
    let id = case Map.null current of
            True -> toEnum 0
            _ -> succ maxId where
                (maxId, _) = Map.findMax current
    put (Map.insert id value current)
    return id
mapOperation (Read key) = do
    current <- get
    return (Map.lookup key current)
mapOperation List = do
    current <- get
    return (Map.toList current)
mapOperation (Update key value) = do
    current <- get
    case (Map.member key current) of
        True -> do
            put (Map.update (\_ -> Just value) key current)
            return (Just ())
        _ -> return Nothing
mapOperation (Delete key) = do
    current <- get
    case (Map.member key current) of
        True -> do
            put (Map.delete key current)
            return (Just ())
        _ -> return Nothing

tableOperation :: (Enum k, Ord k, Ord v, t ~ s k v, Show t,  MonadState (Tables t k v) m) => Operation s t k v r -> m r
tableOperation (Operation tableName op) = do
    current <- get
    let currentTables =  tables current
    let tableKey = show tableName
    let table = Map.findWithDefault (Map.empty) tableKey currentTables 
    let (result,newState) = runState (mapOperation op) table
    put Tables { tables = Map.insert tableKey newState currentTables }
    return result

typeOperation :: (Enum k, Ord k, Ord v, t ~ s k v, Show t, Typeable2 s, Typeable k, Typeable v, MonadState Types m) => Operation s t k v r -> m r
typeOperation op = do
    current <- get
    let currentTypes = types current
    let empty = emptyTablesFor op
    let typeKey = typeOf (empty)
    let typeMap = fromDyn (Map.findWithDefault (toDyn empty) typeKey currentTypes) empty
    let (result, newState) = runState (tableOperation op) typeMap
    put Types { types = Map.insert typeKey (toDyn  newState) currentTypes }
    return result

-- Database monad transformer (clone of StateT)

newtype DatabaseT (s :: * -> * -> *) m a = DatabaseT {
    databaseStateT :: StateT Types m a
}

runDatabaseT :: DatabaseT s m a -> Types -> m (a, Types)  
runDatabaseT = runStateT . databaseStateT

instance (Monad m) => Monad (DatabaseT s m) where
    return = DatabaseT . return
    (DatabaseT m) >>= k = DatabaseT (m >>= \x -> databaseStateT (k x))

instance MonadTrans (DatabaseT s) where
    lift = DatabaseT . lift

instance (MonadIO m) => MonadIO (DatabaseT s m) where
    liftIO = DatabaseT . liftIO      

instance (Monad m) => MonadOperation s (DatabaseT s m) where
    perform = DatabaseT . typeOperation

-- State monad transformer can preserve operations


instance (MonadOperation s m) => MonadOperation s (StateT state m) where
    perform = lift . perform

-- Cache implementation (very similar to emulated database)

cacheMapOperation :: (Enum k, Ord k, Ord v, t ~ s k v, Show t, Show k, Show v, Typeable2 s, Typeable k, Typeable v, MonadState (Map.Map k v) m, MonadOperation s m) =>  Operation s t k v r -> m r
cacheMapOperation op@(Operation _ (Create value)) = do
    key <- perform op
    modify (Map.insert key value)
    return key
cacheMapOperation op@(Operation _ (Read key)) = do
    current <- get
    case (Map.lookup key current) of
        Just value -> return (Just value) 
        _ -> do
            value <- perform op
            modify (Map.update (\_ -> value) key)
            return value
cacheMapOperation op@(Operation _ (List)) = do
    values <- perform op
    modify (Map.union (Map.fromList values))
    current <- get
    return (Map.toList current)
cacheMapOperation op@(Operation _ (Update key value)) = do
    successful <- perform op
    modify (Map.update (\_ -> (successful >>= (\_ -> Just value))) key)
    return successful
cacheMapOperation op@(Operation _ (Delete key)) = do
    result <- perform op
    modify (Map.delete key)
    return result


cacheTableOperation :: (Enum k, Ord k, Ord v, t ~ s k v, Show t, Show k, Show v, Typeable2 s, Typeable k, Typeable v,  MonadState (Tables t k v) m, MonadOperation s m) => Operation s t k v r -> m r
cacheTableOperation op@(Operation tableName _) = do
    current <- get
    let currentTables =  tables current
    let tableKey = show tableName
    let table = Map.findWithDefault (Map.empty) tableKey currentTables 
    (result,newState) <- runStateT (cacheMapOperation op) table
    put Tables { tables = Map.insert tableKey newState currentTables }
    return result

cacheTypeOperation :: (Enum k, Ord k, Ord v, t ~ s k v, Show t, Show k, Show v, Typeable2 s, Typeable k, Typeable v, MonadState Types m, MonadOperation s m) => Operation s t k v r -> m r
cacheTypeOperation op = do
    current <- get
    let currentTypes = types current
    let empty = emptyTablesFor op
    let typeKey = typeOf (empty)
    let typeMap = fromDyn (Map.findWithDefault (toDyn empty) typeKey currentTypes) empty
    (result, newState) <- runStateT (cacheTableOperation op) typeMap
    put Types { types = Map.insert typeKey (toDyn  newState) currentTypes }
    return result

-- Cache monad transformer

newtype CacheT (s :: * -> * -> *) m a = CacheT {
    cacheStateT :: StateT Types m a
}

runCacheT :: CacheT s m a -> Types -> m (a, Types)  
runCacheT = runStateT . cacheStateT

instance (Monad m) => Monad (CacheT s m) where
    return = CacheT . return
    (CacheT m) >>= k = CacheT (m >>= \x -> cacheStateT (k x))

instance MonadTrans (CacheT s) where
    lift = CacheT . lift

instance (MonadIO m) => MonadIO (CacheT s m) where
    liftIO = CacheT . liftIO      

instance (Monad m, MonadOperation s m) => MonadOperation s (CacheT s m) where
    perform = CacheT . cacheTypeOperation

-- Logger monad transform

newtype OpperationLoggerT m a = OpperationLoggerT {
    runOpperationLoggerT :: m a
}

instance (Monad m) => Monad (OpperationLoggerT m) where
    return = OpperationLoggerT . return
    (OpperationLoggerT m) >>= k = OpperationLoggerT (m >>= \x -> runOpperationLoggerT (k x))

instance MonadTrans (OpperationLoggerT) where
    lift = OpperationLoggerT

instance (MonadIO m) => MonadIO (OpperationLoggerT m) where
    liftIO = OpperationLoggerT . liftIO    

instance (MonadOperation s m, MonadIO m) => MonadOperation s (OpperationLoggerT m) where
    perform op = do
        liftIO $ putStrLn $ show op
        lift (perform op)      

-- Result logger

newtype ResultLoggerT m a = ResultLoggerT {
    runResultLoggerT :: m a
}

instance (Monad m) => Monad (ResultLoggerT m) where
    return = ResultLoggerT . return
    (ResultLoggerT m) >>= k = ResultLoggerT (m >>= \x -> runResultLoggerT (k x))

instance MonadTrans (ResultLoggerT) where
    lift = ResultLoggerT

instance (MonadIO m) => MonadIO (ResultLoggerT m) where
    liftIO = ResultLoggerT . liftIO    

instance (MonadOperation s m, MonadIO m) => MonadOperation s (ResultLoggerT m) where
    perform op = do
        result <- lift (perform op)
        liftIO $ putStrLn $ "\t" ++ (show result)
        return result

To build this example, you'll need the mtl and containers libraries.

like image 51
Cirdec Avatar answered Sep 28 '22 03:09

Cirdec


In Haskell, you do need to (and want to!) be aware of anything that does IO.

That is one of the strong points about it.

You can use the MonadIO type class to write functions that work in any monad that is allowed to perform IO actions:

myFunctionUsingIO :: (MonadIO m) => ... -> m someReturntype
myFunctionUsingIO = do
  -- some code
  liftIO $ ... -- some IO code
  -- some other code

As many programming interfaces in Haskell are expressed via monads, functions like this might work in more contexts.

You can also use unsafePerformIO to secretly run IO actions from pure code - however this is not advisable in almost all cases. Being pure allows you to immediately see whether side effects are used or not.

IO caching is a side effect, and you are well off if your types reflect that.

like image 23
nh2 Avatar answered Sep 28 '22 04:09

nh2