Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you manage an object graph in Haskell?

Tags:

oop

haskell

I'm trying to re-learn systems analysis. I've got a lot of object-oriented thinking for which I'm not able to find equivalents in Haskell, yet.

A fictional system consists of Ambulance Stations, Ambulances and Crew. (It's getting object-y already.) All this state can be wrapped up in a big SystemState type. SystemState [Stations] [Ambulances] [Crew]. I can then create functions which take a SystemState, and return a new SystemState.

module AmbSys
    ( version
    , SystemState
    , Station
    , Ambulance
    , Crew
    ) where

version = "0.0.1"

data SystemState = SystemState [Station] [Ambulance] [Crew] deriving (Show)

data Station = Station { stName :: String
                       , stAmbulances :: [Ambulance]
                       } deriving (Show)

data Ambulance = Ambulance { amCallSign :: String
                           , amStation :: Station
                           , amCrew :: [Crew]
                           } deriving (Show)

data Crew = Crew { crName :: String
                 , crAmbulance :: Ambulance
                 , crOnDuty :: Bool
                 } deriving (Show)

Here's a ghci session where I create some data.

*AmbSys> :load AmbSys                 
[1 of 1] Compiling AmbSys           ( AmbSys.hs, interpreted )
Ok, modules loaded: AmbSys.
*AmbSys> let s = Station "London" []                
*AmbSys> let a = Ambulance "ABC" s []               
*AmbSys> let s' = Station "London" [a]
*AmbSys> let c = Crew "John Smith" a False        
*AmbSys> let a' = Ambulance "ABC" s [c]   
*AmbSys> let s'' = Station "London" [a']             
*AmbSys> let system_state = SystemState [s''] [a'] [c]
*AmbSys> system_state                                 
SystemState [Station {stName = "London", stAmbulances = [Ambulance {amCallSign = "ABC",
 amStation = Station {stName = "London", stAmbulances = []}, amCrew = [Crew 
 {crName = "John Smith", crAmbulance = Ambulance {amCallSign = "ABC", 
 amStation = Station {stName = "London", stAmbulances = []}, amCrew = []}, 
 crOnDuty = False}]}]}] [Ambulance {amCallSign = "ABC", amStation = Station {
 stName = "London", stAmbulances = []}, amCrew = [Crew {crName = "John Smith",
 crAmbulance = Ambulance {amCallSign = "ABC", amStation = Station {stName = "London",
 stAmbulances = []}, amCrew = []}, crOnDuty = False}]}] [Crew {crName = "John Smith",
 crAmbulance = Ambulance {amCallSign = "ABC", amStation = Station {stName = "London",
 stAmbulances = []}, amCrew = []}, crOnDuty = False}]

You can already see a couple of problems here:

  1. I have been unable to create a consistent SystemState - some of the values are 'old' values, such as s or s', rather than s''.
  2. lots of references to the 'same' data have separate copies.

I could now create a function which takes a SystemState and a Crew member's name which returns a new SystemState where that crew member is 'off-duty'.

My problem is that I have to find and change the crew member in the Ambulance and the (identical copy of the) crew member in the SystemState.

This is possible for small systems, but real systems have many more linkages. It looks like a n-squared problem.

I am very aware that I am thinking about the system in an object-oriented way.

How would such a system be correctly created in Haskell?

Edit: Thanks to everyone for your answers, and those on reddit too http://www.reddit.com/r/haskell/comments/b87sc/how_do_you_manage_an_object_graph_in_haskell/

My understanding now seems to be that I can do the things I want in Haskell. On the downside, it seems to be that object/record/struct graphs aren't 'first class' objects in Haskell (as they are in C/Java/etc.), because of the necessary lack of references. There's just a trade off - some tasks are syntactically simpler in Haskell, some are simpler (and more unsafe) in C.

like image 550
fadedbee Avatar asked Mar 02 '10 13:03

fadedbee


7 Answers

Small tip: If you use a recursive let or where (in a .hs file, I don't think it works in ghci) you can at least set up the initial graph more easily as follows:

ambSys = SystemState [s] [a] [c] where
    s = Station "London" [a]
    a = Ambulance "ABC" s [c]
    c = Crew "John Smith" a False

This will get you to the state I think you're trying to reach, but don't try to use the derived Show instances :-) Updating states like these is another can of beans; I'll give that some thought and see what I come up with.

EDIT: I've thought about it some more, and here's what I'd probably do:

I would break the cycles in the object graph by using keys. Something like this would work (I used a similar approach when building real graphs):

import qualified Data.Map as M

version = "0.0.1"

newtype StationKey = StationKey String deriving (Eq,Ord,Show)
newtype AmbulanceKey = AmbulanceKey String deriving (Eq,Ord,Show)
newtype CrewKey = CrewKey String deriving (Eq,Ord,Show)

data SystemState = SystemState (M.Map StationKey Station) (M.Map AmbulanceKey Ambulance) (M.Map CrewKey Crew) deriving (Show)

data Station = Station { stName :: StationKey
                       , stAmbulances :: [AmbulanceKey]
                       } deriving (Show)

data Ambulance = Ambulance { amCallSign :: AmbulanceKey
                           , amStation :: StationKey
                           , amCrew :: [CrewKey]
                           } deriving (Show)

data Crew = Crew { crName :: CrewKey
                 , crAmbulance :: AmbulanceKey
                 , crOnDuty :: Bool
                 } deriving (Show)

ambSys = SystemState (M.fromList [(londonKey, london)]) (M.fromList [(abcKey, abc)]) (M.fromList [(johnSmithKey, johnSmith)]) where
    londonKey = StationKey "London"
    london = Station londonKey [abcKey]
    abcKey = AmbulanceKey "ABC"
    abc = Ambulance abcKey londonKey [johnSmithKey]
    johnSmithKey = CrewKey "John Smith"
    johnSmith = Crew johnSmithKey abcKey False

And then you can start defining your own state-modifying combinators. As you can see, the building of states is more verbose now, but showing works nicely again!

Also I would probably set up a typeclass to make the link between Station and StationKey etcetera types more explicit, if that becomes too cumbersome. I didn't do that in my graph code, as there I had only two key types, which also were distinct, so the newtypes weren't necessary.

like image 168
yatima2975 Avatar answered Nov 20 '22 06:11

yatima2975


It's not getting Object-Oriented-y until u start talking about inheritance and subtype polymorphism. Programs contained data structures called "ambulance" and "station" long before OO was conceived; OO has no monopoly on data abstraction and encapsulation. FP design will also be "domain driven", as will imperative programming.

The problem you're experiencing is how to manage state, and it's an chronic problem in Haskell (actually, in any programming system, see section 3.1.3 of SICP (Abelson and Sussman's Structure and Interpretation of Computer Programs http://mitpress.mit.edu/sicp/ (Don't be put off by the big , academic words, nor the domain name, it's very readable - their example is a bank account).

Your problem is that ur referencing and holding on to old, out of date state. I'd suggest you write functions that take the current state, modify it, and return a new state. Something like:

addStation state station = 
     let (SystemState stations ambs crews) = state
     in SystemState (station:stations) ambs crews)

And if ur using the ghci interpreter, it'll be handy to know about the it variable, which contains the result of the last computation.

You'll eventually end up at the State Monad, but it sounds like that's later....

like image 34
ja. Avatar answered Nov 20 '22 07:11

ja.


One option that others here have given is the ability to use a separate Key type, and look up the values to which you hold possibly circular references up by key in a Map of possible Crew members, Stations or Ambulances.

There is of course a more direct encoding using references, which behaves more like what you are used to:

data Station = Station { stName :: String 
                       , stAmbulances :: [IORef Ambulance] 
                       } deriving (Show) 

data Ambulance = Ambulance { amCallSign :: String 
                           , amStation :: IORef Station 
                           , amCrew :: [IORef Crew] 
                           } deriving (Show) 

data Crew = Crew { crName :: String 
                 , crAmbulance :: IORef Ambulance 
                 , crOnDuty :: Bool 
                 } deriving (Show) 

This results in a heavily side-effectful programming style. In essence you just start writing C/C++ in Haskell, using the IO monad.

There are two Haskell-like approaches to solve this.

One is to tie the knot, and keep the circular references, but then updating becomes problematic.

The other is to kill the circular references:

data Station = Station { stName :: String 
                       , stAmbulances :: [Ambulance] 
                       } deriving (Show) 

data Ambulance = Ambulance { amCallSign :: String 
                           , amCrew :: [Crew] 
                           } deriving (Show) 

data Crew = Crew { crName :: String 
                 , crOnDuty :: Bool 
                 } deriving (Show) 

You can access the crew from the station:

stCrew :: Station -> [Crew]
stCrew = stAmbulances >>= amCrew

Depending on what kinds of access you will require, this may require a rather slow path to access a member of the Crew.

But, An even better encoding might be to eliminate objects from your thinking almost entirely, and embrace the map that you'd be using to lookup the key as part of the structure itself. I apologize for the rough nature of this code, I'm writing it extemporaneously.

import Control.Monad ((>=>))
import Data.Map (Map)
import qualified Data.Map as Map

type Name = String
newtype CrewTraits = CrewTraits { onDuty :: Bool }
type Crew = (Name, CrewTraits) 

type CallSign = String
type AmbulanceTraits = Map Name AssignmentTraits
type Amulance = (CallSign, AmbulanceTraits)

type StationName = String
type StationTraits = Map CallSign AmbulanceTraits
type Station = (StationName,StationTraits)

type Fleet = Map StationName StationTraits

crew :: Name -> Bool -> Crew
crew name isOnDuty = (name, CrewTraits isOnDuty)

ambulance :: CallSign -> [Crew] -> Ambulance
ambulance sign crew = (sign, Map.fromList crew)

station :: StationName -> [Ambulance] -> Station
station name ambulances = (name, Map.fromList ambulances)

fleet :: [Station] -> Fleet
fleet = Map.fromList

Now you can change a station just by using the builtin functionality from Data.Map:

updateStationTraits :: (StationName -> StationTraits -> Maybe StationTraits) ->
                       StationName -> Fleet -> Fleet
updateStationTraits = Map.updateWithKey

which you could make look a little more natural by tupling the Name and StationTraits:

updateStation :: (Station -> Maybe StationTraits) -> 
                 StationName -> Fleet -> Fleet
updateStation = Map.updateWithKey . curry

addAmbulanceToFleet :: Ambulance -> StationName -> Fleet -> Fleet
addAmbulanceToFleet (k,v) = Map.adjust (Map.insert k v)

With all of that you could now unify the notion of a path in this structure with the earlier notion of a key:

type CrewPath = (StationName,CallSign,Name)
type AmbulancePath = (StationName, CallSign)
type StationPath = StationName

lookupCrewTraits :: CrewKey -> Fleet -> Maybe CrewTraits
lookupCrewTraits (s,c,n) = lookup s >=> lookup c >=> lookup n

lookupCrew :: CrewKey -> Fleet -> Maybe Crew
lookupCrew scn@(_,_,n) = (,) n `fmap` lookupCrewTraits scn
like image 21
Edward Kmett Avatar answered Nov 20 '22 05:11

Edward Kmett


Haskell is an excellent choice for modeling the kind of system you are describing.

However, like in any programming language, the way you model your system is heavily dependent on what operations you will want to do on it. And a functional programming language like Haskell helps you focus on that. Modeling the data is nice, but where are the functions?

Your types for Ambulance, Station, and Crew are simple and straightforward. I'm not sure why you then want to glob them together into one big SystemState. That kind of construction is indeed useful in certain situations. It's not surprising that it complicated things for you a little, though, because it is somewhat of an ad hoc mash-up. Whether or not it's needed depends entirely on the kinds of functions you'll be writing.

But the main issue here is how to use GHCi effectively.

What exactly are you trying to do in GHCi? I spend a lot of time at the GHCi prompt. I can divide that time into three categories: exploring functions to understand them better, testing and debugging functions to make sure they work, and performing one-off calculations using functions that I already understand and already know are working. I don't think I've used GHCi very much for just typing in data structures and having GHCi spit them back at me.

Still, for each of those three uses, I do need data structures. Usually the ones I need are simple enough that I can type the entire thing in one go. They actually don't have to be very simple for that - don't forget that you can type multiple mutually-recursive definitions in a single let statement by separating them with ';', and that GHCi supports multi-line statements with the ":{" and ":}' commands.

If a data structure I need is complex enough that I want to build it up incrementally like you were doing, there are several easy ways to do that.

To get a mutable variable that you repeatedly modify to build up your structure, similar to the way you would do it at the command line prompt for an imperative language, look at the Data.IORef module. If you are new to Haskell, I would recommend avoiding Data.IORef like the plague in your programming - it will always tempt you and it is almost always the wrong thing to do. But at the GHCi prompt, it's fine.

Truthfully, I almost never do that. Being lazy, I just use the up-arrow and other command-line editing keys to get the whole thing into one GHCi command incrementally.

And of course, if the data structure you are typing is actually meaningful and not a throw-away example, you'll want to type it into a file in your favorite Haskell editor rather than at the prompt. Then you'll use your editor's GHCi integration, or GHCi's ":r" command, to keep an up-to-date version of your structure available in GHCi.

like image 43
Yitz Avatar answered Nov 20 '22 07:11

Yitz


There are several ways around this. One simple way is to think of your data as an SQL database. That is, your Stations, Ambulances and Crew are all tables with satellite data associated with them. Another option is to define it as a graph database with a graph library.

like image 1
I GIVE CRAP ANSWERS Avatar answered Nov 20 '22 06:11

I GIVE CRAP ANSWERS


I have tried to do this sort of thing too and the conclusion I reached was that Haskell (for me) was probably not the right tool for the job.

Your question 2 gets at it:

lots of references to the 'same' data have separate copies.

Haskell, as a language, is specifically designed to make it difficult to "share instances" or make "separate copies". Because all variables hold immutable values there are no references to objects to compare for identity.

That said, there are some techniques.

One technique is to use mutable objects for your structures. This will force all your code into a monad, however.

You might also look at this paper Type-Safe Observable Sharing which shows how to use some newer language features supporting low-level references in creating a graph. Their example is a digital circuit, but I think it generalizes.

like image 1
Marsh Ray Avatar answered Nov 20 '22 07:11

Marsh Ray


If you really, really need the data to be that recursive, use a proper graph library like fgl.

like image 1
barsoap Avatar answered Nov 20 '22 05:11

barsoap