Advice defining a data structure in Haskell

Tags:

haskell

I'm having trouble modeling a data structure in Haskell. Suppose I'm running an animal research facility and I want to keep track of my rats. I want to track the assignment of the rats to cages and to experiments. I also want to record the weight of my rats and the volume of my cages, and keep notes on my experiments.

In SQL, I might do:

create table cages (id integer primary key, volume double);
create table experiments (id integer primary key, notes text);
create table rats (
    id integer primary key, -- needed by the rats.id lookups in the queries below
    weight double,
    cage_id integer references cages (id),
    experiment_id integer references experiments (id)
);

(I realize that this allows me to assign two rats from different experiments to the same cage. That is intended. I don't actually run an animal research facility.)

Two operations that must be possible: (1) given a rat, find the volume of its cage and (2) given a rat, get the notes for the experiment it belongs to.

In SQL, those would be

select cages.volume from rats
  inner join cages on cages.id = rats.cage_id
  where rats.id = ...; -- (1)
select experiments.notes from rats
  inner join experiments on experiments.id = rats.experiment_id
  where rats.id = ...; -- (2)

How might I model this data structure in Haskell?


One way to do it is

type Weight = Double
type Volume = Double

data Rat = Rat Cage Experiment Weight
data Cage = Cage Volume
data Experiment = Experiment String

data ResearchFacility = ResearchFacility [Rat]

ratCageVolume :: Rat -> Volume
ratCageVolume (Rat (Cage volume) _ _) = volume

ratExperimentNotes :: Rat -> String
ratExperimentNotes (Rat _ (Experiment notes) _) = notes

But wouldn't this structure introduce a bunch of copies of the Cages and Experiments? Or should I just not worry about it and hope the optimizer takes care of that?

Snowball asked Aug 06 '12 18:08



2 Answers

Here's a short file I used for testing:

type Weight = Double
type Volume = Double

data Rat = Rat Cage Experiment Weight deriving (Eq, Ord, Show, Read)
data Cage = Cage Volume               deriving (Eq, Ord, Show, Read)
data Experiment = Experiment String   deriving (Eq, Ord, Show, Read)

volume     = 30
name       = "foo"
weight     = 15
cage       = Cage volume
experiment = Experiment name
rat        = Rat cage experiment weight

Then I started ghci and imported System.Vacuum.Cairo, available from the delightful vacuum-cairo package.

*Main System.Vacuum.Cairo> view (rat, Rat (Cage 30) (Experiment "foo") 15)

[image: not-shared]

*Main System.Vacuum.Cairo> view (rat, Rat (Cage 30) experiment 15)

[image: shared-experiment]

(I'm not really sure why there are doubled-up arrows in this one, but you can ignore/collapse them.)

*Main System.Vacuum.Cairo> view (rat, Rat cage experiment weight)

[image: shared-args]

*Main System.Vacuum.Cairo> view (rat, rat)

[image: shared-all]

*Main System.Vacuum.Cairo> view (rat, Rat cage experiment (weight+1))

[image: shared-modified]

The rule of thumb, as should be illustrated above, is that new objects are created exactly when you call a constructor; otherwise, if you just name an already-created object, no new object is created. Sharing like this is safe in Haskell because values are immutable: nothing can modify a shared Cage or Experiment behind another rat's back.
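To make the rule of thumb concrete, here is a minimal sketch that reuses the types from the test file. The names sharedCage, trial, ratA, ratB and the helper feed are made up for illustration. Cage and Experiment are each constructed once and afterwards only named, so under the rule above both rats should refer to the same two objects. feed calls the Rat constructor once, so it should allocate one new Rat that still points at the old cage and experiment.

```haskell
type Weight = Double
type Volume = Double

data Cage       = Cage Volume                deriving (Eq, Show)
data Experiment = Experiment String          deriving (Eq, Show)
data Rat        = Rat Cage Experiment Weight deriving (Eq, Show)

ratCageVolume :: Rat -> Volume
ratCageVolume (Rat (Cage v) _ _) = v

-- One call to the Cage constructor; every later use only names it.
sharedCage :: Cage
sharedCage = Cage 30

-- Likewise, a single Experiment value.
trial :: Experiment
trial = Experiment "foo"

-- Two Rat constructor calls, so two Rat objects,
-- but both refer to the same cage and the same experiment.
ratA, ratB :: Rat
ratA = Rat sharedCage trial 15
ratB = Rat sharedCage trial 17

-- Hypothetical helper: builds one new Rat and reuses the
-- cage and experiment of the old one instead of copying them.
feed :: Weight -> Rat -> Rat
feed extra (Rat c e w) = Rat c e (w + extra)

main :: IO ()
main = do
  print (map ratCageVolume [ratA, ratB]) -- [30.0,30.0]
  print (feed 1 ratA)                    -- Rat (Cage 30.0) (Experiment "foo") 16.0
```

Note that the program cannot observe the sharing itself; (==) compares structure, not identity. A heap viewer such as vacuum is what actually shows it.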

Daniel Wagner answered Oct 14 '22 20:10


A more natural Haskell representation of your model would be for the cages to contain the actual rat objects instead of their ids:

type RatId = Int  -- assumed; any key type with an Ord instance would do

data Rat = Rat RatId Weight
data Cage = Cage [Rat] Volume
data Experiment = Experiment [Rat] String

Then you would create ResearchFacility objects using a smart constructor to make sure they follow the rules. It can look something like:

research_facility :: [Rat] -> Map Rat Cage -> Map Rat Experiment -> ResearchFacility
research_facility rats cage_assign experiment_assign = ...

where cage_assign and experiment_assign are maps that carry the same information as the cage_id and experiment_id foreign keys in SQL. (Using Rat as a Map key requires an Ord instance for Rat.)
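One way the smart constructor could be filled in is sketched below. This is only a guess at the intent, under several assumptions that are not in the answer: RatId, CageId and ExperimentId are plain Int keys, ResearchFacility holds a list of cages and a list of experiments, the assignments are keyed by RatId rather than by Rat, and the constructor returns Nothing when a rat has no assignment or points at an unknown cage or experiment. The names researchFacility, lookupAssignment and exampleFacility are hypothetical.

```haskell
import           Data.Map   (Map)
import qualified Data.Map   as Map
import           Data.Maybe (listToMaybe)

type RatId        = Int     -- assumed key types
type CageId       = Int
type ExperimentId = Int
type Weight       = Double
type Volume       = Double

data Rat        = Rat RatId Weight        deriving (Eq, Show)
data Cage       = Cage [Rat] Volume       deriving (Eq, Show)
data Experiment = Experiment [Rat] String deriving (Eq, Show)

-- Assumed shape of the facility: all cages plus all experiments.
data ResearchFacility = ResearchFacility [Cage] [Experiment]
  deriving (Eq, Show)

-- Resolve one rat's assignment, failing if it is missing or dangling.
lookupAssignment :: Ord k => Map RatId k -> Map k v -> Rat -> Maybe (k, Rat)
lookupAssignment assignment known rat@(Rat rid _) = do
  key <- Map.lookup rid assignment
  _   <- Map.lookup key known
  pure (key, rat)

-- Smart constructor: plays the role of the foreign-key checks in SQL.
researchFacility
  :: [Rat]
  -> Map CageId Volume          -- known cages
  -> Map ExperimentId String    -- known experiments
  -> Map RatId CageId           -- like rats.cage_id
  -> Map RatId ExperimentId     -- like rats.experiment_id
  -> Maybe ResearchFacility
researchFacility rats cageVolumes notes cageOf experimentOf = do
  byCage <- groupByKey <$> traverse (lookupAssignment cageOf cageVolumes) rats
  byExp  <- groupByKey <$> traverse (lookupAssignment experimentOf notes) rats
  let cages = [ Cage (Map.findWithDefault [] k byCage) v
              | (k, v) <- Map.toList cageVolumes ]
      exps  = [ Experiment (Map.findWithDefault [] k byExp) n
              | (k, n) <- Map.toList notes ]
  pure (ResearchFacility cages exps)
  where
    -- Collect the rats for each key, keeping their input order.
    groupByKey pairs = Map.fromListWith (flip (++)) [ (k, [r]) | (k, r) <- pairs ]

-- Operation (1): find the volume of the cage that contains a given rat.
ratCageVolume :: ResearchFacility -> RatId -> Maybe Volume
ratCageVolume (ResearchFacility cages _) rid =
  listToMaybe [ v | Cage rs v <- cages, Rat r _ <- rs, r == rid ]

exampleFacility :: Maybe ResearchFacility
exampleFacility =
  researchFacility [Rat 1 15, Rat 2 17]
    (Map.fromList [(10, 30)])
    (Map.fromList [(100, "foo"), (101, "bar")])
    (Map.fromList [(1, 10), (2, 10)])
    (Map.fromList [(1, 100), (2, 101)])

main :: IO ()
main = do
  print (exampleFacility >>= \f -> ratCageVolume f 1) -- Just 30.0
  -- A rat with no cage assignment is rejected.
  print (researchFacility [Rat 3 12] Map.empty Map.empty Map.empty Map.empty) -- Nothing
```

With this layout the lookup has to search the cages for the rat, which is the price of storing rats inside cages rather than a cage reference inside each rat. Operation (2) could be written the same way over the experiment list.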

Daniel answered Oct 14 '22 20:10