Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Complex data structures in Haskell

Tags:

haskell

After reading http://learnyouahaskell.com/ I still don't understand, how complex data structures are built in Haskell.

One example:

I have many locations, each location is able to hold exactly one item. Each item is able to be positioned at exactly one location. Locations und items have both a name and some additional informations (I leave them out here). Each location and each item is unique by its name.

This problem is in the context of an optimization problem, where a list of short transports should be built. This means, that the objects are just read once and then they get never manipulated. There exist more types, which are all linked together by instance fields in an object-oriented design.

My first thought (from my knowledge of object-oriented design) would be:

data Location = Location String Item
data Item = Item String Location

Since there are no references, only values in Haskell, I expect that to be an endless recursion.

data Location = Location String Item
data Item = Item String

With this approach I can't get the location, when I only have the item.

My second thought is:

data Location = Location String
data Item = Item String
type LocItems = [(Location, Item)]

Well, this way I have to give methods, which need the item of a location, two parameters: The location und the map. When it's getting more complex, i would end up with many maps. Second reason against this approach: The type system does not restrict the locations and items to be linked to only one of the other. The map (or better associative list) can map one location to many items.

I don't understand how such complex structures are built in Haskell.

So how is it possible to build such a data structure in Haskell?

like image 919
F. Böller Avatar asked Aug 06 '14 12:08

F. Böller


4 Answers

Since there are no references, only values in Haskell, I expect that to be an endless recursion.

Still, it's possible:

data Location = Location String Item
data Item     = Item     String Location

locationName (Location s _) = s
getItem      (Location _ i) = i
itemName     (Item s _) = s
getLocation  (Item _ l) = l

getItemNameAtLocation :: Location -> String
getItemNameAtLocation = itemName . getItem

getLocationNameOfItem :: Item -> String
getLocationNameOfItem = locationName . getLocation

mkItemLocation :: ItemName -> LocationName -> (Item, Location)
mkItemLocation i l = let it = Item i $ Location l $ it in (it, getLocation it)

main = do
        let it   = Item "Toothbrush" $ Location "Bathroom" $ it
            loc1 = getLocation it
            loc2 = Location "Quantum bathroom" $ it
        print $ getLocationNameOfItem it
        print $ getItemNameAtLocation loc1
        print $ getItemNameAtLocation loc2
        print $ locationName loc2

However, this doesn't enforce your rules, as there are now two locations that claim to own the toothbrush. If you don't export the constructors, you can still enforce this:

module ItemLocation (mkItemLocation, Item, Location,
                     getLocation, locationName,
                     getItem, itemName) where

-- see above for Item, Location and others

type ItemName     = String
type LocationName = String

mkItemLocation :: ItemName -> LocationName -> (Item, Location)
mkItemLocation i l = let it = Item i $ Location l $ it in (it, getLocation it)
main = do
    let (it, loc) = mkItemLocation "Toothbrush" "Bathroom"
        print $ getLocationNameOfItem it
        print $ getItemNameAtLocation loc

Still, nothing is preventing you from using mkItemLocation "Toothbrush" "Another quantum room". But at this point, you haven't said how you would identify single items or locations (probably via name).

Note that you probably want to use data Location = Location String (Maybe Item). That being said, it's not really clear how you want to manipulate a location or an item, and how those manipulations should reflect on the rest of your locations. Depending on what you actually want to do, you might end up using State together with two Map.


Ok, that above just shows you how you could deal with recursive data types. How would one actually approach your problem? Lets try to build an interface:

data Magic

-- | initial empty magic
empty :: Magic

-- | turns the magic type into a list of (Location, Item)
--   every Location and Item is unique
assoc  :: Magic -> [(Location, Item)]


-- | adds the given Location and Item and puts them into relation
--   If either Location or Item already exist, they're going to be
--  removed (together with their counterpart) beforehand
insert :: Location -> Item -> Magic -> Magic

Now, this can be generalized. Instead of Location and Item, we can support a and b. We gain:

module DualMap (DualMap, empty, assocLeft, 
                assocRight, flipMap, insert, 
                removeLeft, removeRight) where

import Data.Map (Map)
import qualified Data.Map as M

data DualMap a b = DualMap (Map a b) (Map b a) deriving (Eq, Show)

empty :: DualMap a b
empty = DualMap (M.empty) (M.empty)

flipMap :: DualMap a b -> DualMap b a
flipMap (DualMap ls rs) = DualMap rs ls

assocLeft :: DualMap a b -> [(a, b)]
assocLeft (DualMap ls _) = M.toList ls

assocRight :: DualMap a b -> [(b, a)]
assocRight = assocLeft . flipMap

insert :: (Ord a, Ord b) => a -> b -> DualMap a b -> DualMap a b
insert loc item m = DualMap (M.insert loc item ls) (M.insert item loc is)
  where (DualMap ls is) = removeLeft loc m

removeLeft :: (Ord a, Ord b) => a -> DualMap a b -> DualMap a b
removeLeft l m@(DualMap ls rs) = 
  case M.lookup l ls of
    Just r  -> DualMap (M.delete l ls) (M.delete r rs)
    Nothing -> m

removeRight :: (Ord a, Ord b) => b -> DualMap a b -> DualMap a b
removeRight r m@(DualMap ls rs) = 
  case M.lookup r rs of
    Just l  -> DualMap (M.delete l ls) (M.delete r rs)
    Nothing -> m

Note that you shouldn't export DataMap's constructor. removeRight and removeLeft will ensure that if you take out a left value, the right value will also be deleted. Note that in our case using one of them is sufficient, as insert stores both values symmetrically.

This requires you to have valid Ord instances for both Location and Item, which should be based on their unique attribute (in this case their name). If you already happen to have a Ord or Eq instance, which doesn't use only the name, use a newtype wrapper with the appropriate instance.

like image 143
Zeta Avatar answered Oct 24 '22 02:10

Zeta


have many locations, each location is able to hold exactly one item. Each item is able to be positioned at exactly one location. Locations und items have both a name and some additional informations (I leave them out here).

So, items can only belong in one location, and a location holds exactly one item. so basically, you have a one to one relationship.

So, if a location has five attributes, and an items has three, your combined location-item has eight items.

But I suspect that the data structure you really want is not as trivial as what you described... with more input from you I might be able to provide a better answer.

like image 26
thecoshman Avatar answered Oct 24 '22 03:10

thecoshman


I'd say your LocItems approach is along the right lines. If I understand you right, you want:

  1. to be able to get the Item at a given Location (if there is one)
  2. to get the Location of an Item.
  3. Only one Item at a Location and one Location for an Item.

Well your requirement 1 maps to the following type signature:

Location -> Maybe Item

So that's a function or a Map. Or a function made out of a Map, like this:

type ItemLocations = Map Location Item
lookupItem :: ItemsByLocation -> Location -> Maybe Item

Having the extra parameter isn't really a problem and there are several ways to make it go away. For example, if you have the item locations in a map called itemsByLocation, then partially applying lookupItem gives you the function you want.

let lookupItemInMyMap = lookupItem itemsByLocation

For your second requirement, getting the Location from an Item, I'd probably use another Map. The Maps enforce your third requirement.

There's one thing missing though: identity. In OOP, objects have identity (their memory address), but Haskell values don't. You will want to key the map from Item to Location by an Item identifier, not the Item itself. So your second map has the following type:

type ItemID = String
type LocationsByItem = Map ItemID Location
like image 2
GarethR Avatar answered Oct 24 '22 03:10

GarethR


In Haskell, all objects are immutable, so you cannot distinguish a reference to an object from a copy of the object (and there is no need to).

In (imperative) OO, you want references because you want to share something that is mutable.

You need to unlearn this sharing "optimisation".

There are ways to simulate in Haskell, but mostly it's not needed, and there's a perfectly clear immutable solution.

E.g., "a location holds one item" - that's modelled by Data.Map.Map Location Item. But if your location later holds another item, then you need a another map.

like image 1
d8d0d65b3f7cf42 Avatar answered Oct 24 '22 02:10

d8d0d65b3f7cf42