
Summarize a list of Haskell records

Tags: haskell, lenses

Let's say I have a list of records, and I want to summarize it by taking the median. More concretely, say I have

data Location = Location { x :: Double, y :: Double }

I have a list of measurements, and I want to summarize it into a median Location, so something like:

Location (median (map x measurements)) (median (map y measurements))
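Note that `median` is not part of base, so the expression above assumes one is defined somewhere. A minimal sketch of what that could look like, assuming a non-empty list and the usual convention of averaging the two middle values for an even count (`medianLocation` is a hypothetical name used only for illustration):

```haskell
import Data.List (sort)

data Location = Location { x :: Double, y :: Double } deriving (Show, Eq)

-- Median of a non-empty list. For an even number of values, this takes
-- the mean of the two middle values. Calling it on an empty list is an
-- error (an assumption made for this sketch).
median :: [Double] -> Double
median [] = error "median: empty list"
median vs
  | odd n     = sorted !! mid
  | otherwise = (sorted !! (mid - 1) + sorted !! mid) / 2
  where
    sorted = sort vs
    n      = length vs
    mid    = n `div` 2

-- Summarize measurements field by field, as in the expression above.
medianLocation :: [Location] -> Location
medianLocation ms = Location (median (map x ms)) (median (map y ms))
```

Because each field is summarized independently, the resulting Location need not coincide with any single measurement in the input.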

That is fine, but what if I have something more nested, such as:

data CampusLocation = CampusLocation { firstBuilding :: Location
                                      ,secondBuilding :: Location }

I have a list of CampusLocations and I want a summary CampusLocation, where the median is applied recursively to all fields.

What is the cleanest way to do this in Haskell? Lenses? Uniplate?

Edit: Bonus:

What if, instead of a record whose fields we want to summarize, we had an implicit list? For example:

data ComplexCampus = ComplexCampus { buildings :: [Location] }

How can we summarize a [ComplexCampus] into a single ComplexCampus, assuming that every buildings list has the same length?

yong asked Aug 28 '14 03:08


1 Answer

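Here is an implementation of summarize :: [ComplexCampus] -> ComplexCampus that uses lenses with Uniplate-style traversals (biplate from Data.Data.Lens, as you mentioned) to summarize a list of ComplexCampus values into a single ComplexCampus. Note that it summarizes with the average rather than the median; swap in a median function if that is what you need.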

{-# Language TemplateHaskell,DeriveDataTypeable #-}
import Control.Lens
import Data.Data.Lens
import Data.Typeable
import Data.Data
import Data.List(transpose,genericLength)
data Location = Location { _x :: Double, _y :: Double } deriving(Show,Typeable,Data)


data CampusLocation = CampusLocation { _firstBuilding :: Location, _secondBuilding :: Location } deriving(Show,Typeable,Data)
data ComplexCampus = ComplexCampus { _buildings :: [Location] } deriving(Show,Typeable,Data)


makeLenses ''Location
makeLenses ''CampusLocation
makeLenses ''ComplexCampus

l1 = Location 1 10
l2 = Location 2 20
l3 = Location 3 30


c1 = CampusLocation l1 l2
c2 = CampusLocation l2 l3
c3 = CampusLocation l1 l3
campusLocs = [c1,c2,c3]


c1' = ComplexCampus [l1, l2]
c2' = ComplexCampus [l2, l3]
c3' = ComplexCampus [l1, l3]
campusLocs' = [c1',c2',c3']


average l = (sum l) / (genericLength l)

-- returns average location for a list of locations
averageLoc locs = Location {
             _x = average $ locs ^.. biplate . x,
             _y = average $ locs ^.. biplate . y
             }


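-- collect every campus's building list, transpose so that the i-th buildings
-- of all campuses are grouped together, then average each group
summarize :: [ComplexCampus] -> ComplexCampus
summarize ccs = ComplexCampus $ (ccs ^.. biplate . buildings) ^.. folding transpose . to averageLoc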

Using biplate here is likely overkill, but in averageLoc we use it on the list of locations to get all x fields and all y fields. If you instead wanted to summarize a ComplexCampus (or a list of them) into a single Location, you could use biplate to extract all x values and all y values directly from the top-level value.

For example:

campusLocs' ^.. biplate . x gives us all x values, and campusLocs' ^.. biplate . y gives us all y values.

Likewise, to get all locations, we could just do:

(campusLocs' ^.. biplate) :: [Location]

Or, if we wanted every Double:

(campusLocs' ^.. biplate) :: [Double]
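Since biplate is likely overkill here, the same summary can also be written with plain record accessors and transpose. This is only a sketch using base, not the lens-based code above: the field names are left unprefixed as in the question, and `summarizeCampus` and `summarizeComplex` are hypothetical names introduced for illustration.

```haskell
import Data.List (transpose, genericLength)

data Location = Location { x :: Double, y :: Double } deriving (Show, Eq)

data CampusLocation = CampusLocation
  { firstBuilding  :: Location
  , secondBuilding :: Location
  } deriving (Show, Eq)

data ComplexCampus = ComplexCampus { buildings :: [Location] } deriving (Show, Eq)

-- Arithmetic mean; the result is NaN for an empty list.
average :: [Double] -> Double
average vs = sum vs / genericLength vs

-- Average a group of locations field by field.
averageLoc :: [Location] -> Location
averageLoc locs = Location (average (map x locs)) (average (map y locs))

-- Fixed set of fields: summarize each field across all campuses.
summarizeCampus :: [CampusLocation] -> CampusLocation
summarizeCampus cs =
  CampusLocation (averageLoc (map firstBuilding cs))
                 (averageLoc (map secondBuilding cs))

-- List of buildings: transpose groups the i-th building of every campus,
-- and each group is then averaged.
summarizeComplex :: [ComplexCampus] -> ComplexCampus
summarizeComplex = ComplexCampus . map averageLoc . transpose . map buildings
```

To get the median asked for in the question, replace average with a median function of type [Double] -> Double. The equal-length assumption matters for summarizeComplex: transpose does not fail on ragged input, it simply produces shorter groups for the positions that some campuses lack.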

jek answered Nov 10 '22 05:11