Let's say I have a list of records, and I want to summarize it by taking the median. More concretely, say I have
data Location = Location { x :: Double, y :: Double }
I have a list of measurements, and I want to summarize it into a median Location, so something like:
Location (median (map x measurements)) (median (map y measurements))
That is fine, but what if I have something more nested, such as:
data CampusLocation = CampusLocation { firstBuilding :: Location
,secondBuilding :: Location }
I have a list of CampusLocations and I want a summary CampusLocation, where the median is applied recursively to all fields.
What is the cleanest way to do this in Haskell? Lenses? Uniplate?
Edit: Bonus:
What if instead of a record containing fields we want to summarize, we had an implicit list instead? For example:
data ComplexCampus = ComplexCampus { buildings :: [Location] }
How can we summarize a [ComplexCampus] into a ComplexCampus, assuming that each of the buildings lists is the same length?
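Here is an implementation of summarize :: [ComplexCampus] -> ComplexCampus that uses lenses with the Uniplate-style biplate traversal (as you mentioned) to summarize a list of ComplexCampus into a single ComplexCampus. Note that it takes the average of each field rather than the median.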
{-# Language TemplateHaskell,DeriveDataTypeable #-}
import Control.Lens
import Data.Data.Lens
import Data.Typeable
import Data.Data
import Data.List(transpose,genericLength)
data Location = Location { _x :: Double, _y :: Double } deriving(Show,Typeable,Data)
data CampusLocation = CampusLocation { _firstBuilding :: Location, _secondBuilding :: Location } deriving(Show,Typeable,Data)
data ComplexCampus = ComplexCampus { _buildings :: [Location] } deriving(Show,Typeable,Data)
makeLenses ''Location
makeLenses ''CampusLocation
makeLenses ''ComplexCampus
l1 = Location 1 10
l2 = Location 2 20
l3 = Location 3 30
c1 = CampusLocation l1 l2
c2 = CampusLocation l2 l3
c3 = CampusLocation l1 l3
campusLocs = [c1,c2,c3]
c1' = ComplexCampus [l1, l2]
c2' = ComplexCampus [l2, l3]
c3' = ComplexCampus [l1, l3]
campusLocs' = [c1',c2',c3']
average l = (sum l) / (genericLength l)
-- returns average location for a list of locations
averageLoc locs = Location {
_x = average $ locs ^.. biplate . x,
_y = average $ locs ^.. biplate . y
}
summarize :: [ComplexCampus] -> ComplexCampus
summarize ccs = ComplexCampus $ ccs ^.. biplate . buildings ^.. folding transpose . to averageLoc
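To make it easier to see what the lens pipeline above computes, here is a minimal sketch of the same idea written with plain list functions only. It is not the code from this answer: the names locX, locY and summarizePlain are hypothetical, and the types are redefined without lenses so the snippet stands alone. It assumes, as the question does, that every buildings list has the same length.

```haskell
import Data.List (transpose, genericLength)

-- Stand-alone copies of the types; locX/locY are hypothetical field names.
data Location = Location { locX :: Double, locY :: Double }
  deriving (Show, Eq)

newtype ComplexCampus = ComplexCampus { buildings :: [Location] }
  deriving (Show, Eq)

-- Arithmetic mean of a non-empty list.
average :: [Double] -> Double
average xs = sum xs / genericLength xs

-- Average each coordinate independently.
averageLoc :: [Location] -> Location
averageLoc locs = Location (average (map locX locs)) (average (map locY locs))

-- Collect the buildings lists, transpose them so that the i-th inner list
-- holds the i-th building of every campus, then average each inner list.
summarizePlain :: [ComplexCampus] -> ComplexCampus
summarizePlain = ComplexCampus . map averageLoc . transpose . map buildings
```

The transpose step is what lines up the first building of every campus with each other, the second with each other, and so on; folding transpose plays the same role in the lens version.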
Using biplate here is likely overkill, but regardless, in averageLoc we use biplate on the list of locations to get all x fields and all y fields. If we wanted to summarize a ComplexCampus into a single Location, we could use biplate to extract all x values and all y values from the top-level ComplexCampus.
For example:
campusLocs' ^.. biplate . x
gives us all x values and
campusLocs' ^.. biplate . y
gives us all y values.
Likewise, to get all locations, we could just do:
(campusLocs' ^.. biplate) ::[Location]
Or, if we wanted every Double:
(campusLocs' ^.. biplate) ::[Double]
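Since the question asks for a median and for the nested CampusLocation case, here is a rough sketch of how both might look without any lens machinery. The functions median, medianLoc and medianCampus are hypothetical names, not part of the code above, and the types are redefined with the question's field names so the snippet is self-contained. The median here averages the two middle values for even-length lists, which is one common convention among several.

```haskell
import Data.List (sort)

data Location = Location { x :: Double, y :: Double }
  deriving (Show, Eq)

data CampusLocation = CampusLocation
  { firstBuilding  :: Location
  , secondBuilding :: Location
  } deriving (Show, Eq)

-- Median of a non-empty list; for even lengths, the mean of the two
-- middle values (an assumed convention).
median :: [Double] -> Double
median [] = error "median: empty list"
median xs
  | odd n     = sorted !! mid
  | otherwise = (sorted !! (mid - 1) + sorted !! mid) / 2
  where
    sorted = sort xs
    n      = length xs
    mid    = n `div` 2

-- Median taken per coordinate.
medianLoc :: [Location] -> Location
medianLoc locs = Location (median (map x locs)) (median (map y locs))

-- Apply medianLoc to each nested field in turn.
medianCampus :: [CampusLocation] -> CampusLocation
medianCampus cs =
  CampusLocation (medianLoc (map firstBuilding cs))
                 (medianLoc (map secondBuilding cs))
```

With c1, c2 and c3 built from l1, l2 and l3 as above, medianCampus [c1, c2, c3] evaluates to CampusLocation (Location 1 10) (Location 3 30). This version spells out each field by hand; the appeal of biplate is that it avoids that repetition when the record has many fields.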