Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell: converting a list of (a, b) key-value pairs (with possibly repeated keys) to a list of (a, [b]) grouped by key

I'm a Haskell beginner. Let's suppose I want to write a function convertKVList that takes a flat list of key-value pairs, where some of the keys might be repeated, and turns it into a mapping from keys to lists of values where all of the keys are unique. For instance, on a list of pairs of Ints, I want this behavior:

> convertKVList [(1, 2), (1, 4), (1, 3), (2, 3)]
[(1,[3,4,2]),(2,[3])]

This seems like a common enough task that there ought to be a library function available to do what I want, but I couldn't find anything when I looked. Finally, someone suggested that I compose Map.toList with Map.fromListWith (++), and I ended up with this:

import Data.Map as Map (toList, fromListWith)

convertKVList :: (Ord a) => [(a, b)] -> [(a, [b])]
convertKVList ls =
  (Map.toList . Map.fromListWith (++) . map (\(x,y) -> (x,[y]))) ls

My question is for more experienced Haskellers and is in two parts: First, is this how you would go about it, or is there a "better" (easier to read, or more efficient, or both) way?

Second, how could I have come up with this on my own? I knew that I wanted the type to be [(a, b)] -> [(a, [b])], but putting that into Hoogle didn't turn up anything useful. And I had looked at the Data.Map docs, but neither fromListWith nor toList had jumped out as particularly helpful. So: how would you go about thinking about this problem? (I realize that both of these questions are subjective, especially the second one.)

Thanks!

like image 730
Lindsey Kuper Avatar asked Mar 20 '13 02:03

Lindsey Kuper


2 Answers

One of the most important points when writing a function, is trying to split what it should do into separate sub-tasks (which are often put together by function composition in the end). E.g., in the definition you came up with, there are three tasks (in order of application, i.e., from right to left in the definition):

  1. map the 2nd component of each pair to a singleton list (thereby enabling the use of Map.fromListWith)
  2. create a map (which takes care of merging entries with equal keys)
  3. turn it into a list

I wanted to post a different solution (which was an exact copy of the code Mark posted in the meanwhile ;)). Just to make clear that most of the time there are different routes to the same goal. In his definition you have the separate tasks:

  1. sort the list by keys
  2. group the result by keys
  3. turn it into a list of the desired type

Once again, separation of concerns (modularity) is an important principle. Just try to apply it to small problems and once you gained some experience you will be able to come up with surprisingly simple solutions to seemingly difficult problems.

like image 71
chris Avatar answered Sep 21 '22 22:09

chris


while this is by no means canonical:

import Data.List
import Data.Ord
import Data.Function (on)

convertKVList :: Ord a => [(a,b)] -> [(a,[b])]
convertKVList = map (\x -> (fst $ head x,  map snd x)) . groupBy ((==) `on` fst) . sortBy (comparing fst)

it does have the advantage of not pulling in Data.Map. should be asymptotically the same, haven't benchmarked. I think you could clean up the first chunk with Control.Arrow (something like (fst . head &&& map snd)) but it's not obviously cleaner.

Not sure how you'd come to it except by either knowing it or asking in #haskell, though.

like image 43
Mark Wotton Avatar answered Sep 22 '22 22:09

Mark Wotton