Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to group similar items in a list using Haskell?

Tags:

haskell

Given a list of tuples like this:

dic = [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] 

How to group items of dic resulting in a list grp where,

grp  = [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])] 

I'm actually a newcomer to Haskell...and seems to be falling in love with it..
Using group or groupBy in Data.List will only group similar adjacent items in a list. I wrote an inefficient function for this, but it results in memory failures as I need to process a very large coded string list. Hope you would help me find a more efficient way.

like image 754
td123 Avatar asked Sep 13 '12 02:09

td123


People also ask

Can Haskell lists have different types?

Haskell also incorporates polymorphic types---types that are universally quantified in some way over all types. Polymorphic type expressions essentially describe families of types. For example, (forall a)[a] is the family of types consisting of, for every type a, the type of lists of a.

How do you use sortBy in Haskell?

The sortBy function is the non-overloaded version of sort. >>> sortBy (\(a,_) (b,_) -> compare a b) [(2, "world"), (4, "!"), (1, "Hello")] [(1,"Hello"),(2,"world"),(4,"!")] sortBy sorts the specified Seq according to the specified comparator.

What does group do in Haskell?

The group function takes a stream and returns a list of streams such that flattening the resulting list is equal to the argument. Moreover, each stream in the resulting list contains only equal elements.


Video Answer


2 Answers

Whenever possible, reuse library code.

import Data.Map sortAndGroup assocs = fromListWith (++) [(k, [v]) | (k, v) <- assocs] 

Try it out in ghci:

*Main> sortAndGroup [(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")] fromList [(1,["bb","cc","aa"]),(2,["aa"]),(3,["gg","ff"])] 

EDIT In the comments, some folks are worried about whether (++) or flip (++) is the right choice. The documentation doesn't say which way things get associated; you can find out by experimenting, or you can sidestep the whole issue using difference lists:

sortAndGroup assocs = ($[]) <$> fromListWith (.) [(k, (v:)) | (k, v) <- assocs] -- OR sortAndGroup = fmap ($[]) . M.fromListWith (.) . map (fmap (:)) 

These alternatives are about the same length as the original, but they're a bit less readable to me.

like image 140
Daniel Wagner Avatar answered Sep 17 '22 06:09

Daniel Wagner


Here's my solution:

import Data.Function (on) import Data.List (sortBy, groupBy) import Data.Ord (comparing)  myGroup :: (Eq a, Ord a) => [(a, b)] -> [(a, [b])] myGroup = map (\l -> (fst . head $ l, map snd l)) . groupBy ((==) `on` fst)           . sortBy (comparing fst) 

This works by first sorting the list with sortBy:

[(1,"aa"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg"),(1,"bb")]      => [(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")] 

then grouping the list elements by the associated key with groupBy:

[(1,"aa"),(1,"bb"),(1,"cc"),(2,"aa"),(3,"ff"),(3,"gg")]  => [[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]] 

and then transforming the grouped items to tuples with map:

[[(1,"aa"),(1,"bb"),(1,"cc")],[(2,"aa")],[(3,"ff"),(3,"gg")]]  => [(1,["aa","bb","cc"]), (2, ["aa"]), (3, ["ff","gg"])]`) 

Testing:

> myGroup dic [(1,["aa","bb","cc"]),(2,["aa"]),(3,["ff","gg"])] 
like image 41
Mikhail Glushenkov Avatar answered Sep 21 '22 06:09

Mikhail Glushenkov