Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to group and count in haskell?

Tags:

haskell

Given a list (for instance [1,2,2,3,3,4,5,6]) how is it possible to group and count them according to bins/range? I would like to be able to specify a range, so that:

Say range=2, and using the previous list, would give me [1, 4, 2, 1], given that there's 1 0's or 1's, 4 2's or 3's, 2 4's or 5's and 1 6's or 7's.

Say range=4, and using the previous list, would give me [5, 3], given that there's 5 0's or 1's or 2's or 3's, 3 4's or 5's or 6's or 7's.

I have looked into group and groupBy but not found appropriate predicates, and also the histogram-fill library. The latter seems very nice to create bins, but I could not find out how to load data into those bins.

How can I achieve this?

My attempt on one of the suggestions below:

import Data.List 
import Data.Function 

quantize range n = n `div` range  

main = print (groupBy ((==) `on` quantize 4) [1,2,3,4,2]) 

The output is [[1,2,3],[4],[2]] when it should have been [[1,2,2,3],[4]]. Both suggestions below works on sorted lists.

main = print (groupBy ((==) `on` quantize 4) (sort [1,2,3,4,2]))   
like image 715
stian Avatar asked Mar 15 '23 22:03

stian


2 Answers

You can achieve this using the groupBy and div functions. Let's say we have a range N. If we get the integral division (div) of N consecutive numbers, all of those should be equal. For example, N=3, 0 div 3 = 0, 1 div 3 = 0, 2 div 3 = 0, 3 div 3 = 1, 4 div 3 = 1, 5 div 3 = 1, 6 div 3 = 2.

Knowing this, we can look at groupBy :: (a -> a -> Bool) -> [a] -> [[a]] and use the function:

sameGroup :: Integral a => a -> a -> a -> Bool
sameGroup range a b = a `div` range == b `div` range

To write our own grouping function

groupings :: Integral a => a -> [a] -> [[a]]
groupings range = groupBy (sameGroup range)

Which should look something like groupings 2 [1, 2, 2, 3, 3, 4, 5, 6] == [[1], [2, 2, 3, 3], [4, 5], [6]]. Now we just have to count it to have the final function

groupAndCount :: Integral a => a -> [a] -> [Int]
groupAndCount range list = map length $ groupings range list

Which should mirror the wanted behavior.

like image 50
Mephy Avatar answered Mar 25 '23 01:03

Mephy


You'll need to quantize in order to get the definitions of the bins.

-- `quantize range n` rounds n down to the nearest multiple of range
quantize :: Int -> Int -> Int

groupBy takes a "predicate" argument*, which identifies whether two items should be placed in the same bin. So:

groupBy (\n m -> quantize range n == quantize range m) :: [Int] -> [[Int]]

will group elements by whether they are in the same bin, without changing the elements. If range is 2, that will give you something like

[[1],[2,2,3,3],[4,5],[6]]

Then you just have to take the length of each sublist.

* There's a neat function called on which allows you to write the predicate more succinctly

groupBy ((==) `on` quantize range)
like image 37
luqui Avatar answered Mar 25 '23 02:03

luqui