Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Generating Cartesian products in Haskell

I am trying to generate all possible combinations of n numbers. For example if n = 3 I would want the following combinations:

(0,0,0), (0,0,1), (0,0,2)... (0,0,9), (0,1,0)... (9,9,9).

This post describes how to do so for n = 3:

[(a,b,c) | m <- [0..9], a <- [0..m], b <- [0..m], c <- [0..m] ]

Or to avoid duplicates (i.e. multiple copies of the same n-uple):

let l = 9; in [(a,b,c) | m <- [0..3*l],
                         a <- [0..l], b <- [0..l], c <- [0..l],
                         a + b + c == m ]

However following the same pattern would become very silly very quickly for n > 3. Say I wanted to find all of the combinations: (a, b, c, d, e, f, g, h, i, j), etc.

Can anyone point me in the right direction here? Ideally I'd rather not use a built in funtion as I am trying to learn Haskell and I would rather take the time to understand a peice of code than just use a package written by someone else. A tuple is not required, a list would also work.

like image 837
James Allingham Avatar asked Jan 29 '16 12:01

James Allingham


1 Answers

My other answer gave an arithmetic algorithm to enumerate all the combinations of digits. Here's an alternative solution which arises by generalising your example. It works for non-numbers, too, because it only uses the structure of lists.

First off, let's remind ourselves of how you might use a list comprehension for three-digit combinations.

threeDigitCombinations = [[x, y, z] | x <- [0..9], y <- [0..9], z <- [0..9]]

What's going on here? The list comprehension corresponds to nested loops. z counts from 0 to 9, then y goes up to 1 and z starts counting from 0 again. x ticks the slowest. As you note, the shape of the list comprehension changes (albeit in a uniform way) when you want a different number of digits. We're going to exploit that uniformity.

twoDigitCombinations = [[x, y] | x <- [0..9], y <- [0..9]]

We want to abstract over the number of variables in the list comprehension (equivalently, the nested-ness of the loop). Let's start playing around with it. First, I'm going to rewrite these list comprehensions as their equivalent monad comprehensions.

threeDigitCombinations = do
    x <- [0..9]
    y <- [0..9]
    z <- [0..9]
    return [x, y, z]
twoDigitCombinations = do
    x <- [0..9]
    y <- [0..9]
    return [x, y]

Interesting. It looks like threeDigitCombinations is roughly the same monadic action as twoDigitCombinations, but with an extra statement. Rewriting again...

zeroDigitCombinations = [[]]  -- equivalently, `return []`
oneDigitCombinations = do
    z <- [0..9]
    empty <- zeroDigitCombinations
    return (z : empty)
twoDigitCombinations = do
    y <- [0..9]
    z <- oneDigitCombinations
    return (y : z)
threeDigitCombinations = do
    x <- [0..9]
    yz <- twoDigitCombinations
    return (x : yz)

It should be clear now what we need to parameterise:

combinationsOfDigits 0 = return []
combinationsOfDigits n = do
    x <- [0..9]
    xs <- combinationsOfDigits (n - 1)
    return (x : xs)

ghci> combinationsOfDigits' 2
[[0,0],[0,1],[0,2],[0,3],[0,4],[0,5],[0,6],[0,7],[0,8],[0,9],[1,0],[1,1] ... [9,8],[9,9]]

It works, but we're not done yet. I want to show you that this is an instance of a more general monadic pattern. First I'm going to change the implementation of combinationsOfDigits so that it folds up a list of constants.

combinationsOfDigits n = foldUpList $ replicate n [0..9]
    where foldUpList [] = return []
          foldUpList (xs : xss) = do
              x <- xs
              ys <- foldUpList xss
              return (x : ys)

Looking at the definiton of foldUpList :: [[a]] -> [[a]], we can see that it doesn't actually require the use of lists per se: it only uses the monad-y parts of lists. It could work on any monad, and indeed it does! It's in the standard library, and it's called sequence :: Monad m => [m a] -> m [a]. If you're confused by that, replace m with [] and you should see that those types mean the same thing.

combinationsOfDigits n = sequence $ replicate n [0..9]

Finally, noting that sequence . replicate n is the definition of replicateM, we get it down to a very snappy one-liner.

combinationsOfDigits n = replicateM n [0..9]

To summarise, replicateM n gives the n-ary combinations of an input list. This works for any list, not just a list of numbers. Indeed, it works for any monad - though the "combinations" interpretation only makes sense when your monad represents choice.

This code is very terse indeed! So much so that I think it's not entirely obvious how it works, unlike the arithmetic version I showed you in my other answer. The list monad has always been one of the monads I find less intuitive, at least when you're using higher-order monad combinators and not do-notation.

On the other hand, it runs quite a lot faster than the number-crunching version. On my (high-spec) MacBook Pro, compiled with -O2, this version calculates the 5-digit combinations about 4 times faster than the version which crunches numbers. (If anyone can explain the reason for this I'm listening!)

benchmark

like image 60
Benjamin Hodgson Avatar answered Oct 24 '22 02:10

Benjamin Hodgson