I'm writing a function in Haskell that takes two integers a and b and computes the length of the largest subset(s) of [a,b] such that all the elements are mutually coprime. The reason for this is that I believe investigating such values might yield a means of producing trivial lower bounds (possibly ones large enough to imply actual primes in various conjectured ranges, such as between consecutive squares). So naturally I'm trying to run this for some pretty large numbers.
Unfortunately the below code is not running fast enough for it to be even practical for me to use. I think the flaw is Haskell's laziness policy causing issues, but I neither know the syntax nor the place where I need to force execution to prevent the operations from accumulating.
subsets [] = [[]]
subsets (x:xs) = subsets xs ++ map (x:) (subsets xs)

divides a b = (mod a b == 0)
coprime a b = ((gcd a b) == 1)

mutually_coprime [] = True
mutually_coprime (x:xs) | (coprime_list x xs) = mutually_coprime xs
                        | otherwise           = False

coprime_list _ [] = True
coprime_list a (x:xs) | (coprime a x) = coprime_list a xs
                      | otherwise     = False

coprime_subsets a b = coprime_subsets_helper (subsets [a..b])

coprime_subsets_helper [] = []
coprime_subsets_helper (x:xs) | (mutually_coprime x) = [x] ++ (coprime_subsets_helper xs)
                              | otherwise            = coprime_subsets_helper xs

coprime_subset_length a b = max_element (map list_length (coprime_subsets a b))

list_length [] = 0
list_length (x:xs) = 1 + list_length xs

max_element a = max_element_helper a 0

max_element_helper [] a = a
max_element_helper (x:xs) a | (x > a)   = max_element_helper xs x
                            | otherwise = max_element_helper xs a
Just to make it clear what sort of input this is hanging upon, "coprime_subsets 100 120" never seems to even halt for me. I actually left it running, got up, did some other stuff and came back to it later. It was still running. I suspect that a large bottleneck is computing all the subsets immediately. However, I don't want to put an artificial lower bound on the subsets generated. That might sieve out all of the coprime sets leaving me with nothing.
What I have tried so far:
I replaced the original coprime function I had with gcd. Originally this used modulo and iterative checking over all integers. I presume gcd uses something like Euclid's algorithm, which should run faster in theory.
I've been trying to think of a way to build the generation of the subsets into the coprime set generator. So far I haven't been able to come up with anything. I'm also not sure if that would actually help with anything.
I've been looking for anywhere that Haskell's laziness policy might be hurting me. Nothing stands out, but I'm sure I'm missing something.
I'm also aware that this could be an efficiency issue with the environment I use (winhugs). I'd say that is the issue; however, when I asked how to determine the largest subset (for general n sized ranges) on Math Stack Exchange the answers I received indicated that computationally speaking this is a very slow thing to calculate. If so, that's fine. I was just hoping to maybe be able to run through some of the ranges I was interested in without it taking almost forever.
I know pure efficiency questions are generally frowned upon on this site; however, I have tried quite a bit and I'm not just trying to be lazy here. I know Haskell has quite a few weird quirks that can make it deceptively inefficient. It's been a while since I've coded in it, and I suspect I've landed in one of those quirks.
When working with numbers, one should use the fastest representation that provides the required accuracy and range. Your code doesn't have top-level type signatures, so GHC infers the type of coprime_subsets as

coprime_subsets :: Integral a => a -> a -> [[a]]

This allows GHC to choose an Integral instance for you, and it will happily choose Integer, which is much slower than Int. With Integer, the program spends a huge amount of time just computing gcds. Forcing GHC to use Int takes the run time down from 6 seconds to 1 second, since GHC can do integer math on machine integers directly.
NB: It is also a good practice to always provide top-level type signatures. Even when the compiler doesn't benefit from them, humans often do.
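For instance, pinning the question's helper functions to Int is a matter of adding one signature per function; a sketch (the camelCase names below are just for illustration, not the original code's names):

```haskell
-- Illustrative sketch: the same predicates as in the question, with
-- top-level signatures pinning the element type to Int, so GHC can use
-- machine integers instead of arbitrary-precision Integer.
coprime :: Int -> Int -> Bool
coprime a b = gcd a b == 1

coprimeList :: Int -> [Int] -> Bool
coprimeList a xs = all (coprime a) xs

mutuallyCoprime :: [Int] -> Bool
mutuallyCoprime []     = True
mutuallyCoprime (x:xs) = coprimeList x xs && mutuallyCoprime xs
```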
Now, on to the meat of the problem. Running

main :: IO ()
main = print . length $ coprime_subsets (100 :: Int) 120

with profiling enabled (stack build --profile for Stack) and passing +RTS -p -h to the executable (-p for time and -h for space) gives us a breakdown:
COST CENTRE MODULE SRC %time %alloc
subsets Coprime src/Coprime.hs:(4,1)-(5,52) 52.5 100.0
coprime Coprime src/Coprime.hs:11:1-26 25.5 0.0
coprime_list Coprime src/Coprime.hs:(19,1)-(21,41) 18.5 0.0
coprime_subsets_helper Coprime src/Coprime.hs:(27,1)-(29,69) 1.8 0.0
mutually_coprime Coprime src/Coprime.hs:(14,1)-(16,43) 1.7 0.0
When we were using Integer, the vast majority (~78%) of the time was spent in the coprime test. The majority is now spent constructing the powerset, so let's look there first.
There are generally three strategies to speed something up:

1. Improve its asymptotic complexity.
2. Improve its constant factor.
3. Call it fewer times.

Which of these might apply to subsets? (1) is an obvious choice. Constructing the powerset is O(2^n), so any asymptotic improvement here would be very helpful indeed. Can we find any? Theoretically, we should be able to. As Daniel suggests, this problem is equivalent to the maximal cliques problem, which is also exponential. However, maximal cliques has a solution with a smaller base, which means we should be able to find an asymptotic improvement for this problem as well.
The key insight to reducing its asymptotic complexity (and also the number of times we call it) is that we are generating the vast majority of subsets just to throw them away later on when we check them for coprimality. If we can avoid generating bad subsets in the first place, we will generate fewer subsets and perform fewer coprimality checks. If the number of pruned results is proportional to the size of the entire computation tree, this will yield an asymptotic improvement. This sort of pruning is common in functional algorithm optimization; you can see an instructive example in Richard Bird's sudoku solver. In fact, if we can write a function that only generates mutually coprime subsets, we will have solved the entire problem!
We can do this by modifying the original generator subsets to filter out non-coprime terms as it goes:
coprimeSubsets [] = [[]]
coprimeSubsets (x:xs)
= coprimeSubsets xs ++ map (x :) (coprimeSubsets (filter (coprime x) xs))
(There is probably some clever fold we can use here if we think about it really hard but the explicit recursion is fine too.)
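For what it's worth, here is one fold that computes the same collection; this formulation is my own sketch rather than anything from the original answer, and it differs in that it re-checks coprimality against each subset instead of pre-filtering the candidate list:

```haskell
-- Sketch: a left fold over the range. For each new element x we keep
-- every mutually coprime subset found so far, plus x consed onto each
-- subset whose members are all coprime with x. Same subsets as the
-- recursive coprimeSubsets, in a different order.
coprimeSubsetsFold :: [Int] -> [[Int]]
coprimeSubsetsFold = foldl step [[]]
  where
    step acc x = acc ++ [x : s | s <- acc, all (\y -> gcd x y == 1) s]
```

Because it filters against whole subsets rather than pruning the remaining candidates, this version does more gcd work than the recursive one; it is offered only as the "clever fold" curiosity, not as an optimization.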
Once we do this, we can find coprime subsets of [100..120] in ~0.1 seconds, an order of magnitude improvement. The cost centre report tells the story:
COST CENTRE MODULE SRC %time %alloc
MAIN MAIN <built-in> 42.9 0.5
coprimeSubsets Coprime src/Coprime.hs:(33,1)-(35,75) 28.6 67.4
CAF GHC.IO.Encoding <entire-module> 14.3 0.1
coprime Coprime src/Coprime.hs:13:1-26 14.3 31.1
Now we actually spend most of our time doing IO and such. Furthermore, the number of calls to our coprime test is now only 3,848, an improvement of several orders of magnitude. We can also find the coprime subsets of [100..150] in 3 seconds, compared to... well, I didn't wait for it to finish, but at least a few minutes.
The next place to look for speedups might be in memoizing the coprime function, since this problem involves computing it for the same arguments many times.
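One simple way to do that (an illustrative sketch of my own, not code from the answer) is to precompute pairwise coprimality for the whole range into a Data.Map once, and consult the table instead of recomputing gcd; all names here are made up for the example:

```haskell
import qualified Data.Map.Strict as M

-- Sketch: build a table of pairwise coprimality for the range once.
coprimeTable :: [Int] -> M.Map (Int, Int) Bool
coprimeTable xs =
  M.fromList [((a, b), gcd a b == 1) | a <- xs, b <- xs, a < b]

-- Look the (ordered) pair up in the table, falling back to gcd for
-- pairs outside the precomputed range.
coprimeMemo :: M.Map (Int, Int) Bool -> Int -> Int -> Bool
coprimeMemo tbl a b =
  M.findWithDefault (gcd a b == 1) (min a b, max a b) tbl
```

Whether this actually wins depends on the Map lookup being cheaper than a machine-Int gcd, which is far from guaranteed; profiling would have to decide.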
I suggest two main changes:

1. Prune non-coprime elements as the subsets are generated, rather than generating every subset and filtering afterwards.
2. Compute gcd just once for each pair of numbers, rather than repeatedly.

(Neither suggestion is related to laziness at all; I think that guess of yours was a red herring.) Below I implement these two suggestions; it turned out quite succinct, I think:
coprime_subsets' [] = [[]]
coprime_subsets' (x:xs)
= coprime_subsets' xs
++ map (x:) (coprime_subsets' (filter ((1==) . gcd x) xs))
We can check that this computes the same answers in ghci:
> coprime_subsets 10 20 == coprime_subsets' [10..20]
True
> coprime_subsets 100 120 == coprime_subsets' [100..120]
True
Apparently my computer is much faster than yours: coprime_subsets 100 120 completes in just under 16 seconds here even in ghci. Of course, my version takes 0.02 seconds even in ghci, so... =)
If you only care about maximal coprime subsets, you can make it faster still. The main change here is the addition of the filter in front of the recursion:
maximal_coprime_subsets [] = [[]]
maximal_coprime_subsets (x:xs)
= filter (any ((>1) . gcd x)) (maximal_coprime_subsets xs)
++ map (x:) (maximal_coprime_subsets (filter ((1==) . gcd x) xs))
Even though this one repeats gcds much more, it can do [100..120] in around 0.01 seconds even in ghci, so it's a fair factor faster than the previous implementation.
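Since the question ultimately asks for the length of the largest subset, one could wrap this up as follows; the wrapper name and the Int signatures are my own additions, not part of the original answer:

```haskell
-- The maximal-subsets generator from above, pinned to Int.
maximalCoprimeSubsets :: [Int] -> [[Int]]
maximalCoprimeSubsets [] = [[]]
maximalCoprimeSubsets (x:xs)
  =  filter (any ((> 1) . gcd x)) (maximalCoprimeSubsets xs)
  ++ map (x:) (maximalCoprimeSubsets (filter ((== 1) . gcd x) xs))

-- Hypothetical wrapper: the size of the largest mutually coprime
-- subset of [a..b], which is what the question set out to compute.
largestCoprimeSubsetSize :: Int -> Int -> Int
largestCoprimeSubsetSize a b =
  maximum (map length (maximalCoprimeSubsets [a .. b]))
```

The largest subset is necessarily maximal, so taking the maximum length over the maximal subsets gives the right answer while enumerating far fewer sets than the full coprime-subsets version.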
The difference in timing here appears to be that the output of coprime_subsets' just took longer to print, ha!