Trying to optimize this algorithm/code to compute the largest subset of mutually coprime integers between a and b

I'm writing a function in Haskell that takes two integers a and b and computes the length of the largest subset(s) of [a,b] such that all the elements are mutually coprime. The reason is that I believe investigating such values might yield a means of producing trivial lower bounds (possibly ones large enough to imply actual primes in various conjectured ranges, such as between consecutive squares). So naturally I'm trying to run this for some pretty large numbers.

Unfortunately, the code below is not running fast enough for it to be even practical for me to use. I think the flaw is Haskell's laziness policy causing operations to accumulate, but I know neither the syntax nor the place where I would need to force evaluation to prevent that.

subsets [] = [[]]
subsets (x:xs) = subsets xs ++ map (x:) (subsets xs)

divides a b = (mod a b == 0)

coprime a b = ((gcd a b) == 1)

mutually_coprime []  = True
mutually_coprime (x:xs) | (coprime_list x xs) = mutually_coprime xs
                        | otherwise = False

coprime_list _ [] = True
coprime_list a (x:xs) | (coprime a x) = coprime_list a xs
                      | otherwise = False

coprime_subsets a b = coprime_subsets_helper (subsets [a..b])

coprime_subsets_helper [] = []
coprime_subsets_helper (x:xs) | (mutually_coprime x) = [x] ++ (coprime_subsets_helper xs)
                              | otherwise = coprime_subsets_helper xs

coprime_subset_length a b = max_element (map list_length (coprime_subsets a b))

list_length [] = 0
list_length (x:xs) = 1 + list_length xs

max_element a = max_element_helper a 0

max_element_helper [] a = a
max_element_helper (x:xs) a | (x > a) = max_element_helper xs x
                            | otherwise = max_element_helper xs a

Just to make it clear what sort of input this is hanging on: "coprime_subsets 100 120" never seems to halt for me. I actually left it running, got up, did some other stuff, and came back to it later; it was still running. I suspect that a large bottleneck is computing all the subsets up front: [100..120] has 21 elements, so subsets produces 2^21 (about two million) lists before any filtering happens. However, I don't want to put an artificial lower bound on the subsets generated, since that might sieve out all of the coprime sets and leave me with nothing.

What I have tried so far:

  • I replaced my original coprime function with the built-in gcd. The original used modulo and iteratively checked all candidate divisors. I presume gcd uses something like Euclid's algorithm, which should run faster in theory (see the sketch after this list).

  • I've been trying to think of a way to build the generation of the subsets into the coprime set generator. So far I haven't been able to come up with anything. I'm also not sure if that would actually help with anything.

  • I've been looking for anywhere that Haskell's laziness policy might be hurting me. Nothing stands out to me, but I'm sure something is there.
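
For reference, this is a minimal sketch of the Euclidean algorithm I have in mind, for non-negative inputs (the name euclidGcd is just for illustration; my code calls the built-in gcd):

-- Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b).
euclidGcd :: Int -> Int -> Int
euclidGcd a 0 = a
euclidGcd a b = euclidGcd b (a `mod` b)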

I'm also aware that this could be an efficiency issue with the environment I use (WinHugs), and I'd be inclined to say that's the issue; however, when I asked on Math Stack Exchange how to determine the largest such subset (for general ranges of size n), the answers I received indicated that, computationally speaking, this is a very slow thing to calculate. If so, that's fine; I was just hoping to be able to run through some of the ranges I was interested in without it taking almost forever.

I know questions purely about efficiency are generally discouraged on this site; however, I have tried quite a bit, and I'm not just being lazy here. I know Haskell has quite a few weird quirks that can make code deceptively inefficient. It's been a while since I've coded in it, and I suspect I've landed in one of those quirks.

asked Jan 30 '23 by user64742


2 Answers

First, we must find out what is slow.

When working with numbers, one should use the fastest representation that provides the required accuracy and range. Your code doesn't have top-level type signatures, so GHC infers the type signature for coprime_subsets as

coprime_subsets :: Integral a => a -> a -> [[a]]

This allows GHC to choose an Integral instance for you, and it will happily default to Integer, which is much slower than Int. With Integer, the program spends a huge amount of time just computing gcds. Forcing GHC to use Int takes the run time down from 6 seconds to 1 second, since GHC can then do the arithmetic on machine integers directly.

NB: It is also a good practice to always provide top-level type signatures. Even when the compiler doesn't benefit from them, humans often do.
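
For example, monomorphic signatures for the question's definitions might look like this (my rendering of "forcing GHC to use Int"):

coprime_subsets :: Int -> Int -> [[Int]]
subsets :: [Int] -> [[Int]]
coprime :: Int -> Int -> Bool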

Now, on to the meat of the problem. Running

main :: IO ()
main = print . length $ coprime_subsets (100 :: Int) 120

with profiling enabled (stack build --profile for Stack) and passing +RTS -p -h to the executable (-p for time and -h for space) gives us a breakdown:

COST CENTRE            MODULE  SRC                            %time %alloc

subsets                Coprime src/Coprime.hs:(4,1)-(5,52)     52.5  100.0
coprime                Coprime src/Coprime.hs:11:1-26          25.5    0.0
coprime_list           Coprime src/Coprime.hs:(19,1)-(21,41)   18.5    0.0
coprime_subsets_helper Coprime src/Coprime.hs:(27,1)-(29,69)    1.8    0.0
mutually_coprime       Coprime src/Coprime.hs:(14,1)-(16,43)    1.7    0.0

When we were using Integer, the vast majority (~78%) of the time was spent in the coprime test. The majority is now spent constructing the powerset, so let's look there first.

Next, we must understand why it is slow.

There are generally three strategies to speed something up:

  1. Improve its asymptotic complexity.
  2. Improve its constant factors.
  3. Call it fewer times.

Which of these might apply to subsets? (1) is an obvious choice. Constructing the powerset is O(2^n), so any asymptotic improvement here would be very helpful indeed. Can we find one? Theoretically, we should be able to. As Daniel suggests, this problem is equivalent to the maximal cliques problem (take the numbers as vertices and connect two when they are coprime; mutually coprime subsets are then exactly the cliques), which is also exponential. However, maximal cliques has solutions with a smaller exponential base, which means we should be able to find an asymptotic improvement for this problem as well.

The key insight to reducing its asymptotic complexity (and also the number of times we call it) is that we are generating the vast majority of subsets just to throw them away later when we check them for coprimality. If we can avoid generating bad subsets in the first place, we will generate fewer subsets and perform fewer coprimality checks. If the number of pruned results is proportional to the size of the entire computation tree, this yields an asymptotic improvement. This sort of pruning is common in functional algorithm optimization; you can see an instructive example in Richard Bird's sudoku solver. In fact, if we can write a function that only generates mutually coprime subsets, we will have solved the entire problem!

Finally, we are ready to fix it!

We can do this by modifying the original generator subsets to filter out non-coprime terms as it goes:

coprimeSubsets [] = [[]]
coprimeSubsets (x:xs)
  = coprimeSubsets xs ++ map (x :) (coprimeSubsets (filter (coprime x) xs))

(There is probably some clever fold we can use here if we think about it really hard but the explicit recursion is fine too.)
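
Adapting the earlier driver to the new generator (keeping the same Int annotation) looks like this, for example:

main :: IO ()
main = print . length $ coprimeSubsets [100 .. 120 :: Int]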

Once we do this, we can find coprime subsets of [100..120] in ~0.1 seconds, an order of magnitude improvement. The cost centre report tells the story:

COST CENTRE    MODULE          SRC                            %time %alloc

MAIN           MAIN            <built-in>                      42.9    0.5
coprimeSubsets Coprime         src/Coprime.hs:(33,1)-(35,75)   28.6   67.4
CAF            GHC.IO.Encoding <entire-module>                 14.3    0.1
coprime        Coprime         src/Coprime.hs:13:1-26          14.3   31.1

Now we actually spend most of our time doing IO and such. Furthermore, the number of calls to our coprime test is now only 3,848, an improvement of several orders of magnitude. We can also find the coprime subsets of [100..150] in 3 seconds, compared to... well, I didn't wait for it to finish, but at least a few minutes.

The next place to look for speedups might be in memoizing the coprime function, since this problem involves computing it for the same arguments many times.
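
A minimal sketch of that idea, assuming a Data.Map-based lookup table (the names coprimeTable and coprimeMemo are mine, not from any library):

import qualified Data.Map.Strict as Map

-- Precompute coprimality once for every pair in the input range.
coprimeTable :: [Int] -> Map.Map (Int, Int) Bool
coprimeTable xs =
  Map.fromList [ ((a, b), gcd a b == 1) | a <- xs, b <- xs, a < b ]

-- Consult the table, falling back to gcd for pairs outside it.
coprimeMemo :: Map.Map (Int, Int) Bool -> Int -> Int -> Bool
coprimeMemo table a b =
  Map.findWithDefault (gcd a b == 1) (min a b, max a b) table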

answered Apr 27 '23 by Rein Henrichs


I suggest two main changes:

  1. Compute the gcd just once for each pair of numbers, rather than repeatedly.
  2. Only extend "acceptable" subsets, rather than constructing all subsets up front.

(Neither suggestion is related to laziness at all; I think that guess of yours was a red herring.) Below I implement these two suggestions; it turned out quite succinct, I think:

coprime_subsets' [] = [[]]
coprime_subsets' (x:xs)
    = coprime_subsets' xs
    ++ map (x:) (coprime_subsets' (filter ((1==) . gcd x) xs))

We can check that this computes the same answers in ghci:

> coprime_subsets 10 20 == coprime_subsets' [10..20]
True
> coprime_subsets 100 120 == coprime_subsets' [100..120]
True

Apparently my computer is much faster than yours: coprime_subsets 100 120 completes in just under 16 seconds here even in ghci. Of course, my version takes 0.02 seconds even in ghci, so... =)

If you only care about maximal coprime subsets, you can make it faster still. The main change here is the addition of the filter in front of the recursion:

maximal_coprime_subsets [] = [[]]
maximal_coprime_subsets (x:xs)
    = filter (any ((>1) . gcd x)) (maximal_coprime_subsets xs)
    ++ map (x:) (maximal_coprime_subsets (filter ((1==) . gcd x) xs))

Even though this one repeats gcds much more, it can do [100..120] in around 0.01 seconds even in ghci, so it's a fair factor faster than the previous implementation.

The difference in timing here appears to be that the output of coprime_subsets' just took longer to print, ha!
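
As an aside, to recover the quantity the question actually asks for (the length of the largest mutually coprime subset), a composition like the following should work on top of maximal_coprime_subsets; the name is mine:

largest_coprime_subset_length :: [Int] -> Int
largest_coprime_subset_length = maximum . map length . maximal_coprime_subsets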

answered Apr 27 '23 by Daniel Wagner