I implemented a version of this answer https://stackoverflow.com/a/9920425/1261166 (I don't know what was intended by the person answering)
sublistofsize 0 _ = [[]]
sublistofsize _ [] = []
sublistofsize n (x : xs) = sublistsThatStartWithX ++ sublistsThatDontStartWithX
where sublistsThatStartWithX = map (x:) $ sublistofsize (n-1) xs
sublistsThatDontStartWithX = sublistofsize n xs
what I'm unsure of is sublistsThatStartWithX = map (x:) $ sublistofsize (n-1) xs
I assume that map (x:) gives a problem performance wise, but not sure of how to solve it. I've done profiling on print $ length $ sublistofsize 5 $ primesToTakeFrom 50
COST CENTRE MODULE no. entries %time %alloc %time %alloc
sublistofsize Main 112 4739871 46.9 39.9 96.9 100.0
sublistofsize.sublistsThatDontStartWithX Main 124 2369935 2.2 0.0 2.2 0.0
sublistofsize.sublistsThatStartWithX Main 116 2369935 47.8 60.1 47.8 60.1
Did I implement it in a good way? Are there any faster ways of doing it?
I assume that map (x:) gives a problem performance wise
No. map
is coded efficiently and runs in linear time, no problems here.
However, your recursion might be a problem. You're both calling sublistofsize (n-1) xs
and sublistofsize n xs
, which - given a start list sublistofsize m (_:_:ys)
- does evaluate the term sublistofsize (m-1) ys
twice, as there is no sharing between them in the different recursive steps.
So I'd apply dynamic programming to get
subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
in if n>l then [] else subsequencesBySize xs !! (l-n)
where
subsequencesBySize [] = [[[]]]
subsequencesBySize (x:xs) = let next = subsequencesBySize xs
in zipWith (++) ([]:next) (map (map (x:)) next ++ [[]])
Not that appending the empty lists is the most beautiful solution, but you can see how I have used zipWith
with the displaced lists so that the results from next
are used twice - once directly in the list of subsequences of length n and once in the list of subsequences of length n+1.
Testing it in GHCI with :set +s
, you can see how this is drastically faster than the naive solutions:
*Main> length $ subsequencesOfSize 7 [1..25]
480700
(0.25 secs, 74132648 bytes)
(0.28 secs, 73524928 bytes)
(0.30 secs, 73529004 bytes)
*Main> length $ sublistofsize 7 [1..25] -- @Vixen (question)
480700
(3.03 secs, 470779436 bytes)
(3.35 secs, 470602932 bytes)
(3.14 secs, 470747656 bytes)
*Main> length $ sublistofsize' 7 [1..25] -- @Ganesh
480700
(2.00 secs, 193610388 bytes)
(2.00 secs, 193681472 bytes)
*Main> length $ subseq 7 [1..25] -- @user5402
480700
(3.07 secs, 485941092 bytes)
(3.07 secs, 486279608 bytes)
Your implementation is the natural "Haskell-ish" one for that problem.
If you end up using the entire result, then there won't be anything asymptotically faster for this problem given the output datastructure ([[a]]
) because it runs in time linear in the length of the output.
The use of map (x:)
is a very natural way to add an element onto the start of each list and there's unlikely to be any significantly faster options given that we are working with lists.
In principle the repeated use of (++)
is inefficient as it causes the left-hand argument to be traversed each time it is called, but the total cost in this case should only be an extra constant factor.
You might be able to improve it with the use of an accumulating parameter otherResults
to collect the results, but to make this change you also need to pass down prefix
in reversed order and re-reverse it at the end, which could well eat up the savings:
sublistofsize' 0 _ prefix otherResults = reverse prefix : otherResults
sublistofsize' _ [] prefix otherResults = otherResults
sublistofsize' n (x : xs) prefix otherResults =
sublistofsize' (n-1) xs (x:prefix) (sublistofsize' n xs prefix otherResults)
sublistofsize n xs = sublistofsize' n xs [] []
An optimization which should help is to keep track of whether there are enough elements in the list to form the rest of the subsequence. This can be done very efficiently by keeping track of a pointer which is n-1
-elements ahead of xs
and advancing them both as you recurse.
An implementation:
nthtail 0 xs = xs
nthtail _ [] = []
nthtail n (x:xs) = nthtail (n-1) xs
subseq 0 _ = [[]]
subseq n xs =
if null t
then []
else go n xs t
where
t = nthtail (n-1) xs -- n should always be >= 1 here
go 0 _ _ = [[]]
go _ _ [] = []
go n xs@(x:xt) t = withx ++ withoutx
where withx = map (x:) $ go (n-1) xt t
withoutx = go n xt (tail t)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With