Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Getting all Substrings with length 4 out of infinite list

I'm quite new to Haskell and I'm trying to solve the following problem:

I have a function, that produces an infinite list of strings with different lengths. But the number of strings of a certain length is restricted.

Now I want to extract all substrings of the list with a certain length n . Unfortunately I did a lot of research and tried a lot of stuff, but nothing worked for me.

I know that filter() won't work, as it checks every part of the lists and results in an infinite loop.

This is my function that generates the infinite list:

allStrings =  [ c : s | s <- "" : allStrings, c <- ['R', 'T', 'P']]

I've already tried this:

allStrings = [x | x <- [ c : s | s <- "" : allStrings, 
                  c <- ['R', 'T', 'P']], length x == 4] 

which didn't terminate.

Thanks for your help!

like image 561
T.Naz Avatar asked Jan 01 '23 14:01

T.Naz


2 Answers

This

allStrings4 = takeWhile ((== 4) . length) . 
                dropWhile ((< 4) . length) $ allStrings

does the trick.

It works because your (first) allStrings definition cleverly generates all strings containing 'R', 'T', and 'P' letters in productive manner, in the non-decreasing length order.

Instead of trying to cram it all into one definition, separate your concerns! Build a solution to the more general problem first (this is your allStrings definition), then use it to solve the more restricted problem. This will often be much simpler, especially with the lazy evaluation of Haskell.

We just need to take care that our streams are always productive, never stuck.

like image 130
Will Ness Avatar answered Jan 13 '23 13:01

Will Ness


The problem is that your filter makes it impossible to generate any solutions. In order to generate a string of length 4, you first will need to generate a string of length 3, since you each time prepend one character to it. In order to generate a list of length 3, it thus will need to generate strings of length 2, and so on, until the base case: an empty string.

It is not the filter itself that is the main problem, the problem is that you filter in such a way that emitting values is now impossible.

We can fix this by using a different list that will build strings, and filter that list like:

allStrings = filter ((==) 4 . length) vals
    where vals = [x | x <- [ c : s | s <- "" : vals, c <- "RTP"]]

This will emit all lists of length 4, and then get stuck in an infinite loop, since filter will keep searching for more strings, and fail to find these.

We can however do better, for example by using replicateM :: Monad m => Int -> m a -> m [a] here:

Prelude Control.Monad> replicateM 4 "RTP"
["RRRR","RRRT","RRRP","RRTR","RRTT","RRTP","RRPR","RRPT","RRPP","RTRR","RTRT","RTRP","RTTR","RTTT","RTTP","RTPR","RTPT","RTPP","RPRR","RPRT","RPRP","RPTR","RPTT","RPTP","RPPR","RPPT","RPPP","TRRR","TRRT","TRRP","TRTR","TRTT","TRTP","TRPR","TRPT","TRPP","TTRR","TTRT","TTRP","TTTR","TTTT","TTTP","TTPR","TTPT","TTPP","TPRR","TPRT","TPRP","TPTR","TPTT","TPTP","TPPR","TPPT","TPPP","PRRR","PRRT","PRRP","PRTR","PRTT","PRTP","PRPR","PRPT","PRPP","PTRR","PTRT","PTRP","PTTR","PTTT","PTTP","PTPR","PTPT","PTPP","PPRR","PPRT","PPRP","PPTR","PPTT","PPTP","PPPR","PPPT","PPPP"]

Note that here the last character first changes when we generate the next string. I leave it as an exercise to obtain the reversed result.

like image 41
Willem Van Onsem Avatar answered Jan 13 '23 13:01

Willem Van Onsem