Real World Haskell, chapter 4, page 98 of the printed version asks if words
can be implemented using folds, and this is my question too:
Is it possible? If not, why? If it is, how?
I came up with the following, which is based on the idea that each non-space should be prepended to the last word in the output list (this happens in the otherwise
guard), and that a space should trigger the appending of an emtpy word to the output list if there is not one already (this is handled in the if
-then
-else
).
myWords :: String -> [String]
myWords = foldr step [[]]
where
step x yss@(y:ys)
| x == ' ' = if y == "" then yss else "":yss
| otherwise = (x:y):ys
Clearly this solution is wrong, since leading spaces in the input string result in one leading empty string in the output list of strings.
At the link above, I've looked into several of the proposed solutions for other readers, and many of them work similarly to my solution, but they generally "post-process" the output of the fold, for instance by tail
ing it if there is an empty leading string.
Other approaches use tuples (actually just pairs), so that the fold deals with the pair and can well handle the leading/trailing spaces.
In all these approaches, foldr
(or another fold, fwiw) is not the function that provides the final output out of the box; there's always something else with has to adjust the output somehow.
Therefore I go back to the initial question and ask if it is actually possible to implement words
(in a way that it correctly handles trailing/leading/repeated spaces) using folds. By using folds I mean that the folding function has to be the outermost function:
myWords :: String -> [String]
myWords input = foldr step seed input
If I understand correctly, your requirements include
(1) words "a b c" == words " a b c" == ["a", "b", "c"]
(2) words "xa b c" == ["xa", "b", "c"] /= ["x", "a", "b", "c"] == words "x a b c"
This implies that we can not have
words = foldr step base
for any step
and base
.
Indeed, if we had that, then
words "xa b c"
= def words and foldr
step 'x' (words "a b c")
= (1)
step 'x' (words " a b c")
= def words and foldr
words "x a b c"
and this contradicts (2).
You definitely need some post-processing after the foldr
.
@chi has a wonderful argument that you cannot implement words
using "a" fold, but you did say using folds.
words = filterNull . words1
where
filterNull = foldr (\xs -> if null xs then id else (xs:)) []
words1 = foldr (\c -> if c == ' ' then ([]:) else consHead c) []
consHead c [] = [[c]]
consHead c (xs:xss) = (c:xs):xss
Both the outermost and innermost function are folds. ;-)
Yes. Eventhough it's a little tricky you may still do this job properly by using a single foldr
and nothing else if you dwell into CPS (Continuation Passing Style). I had shown a special kind of chunksOf
function previously.
In this kinds of folds our accumulator, hence the result of the fold is a function and we have to apply it to an identity kind of input so that we have the final result. So this may count as a final processing stage or not since we are using a single fold here and the type of it includes the function. Open to debate :)
ws :: String -> [String]
ws str = foldr go sf str $ ""
where
sf :: String -> [String]
sf s = if s == " " then [""] else [s]
go :: Char -> (String -> [String]) -> (String -> [String])
go c f = \pc -> let (s:ss) = f [c]
in case pc of
"" -> dropWhile (== "") (s:ss)
otherwise -> case (pc == " ", s == "") of
(True, False) -> "":s:ss
(True, True) -> s:ss
otherwise -> (pc++s):ss
λ> ws " a b c "
["a","b","c"]
sf
: The initial function value to start with.
go
: The iterator function
We are actually not fully utilizing the power of the CPS here since we have both the previous character pc
and the currect character c
at hand in every turn. It was very useful in the chunksOf
function mentioned above while chunking a [Int]
into [[Int]]
every time an ascending sequence of elements were broken.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With