I needed a String tokenizer in Haskell but there is apparently nothing already defined in the Prelude or other modules. There is splitOn in Data.Text, but that's a pain to use because you need to wrap the String to Text.
The tokenizer is not too hard to do so I wrote one (it doesn't handle multiple adjacent delimiters, but it worked well for what I needed it). I feel something like this should be already in the modules somewhere..
This is my version
tokenizer :: Char -> String -> [String]
tokenizer delim str = tokHelper delim str []
tokHelper :: Char -> String -> [String] -> [String]
tokHelper d s acc
| null pos = reverse (pre:acc)
| otherwise = tokenizer d (tail pos) (pre:acc)
where (pre, pos) = span (/=d) s
I searched the internet for more solutions and found some discussions, like this blog post.
The last comment (by Mahee on June 10, 2011) is particularly interesting. Why not make a version of the words function more generic to handle this? I tried searching for such a function but found none..
Is there a simpler way to this or is 'tokenizing' a string not a very recurring problem? :)
The split library is what you need. Install with cabal install split
, then you have access to a lot of split/tokenizer style functions.
Some examples from the library:
> import Data.List.Split
> splitOn "x" "axbxc"
["a","b","c"]
> splitOn "x" "axbxcx"
["a","b","c",""]
> endBy ";" "foo;bar;baz;"
["foo","bar","baz"]
> splitWhen (<0) [1,3,-4,5,7,-9,0,2]
[[1,3],[5,7],[0,2]]
> splitOneOf ";.," "foo,bar;baz.glurk"
["foo","bar","baz","glurk"]
> splitEvery 3 ['a'..'z']
["abc","def","ghi","jkl","mno","pqr","stu","vwx","yz"]
The wordsBy
function from the same library is a generic version of words
like you wanted:
wordsBy (=='x') "dogxxxcatxbirdxx" == ["dog","cat","bird"]
If you're parsing a Haskell-like language you can use the lex
function from the Prelude: http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelude.html#v:lex
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With