Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best way for substring replace in haskell

The problem is quite simple: I have to replace all occurences of "fooo" and all its substrings with "xyz". In Java, for example, I will do it like this:

someString.replaceAll( "fooo|foo|fo", "xyz" )

and it will do the trick. But in Haskell I've found no efficient way to work with regex. First of all, I've read this: http://www.haskell.org/haskellwiki/Regular_expressions

The only library which actually has replace function is regex-posix, but its considered "very slow" in performance. And this fact is not acceptable. Also I've found that this replace function for any reasons doesn't respect the order of patterns given, so I've got output like this:

>replace "boo fooo boo" "xyz"
"boo xyzoo boo"

Other backends don't imply such functionality.

So I decided to write simple workaround:

replaceFoo input =
    helper input []
    where
        helper ('f':'o':'o':'o':xs) ys = helper xs ("zyx" ++ ys)
        helper ('f':'o':'o':xs) ys = helper xs ("zyx" ++ ys)
        helper ('f':'o':xs) ys = helper xs ("zyx" ++ ys)
        helper (x:xs) ys = helper xs (x:ys)
        helper [] ys = reverse ys

Whilst I don't find this function nice, it works well and fast. But for now I met the necessity to add more words in this replacor, and I don't like the idea to extend helper patterns anymore (I need to say that I actually have 4 words in it in real app and that's odd).

I'll be happy if someone help me with fast solution.


cebewee, thanks for the Data.String.Utils. But I fear this approach is quite slow if there are many words to replace ("fooo" to "xyz", "foo" to "xyz", "fo" to "xyz", "bar" to "quux" and so on), because to get that to work I will need to foldr (\str (from,to) -> replace from to str) input pairs or something like that and it will take O(n*n). More than that, it may have unexpected result of replacing substring of result of previous replacement.

like image 476
pechenie Avatar asked Mar 08 '11 09:03

pechenie


2 Answers

There is Data.String.Utils.replace in the MissingH package. If you only need plain substring replace (and not regular expressions), this might be what you need.

like image 174
Lars Noschinski Avatar answered Oct 26 '22 15:10

Lars Noschinski


The regex-xmlschema package has a sed function that might be what you're looking for:

http://hackage.haskell.org/package/regex-xmlschema-0.1.3

See particularly:

http://hackage.haskell.org/packages/archive/regex-xmlschema/0.1.3/doc/html/Text-Regex-XMLSchema-String.html#v:sed

There was a discussion of options for string rewriting on Haskell-Cafe last year:

http://www.haskell.org/pipermail/haskell-cafe/2010-May/077943.html

like image 21
stephen tetley Avatar answered Oct 26 '22 16:10

stephen tetley