I am trying to write a function that takes a list of searchwords, a list of replacementwords and a string on which those will be used.
listReplace :: [String] -> [String] -> String -> String
The tricky part is that if the fitting searchword is the n'th, then the n'th replacementword should be used. Also, when a replacementword has been used it should not be replaced by a different replacementword if it is actually a searchword itself. I've already written these kind of functions for
replace :: String -> String -> String -> String:
replace x y [] = []
replace x y (z:zs) = if isPrefixOf x (z:zs) then y ++ replace x y (drop (length x) (z:zs)) else z: (replace x y zs)
and
replace' :: String -> [String] -> String -> String:
replace' x y [] = []
replace' x [] (z:zs) = []
replace' x y (z:zs) = if isPrefixOf x (z:zs) then concat (take 1 y) ++ replace' x (drop 1 y) (drop (length x) (z:zs)) else z: (replace' x y zs)
I just dont know how to begin with this replaceList function tho, the only thing that might actually be useful that I've found so far is a function that replaces the n'th element in a list. But I cant seem to figure out how to put it to use in this case:
replace :: Int -> a -> [a] -> [a]
replace n a [] = []
replace 0 a (x:xs) = a : xs
replace n a (x:xs) = x : replace (n-1) a xs
well hopefully one of you can help me out! Thanks in advance :)
I would suggest a different type than
listReplace :: [String] -> [String] -> String -> String
What would happen if one called
listReplace ["one", "two"] ["een"] "I have two problems"
the substring "two" would be found, but there's no replacement for it provided.
Rather use
listReplace :: [(String, String)] -> String -> String
so that you are guaranteed that there are always exactly as many replacement strings as patterns you search for.
A simple implementation could then use
find :: (a -> Bool) -> [a] -> Maybe a
from Data.List
to check if any of the provided patterns is a prefix of the remaining input,
listReplace _ "" = ""
listReplace replacements string@(c:cs)
= case find ((`isPrefixOf` string) . fst) replacements of
Just (pat,rep) -> rep ++ listReplace replacements (drop (length pat) string)
Nothing -> c : listReplace replacements cs
This easy solution is not very efficient - that would require a more complicated algorithm - and it doesn't detect if one of the patterns to be replaced is a prefix of another, so that if the shorter pattern comes before the longer in the list, the longer pattern would never be used, even if it should be. That can be dealt with by sorting the list of replacements, for example descending by lexicographical order, before calling the replace function.
My suggestion would be to use a somewhat different intermediate datastructure when processing the string that you want edited. Here is a solution that uses tries.
import Data.Map (Map)
import qualified Data.Map as M
Here is a simple datatype of tries:
data Trie = Trie (Maybe String) (Map Char Trie)
Tries are constructed from the empty trie and a function for inserting key/value bindings into an existing trie:
empty :: Trie
empty = Trie Nothing M.empty
insert :: String -> String -> Trie -> Trie
insert [] val (Trie _ tries) = Trie (Just val) tries
insert (c : cs) val (Trie mbVal tries) = case M.lookup c tries of
Nothing -> Trie mbVal (M.insert c (insert cs val empty) tries)
Just trie -> Trie mbVal (M.insert c (insert cs val trie) tries)
With tries, matching reduces to recursing over the input string while traversing the trie. When a match is found, the corresponding replacement value is returned together with the remaining part of the input string (so that it can be subjected to further replacements):
match :: Trie -> String -> Maybe (String, String)
match (Trie (Just val) _ ) s = Just (val, s)
match (Trie Nothing _ ) [] = Nothing
match (Trie Nothing tries) (c : cs) = case M.lookup c tries of
Nothing -> Nothing
Just trie -> match trie cs
Note that this function is greedy in the sense that it gives preference to the shortest match if multiple matches are possible. Adapting it so that it picks the longest match instead (and, hence, implements the "maximal-munch" principle) is not too hard.
Replacing occurrences of search words by their corresponding replacements can be implemented by looking for a match in the input string: if a match is found, the replacement is put into the output string and we continue processing with the unmatched part of the string. If no match is found, we keep the head of the input string and proceed with the tail.
replace :: Trie -> String -> String
replace trie = go
where
go [] = []
go s@(c : cs) = case match trie s of
Nothing -> c : go cs
Just (s', s'') -> s' ++ go s''
Your required function listReplace
is now almost trivial:
listReplace :: [String] -> [String] -> String -> String
listReplace keys vals = replace trie
where
trie = foldr ($) empty (zipWith insert keys vals)
As you see, the part that you refer to as "tricky" is easily realised by "zipping" the two list arguments.
Here is a simple example (inspired by L. Peter Deutsch):
> let s = "to err is human; to forgive, divine"
> listReplace ["err", "forgive"] ["iterate", "recurse"] s
"to iterate is human; to recurse, divine"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With