I would like to perform string search and replace with regular expressions but have been unable to find a function that does what I need. I'm looking for something similar to my mythical regexReplace whose potential signatures I give below:
regexReplace :: Regex -> (String -> String) -> String -> String
or
regexReplace :: Regex -> (RegexMatch -> String) -> String -> String
This is roughly equivalent to preg_replace_callback in PHP. It takes a compiled regular expression represented by Regex, a replacement function (whose parameter would either be String or some match type RegexMatch), a source string and returns the result.
What would be the closest match to this behaviour among the many Haskell regular expression libraries that are out there?
Update:
I may need the callback to run inside IO, so I may end up needing the following function signature:
regexReplace :: Monad m => Regex -> (RegexMatch -> m String) -> String -> m String
This is obviously turning into a far more general problem than the original problem statement. I may end up using Parsec or similar.
Basically, I have the following program and haven't been able to figure out how to fill in the blank:
{-# LANGUAGE RecordWildCards #-}
module Main where
import Data.Maybe
import System.Environment
data Regex = Regex {
regexPattern :: String
}
data RegexMatch = RegexMatch {
regexGroups :: [String]
}
regexReplace :: Monad m => Regex -> (RegexMatch -> m String) -> String -> m String
regexReplace Regex{..} f source =
-- What do I put here?
undefined
replacementValue :: [String] -> IO String
replacementValue groups =
case groups of
whole : name : defaultValue : [] -> do
lookupResult <- lookupEnv name
let value = fromMaybe defaultValue lookupResult
return value
otherwise -> error "Unexpected groups"
main :: IO ()
main = do
replaceResult <- regexReplace
(Regex "ENV:([A-Za-z_][A-Za-z_0-9]*):([A-Za-z_][A-Za-z_0-9]*)")
(\RegexMatch{..} -> replacementValue regexGroups)
"some input containing ENV:foo:bar replacement expressions"
putStrLn replaceResult
Conclusion:
Using regex-applicative as suggested by @DanielWagner, I came up with the solution below which is quite elegant:
{-# LANGUAGE RecordWildCards #-}
module Main (main) where
import Data.Char
import Data.Functor.Compose
import Data.Maybe
import System.Environment
import Text.Regex.Applicative
data EnvMatch = EnvMatch {
envMatchName :: String
, envMatchDefault :: String
}
-- Matches pattern "_env:SOME_NAME:SOME_DEFAULT_VALUE"
envPattern :: RE Char EnvMatch
envPattern = EnvMatch
<$ string "_env:"
<*> token
<* string ":"
<*> token
where
token :: RE Char String
token = many (psym $ \c -> isAlphaNum c || c == '_')
expandEnv :: EnvMatch -> IO String
expandEnv EnvMatch{..} = fmap (fromMaybe envMatchDefault) (lookupEnv envMatchName)
queryTransform :: RE Char (IO String)
queryTransform = getCompose . (concat <$>) . sequenceA . map Compose $
[
pure <$> many anySym
, expandEnv <$> envPattern
, pure <$> many anySym
]
runQueryTransform :: String -> IO (Maybe String)
runQueryTransform = sequenceA . match queryTransform
main :: IO ()
main = do
result <- runQueryTransform "hello\"_env:HOME:default_home_dir\"world"
print result
-- Yields: Just "hello\"/home/user\"world"
Thanks, @DanielWagner!
You may like regex-applicative, which offers:
match :: RE Char String -> String -> Maybe String
You can replace specific parts of the match in the code that builds the value of type RE Char String. For example, here is a function which finds a string of a and b characters and reverses them:
import Text.Regex.Applicative
asAndBs = many (psym ( `elem` "ab"))
noAsAndBs = many (psym (`notElem` "ab"))
transformation = concat <$> sequenceA [noAsAndBs, reverse <$> asAndBs, noAsAndBs]
Some example runs in ghci:
> match transformation "ntoheuuaaababbboenuth"
Just "ntoheuubbbabaaaoenuth"
> match transformation "aoesnuthaosneut"
Nothing
To handle your updated question: here is a transformation which looks for a string of a and b characters and asks the user what to replace them with. It reuses asAndBs and noAsAndBs from before, only modifying the transformation applied to them. I also include an example driver of queryTransform just to show how it might be used. The basic idea is to build up, rather than a flat replacement string, an IO action which produces the replacement string. It is then the job of the consumer who calls match to execute that IO action as appropriate.
import Data.Functor.Compose
queryTransform = getCompose . (concat <$>) . sequenceA . map Compose $
[ pure <$> noAsAndBs
, getLine <$ asAndBs
, pure <$> noAsAndBs
]
runQueryTransform = getLine >>= sequenceA . match queryTransform
I hope you recognize the parallels between the queryTransform structure and the transformation structure from before (in particular note that the (concat <$>) . sequenceA construct is just like before). Here's some examples in ghci:
> runQueryTransform
oeunthaaabbbaboenuth
replacement
Just "oeunthreplacementoenuth"
> runQueryTransform
aoeunthaoeunth
Nothing
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With