Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular expression replace with callback

Tags:

regex

haskell

I would like to perform string search and replace with regular expressions but have been unable to find a function that does what I need. I'm looking for something similar to my mythical regexReplace whose potential signatures I give below:

regexReplace :: Regex -> (String -> String) -> String -> String

or

regexReplace :: Regex -> (RegexMatch -> String) -> String -> String

This is roughly equivalent to preg_replace_callback in PHP. It takes a compiled regular expression represented by Regex, a replacement function (whose parameter would either be String or some match type RegexMatch), a source string and returns the result.

What would be the closest match to this behaviour among the many Haskell regular expression libraries that are out there?

Update:

I may need the callback to run inside IO, so I may end up needing the following function signature:

regexReplace :: Monad m => Regex -> (RegexMatch -> m String) -> String -> m String

This is obviously turning into a far more general problem than the original problem statement. I may end up using Parsec or similar.

Basically, I have the following program and haven't been able to figure out how to fill in the blank:

{-# LANGUAGE RecordWildCards #-}

module Main where

import Data.Maybe
import System.Environment

data Regex = Regex {
  regexPattern :: String
}
data RegexMatch = RegexMatch {
  regexGroups :: [String]
}

regexReplace :: Monad m => Regex -> (RegexMatch -> m String) -> String -> m String
regexReplace Regex{..} f source =
  -- What do I put here?
  undefined

replacementValue :: [String] -> IO String
replacementValue groups =
  case groups of
    whole : name : defaultValue : [] -> do
      lookupResult <- lookupEnv name
      let value = fromMaybe defaultValue lookupResult
      return value
    otherwise -> error "Unexpected groups"

main :: IO ()
main = do
  replaceResult <- regexReplace
    (Regex "ENV:([A-Za-z_][A-Za-z_0-9]*):([A-Za-z_][A-Za-z_0-9]*)")
    (\RegexMatch{..} -> replacementValue regexGroups)
    "some input containing ENV:foo:bar replacement expressions"
  putStrLn replaceResult

Conclusion:

Using regex-applicative as suggested by @DanielWagner, I came up with the solution below which is quite elegant:

{-# LANGUAGE RecordWildCards #-}

module Main (main) where

import Data.Char
import Data.Functor.Compose
import Data.Maybe
import System.Environment
import Text.Regex.Applicative

data EnvMatch = EnvMatch {
    envMatchName :: String
  , envMatchDefault :: String
}

-- Matches pattern "_env:SOME_NAME:SOME_DEFAULT_VALUE"
envPattern :: RE Char EnvMatch
envPattern = EnvMatch
    <$ string "_env:"
    <*> token
    <* string ":"
    <*> token
    where
        token :: RE Char String
        token = many (psym $ \c -> isAlphaNum c || c == '_')

expandEnv :: EnvMatch -> IO String
expandEnv EnvMatch{..} = fmap (fromMaybe envMatchDefault) (lookupEnv envMatchName)

queryTransform :: RE Char (IO String)
queryTransform = getCompose . (concat <$>) . sequenceA . map Compose $
    [
        pure <$> many anySym
      , expandEnv <$> envPattern
      , pure <$> many anySym
    ]

runQueryTransform :: String -> IO (Maybe String)
runQueryTransform = sequenceA . match queryTransform

main :: IO ()
main = do
  result <- runQueryTransform "hello\"_env:HOME:default_home_dir\"world"
  print result
  -- Yields: Just "hello\"/home/user\"world"

Thanks, @DanielWagner!

like image 396
Richard Cook Avatar asked Dec 04 '25 11:12

Richard Cook


1 Answers

You may like regex-applicative, which offers:

match :: RE Char String -> String -> Maybe String

You can replace specific parts of the match in the code that builds the value of type RE Char String. For example, here is a function which finds a string of a and b characters and reverses them:

import Text.Regex.Applicative
asAndBs   = many (psym (   `elem` "ab"))
noAsAndBs = many (psym (`notElem` "ab"))
transformation = concat <$> sequenceA [noAsAndBs, reverse <$> asAndBs, noAsAndBs]

Some example runs in ghci:

> match transformation "ntoheuuaaababbboenuth"
Just "ntoheuubbbabaaaoenuth"
> match transformation "aoesnuthaosneut"
Nothing

To handle your updated question: here is a transformation which looks for a string of a and b characters and asks the user what to replace them with. It reuses asAndBs and noAsAndBs from before, only modifying the transformation applied to them. I also include an example driver of queryTransform just to show how it might be used. The basic idea is to build up, rather than a flat replacement string, an IO action which produces the replacement string. It is then the job of the consumer who calls match to execute that IO action as appropriate.

import Data.Functor.Compose
queryTransform = getCompose . (concat <$>) . sequenceA . map Compose $
    [ pure <$> noAsAndBs
    , getLine <$ asAndBs
    , pure <$> noAsAndBs
    ]
runQueryTransform = getLine >>= sequenceA . match queryTransform

I hope you recognize the parallels between the queryTransform structure and the transformation structure from before (in particular note that the (concat <$>) . sequenceA construct is just like before). Here's some examples in ghci:

> runQueryTransform
oeunthaaabbbaboenuth
replacement
Just "oeunthreplacementoenuth"
> runQueryTransform
aoeunthaoeunth
Nothing
like image 184
Daniel Wagner Avatar answered Dec 07 '25 01:12

Daniel Wagner



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!