
Find all capturing groups of a regular expression

Tags: regex, haskell

I am looking for a Haskell function that returns the capturing groups of all matches of a given regex.

I have been looking at Text.Regex, but couldn't find anything there.

Now I am using this workaround which seems to work:

import Text.Regex

findNext :: String -> Maybe (String, String, String, [String] ) -> [ [String] ]
findNext pattern Nothing = []
findNext pattern (Just (_, _, rest, matches) ) = 
    case matches of
        [] -> (findNext pattern res)
        _ -> [matches] ++ (findNext pattern res)
    where res = matchRegexAll (mkRegex pattern) rest

findAll :: String -> String -> [ [String] ]
findAll pattern str = findNext pattern (Just ("", "", str, [] ) )

Result:

findAll "x(.)x(.)" "aaaxAxaaaxBxaaaxCx"
[["A","a"],["B","a"]]

Question:

  • Did I miss something in Text.Regex?
  • Is there a Haskell regex library that implements a findAll function?
Asked by Hyperboreus on Jul 18 '11 at 06:07



1 Answer

You can use the =~ operator from Text.Regex.Posix:

Prelude> :mod + Text.Regex.Posix
Prelude Text.Regex.Posix> "aaaxAxaaaxBxaaaxCx" =~ "x(.)x(.)" :: [[String]]
[["xAxa","A","a"],["xBxa","B","a"]]

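Note the explicit [[String]] type annotation: =~ is polymorphic in its result, so the annotation selects what is returned. Here each inner list holds the full match followed by its capturing groups. Try replacing it with Bool, Int, or String and see what happens. The types you can use in this context are the RegexContext instances, documented in Text.Regex.Base.Context (regex-base package). Also see this tutorial.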
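To get exactly the shape asked for in the question (capturing groups only, without the full match), the result of =~ can be post-processed. The following is a minimal sketch, assuming the regex-posix package is installed; findAll mirrors the name used in the question, while anyMatch, matchCount and firstMatch are illustrative names that show the same match at other result types.

```haskell
import Text.Regex.Posix ((=~))

-- Capturing groups of every match, left to right. At type [[String]],
-- (=~) puts the full match first in each inner list, so drop it.
findAll :: String -> String -> [[String]]
findAll pat str = map (drop 1) (str =~ pat :: [[String]])

-- The same match at other result types (illustrative names):
anyMatch :: Bool
anyMatch = "aaaxAxaaaxBxaaaxCx" =~ "x(.)x(.)"    -- is there any match?

matchCount :: Int
matchCount = "aaaxAxaaaxBxaaaxCx" =~ "x(.)x(.)"  -- number of matches

firstMatch :: String
firstMatch = "aaaxAxaaaxBxaaaxCx" =~ "x(.)x(.)"  -- text of the first match
```

With the input from the question, findAll "x(.)x(.)" "aaaxAxaaaxBxaaaxCx" gives [["A","a"],["B","a"]], the same result as the workaround.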

Answered by Mikhail Glushenkov on Oct 28 '22 at 21:10
