Recently I have been trying to write a regex interpreter in Haskell. What I did was create a new data type with all possible constructors (for sequence, *, ^, intervals, etc.) and then define a matcher function. It works wonders, but my problem is that I have to convert the input (the String, for example "a(b*)(c|d)ef") to my data type ("Seq (Sym a) (Seq (Rep Sym b) (Seq (Or Sym c Sym d) Sym ef))"). I am having trouble with this part of the problem (I tried creating a new data type, a parse tree, but I failed completely). Any ideas on how I could solve it?
The canonical approach is to use a parser combinator library, such as Parsec. Parser combinator libraries (like parser generators) let you write a description of your grammar and get back a parser that turns strings into values of your own data type.
You simply have to encode your grammar as a Parsec function.
As an example, see this previous SO question: "Using Parsec to parse regular expressions".
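To make "encode your grammar as a Parsec function" a bit more concrete, here is a minimal sketch. The Regex type below is an assumption, since the question does not show the real one: it has one Char per Sym and covers only symbols, sequencing, alternation and *, leaving out ^ and intervals. The names regex, term, factor, atom and parseRegex are made up for illustration; each one corresponds to a single grammar rule.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical AST, modelled loosely on the constructors named in the question.
data Regex
  = Sym Char          -- a single literal character
  | Seq Regex Regex   -- r1 followed by r2
  | Or Regex Regex    -- r1 | r2
  | Rep Regex         -- r*
  deriving (Show, Eq)

-- regex ::= term ('|' term)*
regex :: Parser Regex
regex = foldr1 Or <$> (term `sepBy1` char '|')

-- term ::= factor+
term :: Parser Regex
term = foldr1 Seq <$> many1 factor

-- factor ::= atom '*'?
factor :: Parser Regex
factor = do
  a <- atom
  option a (Rep a <$ char '*')

-- atom ::= '(' regex ')' | any character that is not a metacharacter
atom :: Parser Regex
atom = between (char '(') (char ')') regex
   <|> (Sym <$> noneOf "()|*")

-- Parse a whole string; eof rejects trailing input such as a stray ')'.
parseRegex :: String -> Either ParseError Regex
parseRegex = parse (regex <* eof) "regex"
```

With these definitions, parseRegex "a(b*)(c|d)ef" should give Right (Seq (Sym 'a') (Seq (Rep (Sym 'b')) (Seq (Or (Sym 'c') (Sym 'd')) (Seq (Sym 'e') (Sym 'f'))))). Splitting the grammar into regex, term and factor is one way to get the usual precedence: * binds tighter than sequencing, which binds tighter than |.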