
How to escape characters in Haskell's Text.Regex library?

Introduction

I'm using Haskell's Text.Regex library and I want to match some characters that normally have meaning in regular expressions. According to Text.Regex's documentation,

The syntax of regular expressions is ... that of egrep (i.e. POSIX "extended" regular expressions).

And apparently, escaping in POSIX Extended Regular Expressions (ERE) uses backslashes [unlike POSIX Basic Regular Expressions (BRE)].


Problem

However, when I try to do something like this:

> import Text.Regex
> matchRegex (mkRegex "\*") "*"

I get the following error:

<interactive>:1:23:
    lexical error in string/character literal at character '*'

The same thing happens no matter what character I put after the \.


Work-Around

I could do something like this:

> matchRegex (mkRegex "[*]") "*"
Just []

which works, but it seems like a hack, especially if I want to escape several things in a row (e.g. mkRegex "[[][(][)][]]" which matches [()]).


Question

Is this the only way to escape in POSIX ERE? Why doesn't Haskell's Text.Regex library support \ escaping like it seems it ought to?

Elliot Cameron asked Dec 27 '22 12:12



2 Answers

I don't know this library's syntax, but a backslash is itself the escape character inside a Haskell string literal, so to get a literal backslash into the string you need to escape it, meaning:

matchRegex (mkRegex "\\*") "*"

Does it help?
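
To illustrate why the doubled backslash is needed: the error in the question appears to come from Haskell's string-literal lexer rejecting `\*` as an unknown escape, before Text.Regex ever sees the pattern. Here is a minimal sketch using only base (the name `pat` is just for illustration) showing that the source text "\\*" denotes a two-character string, which is what the regex engine then receives:

```haskell
-- The Haskell source "\\*" is a two-character string:
-- one backslash followed by one asterisk.
pat :: String
pat = "\\*"

main :: IO ()
main = do
  print (length pat)  -- prints 2
  putStrLn pat        -- prints \*
```

So the regex library does get the `\*` that POSIX ERE expects; the extra backslash only exists at the Haskell source level.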

Udi Cohen answered Jan 14 '23 01:01



Try it with two backslashes:

matchRegex (mkRegex "\\*") "*"

I just tried that in GHCi and it worked.
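
If you need to escape several characters in a row, as with the [()] example in the question, writing the backslashes by hand gets noisy. Below is a rough sketch of a helper that does it for you. Note that `escapeERE` is a hypothetical name of my own, not something Text.Regex provides, and the character list is my assumption of the characters POSIX ERE treats as special outside a bracket expression:

```haskell
-- Hypothetical helper (not part of Text.Regex): put a backslash in
-- front of each character assumed to be special in POSIX ERE.
escapeERE :: String -> String
escapeERE = concatMap esc
  where
    esc c
      | c `elem` ".[\\()*+?{|^$" = ['\\', c]  -- special: escape it
      | otherwise                = [c]        -- ordinary: keep as-is

main :: IO ()
main = putStrLn (escapeERE "[()]")  -- prints \[\(\)]
```

The result is an ordinary String, so it should be usable as the argument to mkRegex. The closing ] is left alone on purpose, since it is only special inside a bracket expression.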

Poindexter answered Jan 14 '23 01:01
