
Whether to escape ( and ) in regex using GNU sed

Tags: regex, bash, sed, gnu

I've noticed several posts on this site which say that with GNU sed you should use ( and ) in a regex rather than \( and \). But then I looked in the GNU sed manual and saw that it specifies that \( and \) must be used. What's up?

grok12 asked Jun 17 '11 15:06




3 Answers

The GNU sed manual you linked to explains that whether you should escape parentheses depends on whether you are using basic regular expressions or extended regular expressions. It also says that the -r flag determines which mode you are in: without it sed uses basic regular expressions, with it extended ones.

Edit: as stated in grok12's comment, the -E flag in BSD sed does what the -r flag does in GNU sed.
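To make the two modes concrete, here is a small sketch that runs the same kind of substitution in both. It assumes GNU sed (hence -r; on BSD sed you would swap in -E), and the sample string and variable names are made up for illustration.

```shell
# Basic regex (the default): \( \) group, bare ( ) match literal parens.
bre_group=$(echo 'foo(bar)' | sed 's/\(foo\)/[\1]/')
bre_literal=$(echo 'foo(bar)' | sed 's/(bar)/<bar>/')

# Extended regex (-r, assuming GNU sed): bare ( ) group, \( \) are literal.
ere_group=$(echo 'foo(bar)' | sed -r 's/(foo)/[\1]/')
ere_literal=$(echo 'foo(bar)' | sed -r 's/\(bar\)/<bar>/')

printf '%s\n' "$bre_group" "$bre_literal" "$ere_group" "$ere_literal"
# prints:
# [foo](bar)
# foo<bar>
# [foo](bar)
# foo<bar>
```

The escaping is simply mirrored between the two modes: whatever groups in one mode matches a literal parenthesis in the other.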

murgatroid99 answered Oct 11 '22 10:10


Originally sed, like grep and everything else, used \( to indicate grouping, whereas ( just matched a literal open-paren.

Many newer implementations of regular expressions, including egrep and perl, switched this around, so \( meant a literal open-paren, and ( was used to specify grouping.

So now with GNU sed in extended mode (the -r option), ( is a special character, just like in egrep. But in the default basic mode, on GNU and on other systems (e.g. BSD), it's still the old way, as far as I can tell. Unfortunately this is a real mess, because now it's hard to know which one to use.

chrisdowney answered Oct 11 '22 11:10


Thanks to rocker, murga, and chris. Each of you helped me understand the issue. I'm answering my own question here in order to (hopefully) put the whole story together in one place.

There are two major versions of sed in use: GNU and BSD. In both, parentheses used for grouping must be escaped in basic regex and must not be escaped in extended regex. They differ in that the -r option enables extended regex in GNU sed, whereas -E does so in BSD sed.
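Since both versions agree on the basic-regex syntax, one way to sidestep the flag difference is to stay in basic mode and always escape grouping parentheses. A minimal sketch, with a made-up date string and variable name:

```shell
# Basic regex only: no -r or -E needed, so the same command should work
# with both GNU and BSD sed. The | delimiter avoids escaping the slashes.
swapped=$(echo '2011-06-17' | sed 's|\([0-9]*\)-\([0-9]*\)-\([0-9]*\)|\3/\2/\1|')
echo "$swapped"
# prints: 17/06/2011
```

The cost is noisier patterns, which is the usual reason people reach for extended mode in the first place.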

The standard sed on Mac OS X is the BSD one. I believe much of the rest of the world uses GNU sed as the standard, but I don't know precisely who uses what. If you are unsure which you are using, try:

> sed -r

If you get a

> sed: illegal option -- r

reply then you have bsd.
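In a script, the same check can be done without reading the error message, by probing which flag the local sed accepts. This is only a sketch: ere_flag is a hypothetical variable name, and the result shows which flag works rather than which implementation is installed, since some versions accept both (GNU sed, for instance, also documents -E as a synonym for -r).

```shell
# Try each candidate flag on a tiny extended-regex substitution and keep
# the first one that produces the expected result.
ere_flag=""
for flag in -r -E; do
    if echo 'ab' | sed "$flag" 's/(a)(b)/\2\1/' 2>/dev/null | grep -q '^ba$'; then
        ere_flag=$flag
        break
    fi
done
echo "extended-regex flag: ${ere_flag:-none found}"
```

Later commands can then be written as sed "$ere_flag" '...' so the script runs with either version.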

grok12 answered Oct 11 '22 10:10