Is there a common/standard subset of Regular Expressions?

Question

Do the "control characters" used in regular expressions differ much among implementations of regex parsers (e.g. regex in Ruby, Java, C#, sed, etc.)?

For example, in Ruby, \D means "not a digit"; does it mean the same in Java, C#, and sed? I guess what I'm asking is: is there a "standard" for regexes that all regex parsers support?

If not, is there some common subset that should be learned and mastered first, with the parser-specific parts learned as they're encountered?

Oded · Accepted Answer

See the list of basic syntax on regular-expressions.info.

And a comparison of the different "flavors".

DigitalRoss · Answer

There is a common core which is very simple. It corresponds to the regular expressions as implemented in the original software tools such as ed, grep, sed, and awk. This is worth learning, because the other formats are all supersets of this one.†

.        match any character
[abc]    match a, b, or c
[^abc]   match a character other than a, b, or c
[a-c]    match the range from a to c
^        match the beginning of the line
$        match the end of the line
*        match zero or more of the preceding character
\(...\)  group for use as a back-reference

† I've left out POSIX bracket expressions because no one uses them and they aren't in the subset. Unescaped parens are magic by default everywhere except in the classic expressions, where grouping is written \(...\).
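
As a rough illustration of the core operators listed above, here is a minimal sketch using Python's `re` module. Python is just my choice for the example, and the sample strings and the `matching_lines` helper are made up for the demonstration. One caveat: Python, like most newer flavors, writes groups as unescaped `(...)`, whereas the classic tools write `\(...\)`.

```python
import re

# Made-up sample lines, purely for illustration.
lines = ["cat", "cot", "cut", "caat", "ct", "abc cat"]

# Each pattern uses only operators from the common core.
core_patterns = [
    r"c.t",        # .      any single character
    r"c[ao]t",     # [ao]   a or o
    r"c[^ao]t",    # [^ao]  any character other than a or o
    r"^c[a-o]t$",  # [a-o]  a range, anchored with ^ and $
    r"^ca*t$",     # *      zero or more of the preceding character
]


def matching_lines(pattern, candidates):
    """Hypothetical helper: return the candidates the pattern matches
    somewhere in the string, roughly what grep does line by line."""
    return [s for s in candidates if re.search(pattern, s)]


results = {p: matching_lines(p, lines) for p in core_patterns}

# Grouping with a back-reference: a character followed by itself.
# Classic grep/sed syntax would be \(.\)\1; Python's re uses (.)\1.
doubled = matching_lines(r"(.)\1", lines)

if __name__ == "__main__":
    for pattern, matched in results.items():
        print(f"{pattern:12} {matched}")
    print(f"{'(.)' + chr(92) + '1':12} {doubled}")
```

The first five patterns should carry over to the other flavors unchanged; only the grouping line needs adjusting per tool.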
