
Is there a common/standard subset of Regular Expressions?

Tags: java, c#, regex, ruby

Do the "control characters" used in regular expressions differ a lot among different implementations of regex parsers (e.g. regex in Ruby, Java, C#, sed, etc.)?

For example, in Ruby, \D means "not a digit"; does it mean the same in Java, C#, and sed? I guess what I'm asking is: is there a "standard" for regexes that all regex parsers support?

If not, is there some common subset that should be learned and mastered first (learning the parser-specific extensions as they're encountered)?

Zabba asked Apr 26 '11 21:04



2 Answers

See the list of basic syntax on regular-expressions.info.

See also a comparison of the different regex "flavors".

Oded answered Oct 05 '22 22:10



There is a common core, and it is very simple. It corresponds to the regular expressions implemented in the original software tools such as ed, grep, sed, and awk. It is worth learning because the other flavors are, apart from details such as how groups are written, supersets of this one.

.        match any character
[abc]    match a, b, or c
[^abc]   match a character other than a, b, or c
[a-c]    match the range from a to c
^        match the beginning of the line
$        match the end of the line
*        match zero or more of the preceding character
\(...\)  group for use as a back-reference 

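† I've left out POSIX bracket expressions (such as [[:digit:]]) because few people use them and they aren't in the common subset. In most flavors parentheses are special ("magic") by default, so a group is written (...); only the classic expressions require the backslashed form \(...\).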
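
To see how that core carries over to a newer flavor, here is a minimal sketch using Python's re module as a stand-in (an assumption on my part; the question asks about Ruby, Java, C# and sed, and the sample strings and variable names are made up for illustration). It tries a few patterns built only from the table above, then shows the one place where the spelling changes: grouping.

```python
import re

# Patterns built only from the common core: . [abc] [^abc] [a-c] ^ $ *
# Each maps to (sample text, whether a match is expected).
core_patterns = {
    r"^a.c$": ("abc", True),        # . matches any single character
    r"^[abc]$": ("b", True),        # match a, b, or c
    r"^[^abc]$": ("b", False),      # anything except a, b, or c
    r"^[a-c]*$": ("abcabc", True),  # a range, repeated with *
    r"^ab*c$": ("ac", True),        # * also allows zero repetitions
}

results = {
    pattern: re.search(pattern, text) is not None
    for pattern, (text, _expected) in core_patterns.items()
}

# Grouping is where the flavors split: classic sed/grep write \(ab\)\1,
# while Python's re writes the same thing as (ab)\1.
backref = re.search(r"^(ab)\1$", "abab")

# In Python's re, \( is a literal parenthesis rather than a group opener.
literal_paren = re.search(r"\(ab\)", "x(ab)y")

# \D is an extension; on plain ASCII input, [^0-9] from the core does the same job.
stripped_core = re.sub(r"[^0-9]", "", "a1b2")
stripped_ext = re.sub(r"\D", "", "a1b2")

for pattern, matched in results.items():
    print(pattern, matched)
```

The point of the sketch is only that everything in the table except the grouping line can be typed unchanged; shorthand such as \D is the kind of parser-specific extension to pick up as you meet it.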

DigitalRoss answered Oct 05 '22 22:10
